Skip to content
Snippets Groups Projects

Add message to the hook script to notify of potential validation delay

Merged Martin Lowe requested to merge malowe/main/script-large-set-messages into main
2 unresolved threads
  • Add message to the hook script to notify of potential validation delay

In the case of large commit sets, there is a chance that the commit set validation will timeout. While not a perfect solution, this should at least inform the user while performance is improved.

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
165 168 parsed_response = MultiJson.load(response.body)
166 169 rescue MultiJson::ParseError
167 170 puts "GL-HOOK-ERR: Unable to validate commit, server error encountered.\n\nPlease contact the administrator, and retry the commit at a later time.\n\n"
171 ## If a large commit set, this could be a timeout, and may be resolved through a second attempt after a short delay
172 if (diff_git_commits.length() > LARGE_COMMIT_SET_THRESHOLD) then
  • Why are we printing 2 messages?

  • Author Maintainer

    One is to inform them before the request that it'll probably take some time and is expected. The second is when there's an error and we had a ton of commits that it may have been a timeout, as another push will likely succeed after a minute or so

  • Gotcha, the second one only occurs if there is an error...

    I am concerned this may create too much noise, what do you think about only including the message when we detect an error?

    Do we know who is timing out? Is it the API or the Ruby script making the HTTP call?

    My understanding is that httparty uses the default from Net::HTTP which has a default value of 60 seconds.

  • Author Maintainer

    My assumption is Quarkus kills the connection, looks like somewhere between 15s and 30s. I'm not finding anything in configs about that value, but that about tracks from what I can see. I'm hoping to put in a large-ish thread pool over the next week and have requests grab from that to do parallel processing rather than sequential. 15s is a long wait for a validation, and I know we can do better.

  • Author Maintainer

    The vert.x client has a really clear timeout of 60s, which is what we're starting to switch to this quarter. I can't find any reference to the legacy resteasy client timeouts in the config docs though.

  • When we validate multiple commits, do we currently make individual API calls to either the foundation or profile API for each commit a user makes, or have we optimized this to validate a user only once?

    One idea to consider is creating a new endpoint that allows us to retrieve the ECA status (and any other necessary information) for multiple users simultaneously. This could help reduce the number of API requests required.

    Another suggestion is to enhance our Ruby script so that it automatically retries requests after encountering a timeout. However, this could lead to an even longer wait time so not best solution.

    tldr; While increasing the timeout may be something we should do, we should prioritize ways to optimize our processes to shorten the time required for commit validation

  • It would be interesting to have some stats for this API on where are the biggest bottlenecks.

  • Author Maintainer

    Biggest bottleneck is without a doubt sequential processing of commits. We use somewhat aggressive caching, so people should only be fetched once, but we fetch their profiles as we need ECA status, which is the second biggest issue. So if there is a few users on a patch, it adds up fast. Add in that it's sequential and we have a recipe for slow responses.

    One idea to consider is creating a new endpoint that allows us to retrieve the ECA status (and any other necessary information) for multiple users simultaneously. This could help reduce the number of API requests required.

    This would be a huge win but would also probably need an at least somewhat complicated rewrite. Without a new endpoint, something that could actually make a lot of sense here would be looking into reactive processing, as it's made to pick up and drop processing as it's waiting for responses, but that would take much longer than an endpoint as I'm a novice at reactive.

    tldr; While increasing the timeout may be something we should do, we should prioritize ways to optimize our processes to shorten the time required for commit validation

    That's honestly the plan. Making it wait longer doesn't fix that it gets really slow sometimes.

  • Author Maintainer

    I am concerned this may create too much noise, what do you think about only including the message when we detect an error?

    I only include the first message if there are more than 15 commits in the patch, which will probably take at least 10 seconds, more if there are a lot of unique committers. This might make a user wonder whats taking so long, so this is to just let them know it's somewhat expected. This is just a quick fix for a rare case while we look into performance. With the changes I'm looking into now along with switching to native, I'm hoping to get a lot of performance gains. My hope is that we won't need that message after all the changes are in.

    If you want, I can move the first message to go out with second message if you'd prefer. Just let me know!

  • I am good with what you have!

  • Please register or sign in to reply
  • Christopher Guindon approved this merge request

    approved this merge request

  • Martin Lowe mentioned in commit 82b51a5b

    mentioned in commit 82b51a5b

  • merged

  • Please register or sign in to reply
    Loading