Add message to the hook script to notify of potential validation delay
In the case of large commit sets, there is a chance that the commit set validation will time out. While not a perfect solution, this should at least inform the user while performance is being improved.
Activity
requested review from @cguindon, @malowe, and @zacharysabourin
@cguindon this is an interim fix that would allow us to at least inform the user in cases where long processing may lead to a timeout. I am planning over the next 2 weeks to see if we can improve the processing time here, though it may take some time to roll out successfully. As this is a highly used service, I want to do some internal stress testing to ensure that it doesn't break/slow down under medium/heavy loads.
Don't forget to link to the issue
```ruby
parsed_response = MultiJson.load(response.body)
rescue MultiJson::ParseError
  puts "GL-HOOK-ERR: Unable to validate commit, server error encountered.\n\nPlease contact the administrator, and retry the commit at a later time.\n\n"
  ## If a large commit set, this could be a timeout, and may be resolved through a second attempt after a short delay
  if (diff_git_commits.length() > LARGE_COMMIT_SET_THRESHOLD) then
```

Gotcha, the second one only occurs if there is an error...
I am concerned this may create too much noise. What do you think about only including the message when we detect an error?
Do we know which side is timing out? Is it the API or the Ruby script making the HTTP call?
My understanding is that httparty uses the default from Net::HTTP, which has a default read timeout of 60 seconds.
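For reference, a minimal sketch of how the hook script could make that timeout explicit in its HTTParty call; the endpoint URL and payload shape here are placeholders, not the real hook configuration:

```ruby
require 'httparty'
require 'multi_json'

# Placeholder payload; the real hook builds this from the pushed commits.
payload = { commits: [{ hash: 'abc123', author: { mail: 'dev@example.org' } }] }

# Hypothetical validation call; the URL is a placeholder, not the real API.
# `timeout:` overrides the Net::HTTP default of 60 seconds for the request.
response = HTTParty.post(
  'https://api.example.org/git/eca',
  body: MultiJson.dump(payload),
  headers: { 'Content-Type' => 'application/json' },
  timeout: 90
)

puts response.code
```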
My assumption is that Quarkus kills the connection, somewhere between 15s and 30s from the looks of it. I'm not finding anything in the configs about that value, but that about tracks with what I can see. I'm hoping to put in a large-ish thread pool over the next week and have requests grab from it for parallel processing rather than sequential. 15s is a long wait for a validation, and I know we can do better.
When we validate multiple commits, do we currently make individual API calls to either the foundation or profile API for each commit a user makes, or have we optimized this to validate a user only once?
One idea to consider is creating a new endpoint that allows us to retrieve the ECA status (and any other necessary information) for multiple users simultaneously. This could help reduce the number of API requests required.
Another suggestion is to enhance our Ruby script so that it automatically retries requests after encountering a timeout. However, this could lead to an even longer wait time, so it is not the best solution.
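(For illustration only, a rough sketch of that retry idea on the hook-script side; `validate_commits`, `MAX_RETRIES`, and `RETRY_DELAY` are hypothetical names and are not part of the current script.)

```ruby
require 'net/http'

MAX_RETRIES = 1   # retry once at most, to bound the extra wait for the user
RETRY_DELAY = 5   # seconds to pause before the second attempt

# Hypothetical stand-in for the validation HTTP call made by the hook.
def validate_commits(payload)
  raise Net::ReadTimeout   # simulate a slow validation response
end

attempts = 0
begin
  attempts += 1
  response = validate_commits({ commits: [] })
rescue Net::OpenTimeout, Net::ReadTimeout
  if attempts <= MAX_RETRIES
    sleep RETRY_DELAY
    retry
  end
  puts "GL-HOOK-ERR: Validation timed out after #{attempts} attempts.\n\n"
  exit 1
end
```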
tldr; While increasing the timeout may be something we should do, we should prioritize ways to optimize our processes to shorten the time required for commit validation
The biggest bottleneck is without a doubt the sequential processing of commits. We use somewhat aggressive caching, so people should only be fetched once, but we fetch their profiles as we need ECA status, which is the second biggest issue. So if there are a few users on a patch, it adds up fast. Add in that it's sequential and we have a recipe for slow responses.
> One idea to consider is creating a new endpoint that allows us to retrieve the ECA status (and any other necessary information) for multiple users simultaneously. This could help reduce the number of API requests required.
This would be a huge win, but it would also probably need an at least somewhat complicated rewrite. Without a new endpoint, something that could actually make a lot of sense here would be looking into reactive processing, as it's designed to pick up and drop work while waiting for responses, but that would take much longer than an endpoint as I'm a novice at reactive.
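To make the batch-endpoint idea more concrete, here is a rough sketch of what the hook-script side of such a call could look like; the `/eca/status/batch` path and the response shape are hypothetical and do not exist today:

```ruby
require 'httparty'
require 'multi_json'

# Collect the unique committer addresses once, instead of resolving
# ECA status commit by commit.
emails = ['dev1@example.org', 'dev2@example.org']   # placeholder committers

# Hypothetical bulk endpoint returning ECA status for many users at once.
response = HTTParty.post(
  'https://api.example.org/eca/status/batch',
  body: MultiJson.dump({ emails: emails }),
  headers: { 'Content-Type' => 'application/json' },
  timeout: 30
)

# Assumed response shape: { "dev1@example.org" => { "eca" => true }, ... }
statuses = MultiJson.load(response.body)
emails.each do |email|
  signed = statuses.fetch(email, {})['eca']
  puts "#{email}: #{signed ? 'ECA signed' : 'ECA missing'}"
end
```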
> tldr; While increasing the timeout may be something we should do, we should prioritize ways to optimize our processes to shorten the time required for commit validation
That's honestly the plan. Making it wait longer doesn't fix that it gets really slow sometimes.
> I am concerned this may create too much noise. What do you think about only including the message when we detect an error?
I only include the first message if there are more than 15 commits in the patch, which will probably take at least 10 seconds, more if there are a lot of unique committers. This might make a user wonder what's taking so long, so this just lets them know it's somewhat expected. This is just a quick fix for a rare case while we look into performance. With the changes I'm looking into now, along with switching to native, I'm hoping to get a lot of performance gains. My hope is that we won't need that message after all the changes are in.
I can move the first message to go out with the second message if you'd prefer. Just let me know!
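(For context, roughly how the two messages relate in the hook script; the threshold constant matches the diff above, but the surrounding structure and the message strings are simplified placeholders, not the actual wording.)

```ruby
require 'multi_json'

LARGE_COMMIT_SET_THRESHOLD = 15                          # threshold discussed above
diff_git_commits = Array.new(20) { |i| "commit-#{i}" }   # placeholder commit list
response_body = 'not-json'                               # placeholder body that fails to parse

# First message: printed up front when the commit set is large, before validation runs.
if diff_git_commits.length > LARGE_COMMIT_SET_THRESHOLD
  puts "Large commit set detected; validation may take longer than usual.\n\n"
end

begin
  parsed_response = MultiJson.load(response_body)
rescue MultiJson::ParseError
  puts "GL-HOOK-ERR: Unable to validate commit, server error encountered.\n\nPlease contact the administrator, and retry the commit at a later time.\n\n"
  # Second message: only on the error path, since a large set may simply have timed out.
  if diff_git_commits.length > LARGE_COMMIT_SET_THRESHOLD
    puts "This may have been a timeout caused by the size of the commit set; a second attempt after a short delay may succeed.\n\n"
  end
end
```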
mentioned in commit 82b51a5b