Iss #154 - Encode remaining hanging esc char in JSON for ECA hook script
Hanging escape characters in JSON data is not considered valid and causes issues downstream on attempted validation. This change replaces them with a safe encoded version that represents the same content.
Resolves #154 (closed)
Merge request reports
Activity
requested review from @zacharysabourin, @cguindon, @epoirier, and @oliviergoulet
149 149 :commits => processed_git_data, 150 150 :strictMode => is_strict_enforced 151 151 } 152 ## Generate request (use gsub to scrub any lingering \n constants) 153 response = HTTParty.post("#{API_URL}/git/eca", :body => MultiJson.dump(json_data).gsub(/(\\n|\\r)/, ''), 152 ## Generate request (use gsub to scrub any lingering \n constants and encode escape characters) 153 response = HTTParty.post("#{API_URL}/git/eca", :body => MultiJson.dump(json_data).gsub(/(\\n|\\r)/, '').gsub(/\\+/, '%5c'), I would expect MultiJson.dump(json_data) to be able to encode characters and return a valid JSON object. Are we sure the issue is not with the API service rather than this hook?
We should be able to determine this by validating the output of MultiJson.dump(json_data) with a JSON validation service such as https://jsonlint.com/
Yes I'm sure, as I linted the json as well haha! The first gsub would leave hanging
\
characters in some cases, which would leave unknown escape characters\sequences. While there are ways to squelch the errors and handle it with some server-side settings, that is not considered valid JSON.Using your link, if you try the following which is a basic reproduction of our case, it will fail with a weird message
{ "sample": "sample with \ in body" }
If you remove the backslash, then it validates as JSON immediately. This also will cover the cases where someone randomly throws the characters in as well, which will remove an edge case we luckily hadn't seen before
Edited by Martin LoweI did a quick test this morning and wrote a quick script:
require 'multi_json' json_data = { :provider => "sample with \ in body" } puts MultiJson.dump(json_data)
ruby test.rb > output.txt
The output is a valid JSON.
I just made another test script to see if I could reproduce your issue if I include
sample with \ in body
in a commit message:require 'multi_json' commit = { :body => `git show -s --format='%B b20cba93f2f7daf6287bbe4b987cbe2c0f12a2e6'`.force_encoding("utf-8"), } json_data = { :provider => commit } puts MultiJson.dump(json_data)
commit b20cba93f2f7daf6287bbe4b987cbe2c0f12a2e6 (HEAD -> master) Author: Christopher Guindon <chris.guindon@eclipse-foundation.org> Date: Tue Jan 9 09:02:50 2024 -0500 Initial commit sample with \ in body
The output is still a valid JSON string output.txt:
You are saying that if you include,
"sample with \ in body"
in your ruby hash, MultiJson.dump() will not escape the slash?This was the sample provided yesterday before we had the call, I haven't had the chance to actually build you a case end to end and look more into this yet. That was an example of what the output could result in which fails as that exact JSON object is invalid. I'll be looking more into this today once I have a moment
Finally got back to this, apologies for delay. I removed extra processing and made a worst case scenario to test, and it produced the below JSON which is valid according to parsers online. I checked against staging as well, and it was able to handle the whitespace characters that appeared in some of the fields cleanly as well. I think your point of just letting the lib handle it makes sense here.
{"repoUrl":"","provider":"gitlab","commits":[{"author":{"name":"Martin Lowe\n","mail":"martin.lowe@eclipse-foundation.org\n"},"committer":{"name":"Martin Lowe\n","mail":"martin.lowe@eclipse-foundation.org\n"},"body":"Test\n\n'\\r\\n' test\n\n","subject":"Test\n","hash":"6061367d09fda1e021a3bc9fc4f2bf678927503b\n","parents":["689816256c3774e5ab5588f40a1dde761780cc25"]}],"strictMode":true}
changed this line in version 2 of the diff
added 1 commit
- 138dfa45 - Update ECA hook script to remove extra JSON processing
added 1 commit
- dc8437f1 - Update ECA hook script to remove extra JSON processing
mentioned in commit a1abe0db