Skip to content
Snippets Groups Projects

Iss #154 - Encode remaining hanging esc char in JSON for ECA hook script

Merged Iss #154 - Encode remaining hanging esc char in JSON for ECA hook script
1 unresolved thread
Merged Martin Lowe requested to merge (removed):malowe/main/154 into main
1 unresolved thread

Hanging escape characters in JSON data is not considered valid and causes issues downstream on attempted validation. This change replaces them with a safe encoded version that represents the same content.

Resolves #154 (closed)

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
149 149 :commits => processed_git_data,
150 150 :strictMode => is_strict_enforced
151 151 }
152 ## Generate request (use gsub to scrub any lingering \n constants)
153 response = HTTParty.post("#{API_URL}/git/eca", :body => MultiJson.dump(json_data).gsub(/(\\n|\\r)/, ''),
152 ## Generate request (use gsub to scrub any lingering \n constants and encode escape characters)
153 response = HTTParty.post("#{API_URL}/git/eca", :body => MultiJson.dump(json_data).gsub(/(\\n|\\r)/, '').gsub(/\\+/, '%5c'),
  • I would expect MultiJson.dump(json_data) to be able to encode characters and return a valid JSON object. Are we sure the issue is not with the API service rather than this hook?

    We should be able to determine this by validating the output of MultiJson.dump(json_data) with a JSON validation service such as https://jsonlint.com/

  • Author Maintainer

    Yes I'm sure, as I linted the json as well haha! The first gsub would leave hanging \ characters in some cases, which would leave unknown escape characters\sequences. While there are ways to squelch the errors and handle it with some server-side settings, that is not considered valid JSON.

    Using your link, if you try the following which is a basic reproduction of our case, it will fail with a weird message

    {
        "sample": "sample with \ in body"
    }

    If you remove the backslash, then it validates as JSON immediately. This also will cover the cases where someone randomly throws the characters in as well, which will remove an edge case we luckily hadn't seen before

    Edited by Martin Lowe
  • You are saying that if you include, "sample with \ in body" in your ruby hash, MultiJson.dump() will not escape the slash?

  • I did a quick test this morning and wrote a quick script:

    require 'multi_json'
    
    json_data = {
      :provider => "sample with \ in body"
    }
    
    puts MultiJson.dump(json_data)

    ruby test.rb > output.txt

    output.txt

    The output is a valid JSON.

  • I just made another test script to see if I could reproduce your issue if I include sample with \ in body in a commit message:

    require 'multi_json'
    
    commit = {
      :body => `git show -s --format='%B b20cba93f2f7daf6287bbe4b987cbe2c0f12a2e6'`.force_encoding("utf-8"),
    }
    
    
    json_data = {
      :provider => commit
    }
    
    puts MultiJson.dump(json_data)
    commit b20cba93f2f7daf6287bbe4b987cbe2c0f12a2e6 (HEAD -> master)
    Author: Christopher Guindon <chris.guindon@eclipse-foundation.org>
    Date:   Tue Jan 9 09:02:50 2024 -0500
    
        Initial commit
        
        sample with \ in body

    The output is still a valid JSON string output.txt:

  • Author Maintainer

    You are saying that if you include, "sample with \ in body" in your ruby hash, MultiJson.dump() will not escape the slash?

    This was the sample provided yesterday before we had the call, I haven't had the chance to actually build you a case end to end and look more into this yet. That was an example of what the output could result in which fails as that exact JSON object is invalid. I'll be looking more into this today once I have a moment

  • Author Maintainer

    Finally got back to this, apologies for delay. I removed extra processing and made a worst case scenario to test, and it produced the below JSON which is valid according to parsers online. I checked against staging as well, and it was able to handle the whitespace characters that appeared in some of the fields cleanly as well. I think your point of just letting the lib handle it makes sense here.

    {"repoUrl":"","provider":"gitlab","commits":[{"author":{"name":"Martin Lowe\n","mail":"martin.lowe@eclipse-foundation.org\n"},"committer":{"name":"Martin Lowe\n","mail":"martin.lowe@eclipse-foundation.org\n"},"body":"Test\n\n'\\r\\n' test\n\n","subject":"Test\n","hash":"6061367d09fda1e021a3bc9fc4f2bf678927503b\n","parents":["689816256c3774e5ab5588f40a1dde761780cc25"]}],"strictMode":true}
  • Martin Lowe changed this line in version 2 of the diff

    changed this line in version 2 of the diff

  • Please register or sign in to reply
  • Martin Lowe added 1 commit

    added 1 commit

    • 138dfa45 - Update ECA hook script to remove extra JSON processing

    Compare with previous version

  • Christopher Guindon approved this merge request

    approved this merge request

  • Martin Lowe added 1 commit

    added 1 commit

    • dc8437f1 - Update ECA hook script to remove extra JSON processing

    Compare with previous version

  • merged

  • Martin Lowe mentioned in commit a1abe0db

    mentioned in commit a1abe0db

  • Please register or sign in to reply
    Loading