Using jq to process json with control characters

Time:02-01

I have the following JSON file (output.json) with control characters in it (line breaks, tabs, etc.):

{"data”:{“gherkin”:”Given user successful login
And status is '<currentStatus>'
When user clicks '<nextStatus>’
Then status message should change to '<message>'
    Examples:
        | currentStatus | nextStatus    | message       |
        | READY         | PROCESS   | ready to process |
        | PROCESS       | COMPLETE  | ready to complete |
"}}

I need to get the value of the "gherkin" field and write it to another file, keeping the same format as in the original JSON.

When I run this jq command:

jq .data.gherkin output.json

it throws an error:

parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 9, column 1

If I remove all control characters from output.json, I will lose the original format of the value of the "gherkin" field. Is there a way to accomplish this using jq?

Thanks!

CodePudding user response:

With your input,

sed 's/$/\\n/' output.json | tr -d '\n' | sed -e 's/“/"/g' -e 's/”/"/g' | sed '$ s/\\n$//' | jq .

yields:

{
  "data": {
    "gherkin": "Given user successful login\nAnd status is '<currentStatus>'\nWhen user clicks '<nextStatus>’\nThen status message should change to '<message>'\n    Examples:\n        | currentStatus | nextStatus    | message       |\n        | READY         | PROCESS   | ready to process |\n        | PROCESS       | COMPLETE  | ready to complete |\n"
  }
}

The point being that once you have valid JSON, you can use jq or any other JSON-oriented tool.
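To finish the original task (writing the field to another file with its line breaks restored), jq's -r (--raw-output) option can be added. The following is only a minimal sketch: the sample file content, the temporary directory, and the name gherkin.txt are assumptions made for illustration, and the curly-quote step is left out because the sample uses straight quotes.

```shell
# Hypothetical stand-in for output.json: raw line breaks inside a string.
tmp=$(mktemp -d)
printf '{"data":{"gherkin":"Given a\nThen b\n"}}\n' > "$tmp/output.json"

# Escape the line breaks the same way as the pipeline above.
fixed=$(sed 's/$/\\n/' "$tmp/output.json" | tr -d '\n' | sed '$ s/\\n$//')

# -r prints the decoded string, so each \n escape becomes a real line break again.
printf '%s\n' "$fixed" | jq -r '.data.gherkin' > "$tmp/gherkin.txt"
cat "$tmp/gherkin.txt"
```

Note that jq prints one newline after the value, so the file ends with one more line break than the string itself contains.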

CodePudding user response:

As the message suggests, there's no need to remove them; you just need to escape them. For example, byte 0A could be replaced with \u000a. That particular one could also be replaced with \n.

This can be used to fix up your input:

perl -pe's/[\x00-\x1F]/ sprintf "\\u%04X", ord $& /eg'

See "Specifying file to process to Perl one-liner" for how to pass the file to it.

So, you could chain the two.

perl -pe's/[\x00-\x1F]/ sprintf "\\u%04X", ord $& /eg' output.json |
   jq .data.gherkin