I have multiple JSON files like these:
{
"object1": {
"tags": ["A"],
"something": "else",
"other": "data"
},
"object2": {
"tags": ["A", "B"]
}
}
and
{
"object3": {
"tags": ["C"],
"something": "else",
"other": "data"
},
"object4": {
"tags": ["A"]
}
}
The keys of all the objects (object1 through object4) are guaranteed to be unique across all files.
I need to generate a separate JSON file containing an array of the tags in use, where each entry also lists the objects that use that tag:
[
{
"tag": "A",
"objects": ["object1", "object2", "object4"]
},
{
"tag": "B",
"objects": ["object2"]
},
{
"tag": "C",
"objects": ["object3"]
}
]
The order of tags in this output array is irrelevant.
So far I have: cat *.json | jq -s add | jq '[.[].tags[]] | unique' which gives me an array of the tags used across all files, but I don't quite know how to get the list of objects for each of those tags. I suspect this is not the right approach, because I am losing some information (the source of each tag) during this transformation.
Any help with jq expressions would be appreciated. Thank you.
CodePudding user response:
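One way would be to reduce over the merged input using to_entries, appending each object's key to the array of every tag it carries. Note the += here: a plain = would overwrite the array each time, so only the last object seen for a tag would survive.
jq -s '
add | reduce to_entries[] as $e ({}; .[$e.value.tags[]] += [$e.key])
' *.json
which would give you a structure like this: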
{
"A": [
"object1",
"object2",
"object4"
],
"B": [
"object2"
],
"C": [
"object3"
]
}
To convert this into your desired structure, append another to_entries and map each entry to an object with tag and objects fields:
jq -s '
add | reduce to_entries[] as $e ({}; .[$e.value.tags[]] += [$e.key])
| to_entries | map({tag:.key, objects:.value})
' *.json
which outputs:
[
{
"tag": "A",
"objects": [
"object1",
"object2",
"object4"
]
},
{
"tag": "B",
"objects": [
"object2"
]
},
{
"tag": "C",
"objects": [
"object3"
]
}
]
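If you would rather avoid reduce, the same inversion can probably be written with group_by. This is only a sketch under a few assumptions: the scratch directory and the file names a.json and b.json are made up for the demo, and every object is assumed to have a tags array, as in the samples from the question.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, used only for this demo.
tmpdir=$(mktemp -d)
cat > "$tmpdir/a.json" <<'EOF'
{
  "object1": {"tags": ["A"], "something": "else", "other": "data"},
  "object2": {"tags": ["A", "B"]}
}
EOF
cat > "$tmpdir/b.json" <<'EOF'
{
  "object3": {"tags": ["C"], "something": "else", "other": "data"},
  "object4": {"tags": ["A"]}
}
EOF

# Merge all files into one object, emit one {tag, object} pair per tag use,
# then group the pairs by tag and collect the object names of each group.
result=$(jq -c -s '
  add
  | [ to_entries[] | .key as $k | .value.tags[] | {tag: ., object: $k} ]
  | group_by(.tag)
  | map({tag: .[0].tag, objects: map(.object)})
' "$tmpdir"/*.json)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the two sample files this should print a single compact array with entries for A, B and C. group_by sorts the groups by tag, which is harmless here since the order of tags does not matter.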
