I have a large collection of JSON documents that has many entries of the following format (contents in my example are important to my question):
doc1
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "xxx"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "xxx"
}
}
]
}
doc2
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "xxx"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "nnn"
}
}
]
}
...
docN
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "xxx"
}
}
]
}
What I want to do is create a query that follows the below rule:
Only returns documents where it has a fieldA that contains aaa and has another entry where fieldA contains eee AND where the fieldB of those entries have values that match.
In the above example, that would be the first top level document as the fieldB for both sub entries are xxx
Additionally it would be nice to have just the entries pruned in the returned document, instead of the whole document
CodePudding user response:
I hpoe this will return as you expected
[{
$unwind: {
path: '$data'
}
}, {
$match: {
$or: [
{
'data.entry.fieldA': 'aaa'
},
{
'data.entry.fieldA': 'eee'
}
]
}
}, {
$group: {
_id: '$_id',
data: {
$push: {
fieldA: '$data.entry.fieldA',
fieldB: '$data.entry.fieldB'
}
}
}
}]
CodePudding user response:
One option without unwinding and grouping again is:
- Clean your data as requested.
$matchonly wanted documents
db.collection.aggregate([
{$project: {
data: {$filter: {
input: "$data",
cond: {$or: [
{$eq: ["$$this.entry.fieldA", "aaa"]},
{$eq: ["$$this.entry.fieldA", "eee"]}
]}
}}
}},
{$match: {
$expr: {$and: [
{$eq: [{$size: "$data"}, 2]},
{$eq: [{$first: "$data.entry.fieldB"}, {$last: "$data.entry.fieldB"}]}
]}
}}
])
See how it works on the playground example
