I have the following documents:
_id: "Team 1"
count: 1200
_id: "Team 2"
count: 1170
_id: "Team 3"
count: 1006
_id: "Team 4"
count: 932
_id: "Team 5"
count: 931
_id: "Team 6"
count: 899
_id: "Team 7"
count: 895
The list is already sorted and everything, I just need to project this as an array of top 5 based on count and then the rest should be summed as 'others'. If possible I'd like to also add the percentage that each element in the list makes up of the full count. Like this:
[
{"name":"Team 1", "count":1200, "percent":25},
{"name":"Team 2", "count":1170,"percent":15},
{"name":"Team 3", "count":1006,"percent":10},
{"name":"Team 4", "count":932,"percent":5},
{"name":"Team 5", "count":931,"percent":5},
{"name":"Other", "count":1794, "percent":40}]
]
CodePudding user response:
Query
$setWindowFieldsto sort and add the sort-rank to each document- group by
nullwith 3 accumulators- push the first 5 documents unchanged
- sum the count of the rest (rank>5)
- total sum
$mapto divide the counts with the total sum for the 5 top documents, to get the percentage also- add also the percentage for the rest of documents
- unwind and replace the root, with those documents that have count and percentage
Playmongo (put the mouse at the end of each stage to see the stage in and out)
aggregate(
[{"$setWindowFields":
{"output": {"rank": {"$rank": {}}}, "sortBy": {"count": -1}}},
{"$group":
{"_id": null,
"top5":
{"$push": {"$cond": [{"$lte": ["$rank", 5]}, "$$ROOT", "$$REMOVE"]}},
"other": {"$sum": {"$cond": [{"$lte": ["$rank", 5]}, 0, "$count"]}},
"all": {"$sum": "$count"}}},
{"$project":
{"_id": 0,
"docs":
{"$concatArrays":
[{"$map":
{"input": "$top5",
"in":
{"name": "$$this._id",
"count": "$$this.count",
"percentage":
{"$multiply": [{"$divide": ["$$this.count", "$all"]}, 100]}}}},
[{"name": "other",
"count": "$other",
"percentage":
{"$multiply": [{"$divide": ["$other", "$all"]}, 100]}}]]}}},
{"$unwind": "$docs"}, {"$replaceRoot": {"newRoot": "$docs"}}])
CodePudding user response:
If you have mongoDB version 5.0 or higher you can use $setWindowFields like in @Takis nice answer. Otherwise, you can group, $slice and $reduce your way to the answer:
$sortto have the highest count on top and group to put them all in one array calledalland to$sumup.$slicetheallarray to keep only the top N.$reducethe top N to sum them up.- Add the
othersto the top N array with countsum-sum(topN) $unwindand format
db.collection.aggregate([
{$sort: {count: -1}},
{$group: {_id: null, all: {$push: "$$ROOT"}, sum: {$sum: "$count"}}},
{$project: {_id: null, sum: 1, res: {$slice: ["$all", 5]}}},
{$project: {sum: 1, res: 1, topN: {
$reduce: {
input: "$res",
initialValue: 0,
in: {$add: ["$$value", "$$this.count"]}
}
}
}
},
{
$project: {_id: 0, sum: 1, res: {
$concatArrays: [
[{_id: "other", count: {$subtract: ["$sum", "$topN"]}}],
"$res"
]
}
}
},
{$unwind: "$res"},
{$project: {_id: "$res._id", count: "$res.count",
percent: { $round: [{$multiply:
[{$divide: ["$res.count", "$sum"]}, 100]}, 0]
}
}
}
])
CodePudding user response:
another way to do it using $facet since $setWindowFields only works with mongodb v5 or later
db.collection.aggregate([
{ $sort: { count: -1 } },
{
"$facet": {
others: [
{ "$skip": 5 },
{
"$group": {
"_id": "others",
"count": { "$sum": "$count" }
}
}
],
top5: [ { "$limit": 5 } ]
}
},
{
"$project": { result: { "$concatArrays": [ "$others", "$top5" ] } }
},
{
"$addFields": { totalCount: { "$sum": "$result.count" } }
},
{ $unwind: "$result" },
{
$project: {
_id: "$result._id",
count: "$result.count",
percent: {
$round: [
{ "$multiply": [ { $divide: [ "$result.count", "$totalCount" ] }, 100 ] },
0
]
}
}
}
])
