I have recently shifted to MongoDB and Mongoose with Node.js. And I am wrapping my head around it all coming from SQL.
I have a collection where documents have a similar structure to the following:
{
name: String
rank: Number
}
Sometimes the name might be the same, but the rank will always be different.
I would like to remove all duplicates of name, but retain the object that has the LOWEST rank.
For instance, if my collection looked like this:
{
name: "name1"
rank: 3
},
{
name: "name1"
rank: 4
},
{
name: "name1"
rank: 2
}
I would like to remove all objects where name is the same except for:
{
name: "name1"
rank: 2
}
Is this possible to do with mongoose?
CodePudding user response:
Here is my approach:
const found = await db.collection.aggregate([
{
$group: {
_id: "$name",
minRank: {
$min: "$rank"
}
}
},
])
await db.collection.deleteMany({
$or: found.map(item => ({
name: item._id,
rank: { $ne: item.minRank }
}))
})
Explanation:
From my point of view your solution would result in many unnecessary calls being made, which would result in a terrible time of execution. My solution exactly contains two steps:
- find for each document's property
namethe corresponding lowestrankavailable. - delete each document, where the name is equal to one of those names and the rank is not equal to the actual lowest rank found.
Additional notes:
If not already done, you should probably define an index on the name property of your schema for performance reasons.
CodePudding user response:
Okay, I figured it out using aggregate:
const duplicates = await collectionName.aggregate([
{
$group: {
_id: "$name",
dups: { $addToSet: "$_id" },
count: { $sum: 1 }
}
},
{
$match: {
count: { $gt: 1 }
}
}
]);
duplicates.forEach(async (item) => {
const duplicate_names = item.dups;
const duplicate_name = await collectionName.find({ _id: { $in: duplicate_names } }).sort({ rank: 1 });
duplicate_name.shift();
duplicate_name.forEach(async (item) => {
await collectionName.deleteOne({ _id: item._id });
});
});
