MongoDB Aggregation to get count and Y sample entries

Time:01-11

MongoDB version: 4.2.17.

I am trying out aggregation on data in a collection.

Example data:

 {
        "_id" : "244",
        "pubName" : "p1",
        "serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
        "serviceName" : "my-service",
        "subName" : "c1",
        "pubState" : "INVITED"
 }

I would like to:

Match on some field (say subName), group by serviceIdRef, and limit the result to X entries. For each serviceIdRef, also return the count of documents in each of the ACTIVE and INVITED states, plus Y sample documents (Y=3 in this example) that are in that state. For example, the output would look like this (in brief):

[
    {
        serviceIdRef: "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
        serviceName:
        state:[
            {
                pubState: "INVITED",
                count: 200,
                sample: [ // Get those Y entries (here Y=3)
                    {
                        // sample1 like:
                        "_id" : "244",
                        "pubName" : "p1",
                        "serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
                        "serviceName" : "my-service",
                        "subName" : "c1",
                        "pubState" : "INVITED"

                    },
                    {
                        sample2
                    },
                    {
                        sample3
                    }
                ]
            },
            {
                pubState: "ACTIVE", // For this state, repeat as we did for "INVITED" state above.
                ......
            }
        ]
    },
    {
        repeat for another service
    }
]

So far I have written the pipeline below, but I cannot get those Y entries. Is there a (better) way?

This is what I have so far (it is incomplete and does not produce exactly the format above):

db.sub.aggregate(
    [{
        $match:
        {
            "subName": {
                $in: ["c1", "c2"]
    
            },
            
            "$or": [
                {
                    "pubState": "INVITED",
                },
                {
                    "pubState": "ACTIVE",
                }
            ]
        }
    },
    {
        $group: {
            _id: "$serviceIdRef",
            subs: {
                $push: "$$ROOT",
    
            }
    
        }
    },
    {
        $sort: {
            _id: -1,
        }
    },
    {
        $limit: 22
    },
    {
       $facet:
        {
            facet1: [
                {
                    $unwind: "$subs",
                },
                {
                    $group:
                    {
                        _id: {
                            "serviceName" : "$_id",
                            "pubState": "$subs.pubState",
                            "subState": "$subs.subsState"
                        },
                        count: {
                            $sum: 1
                        }
                            
                    }
                }
            ]
        }
    }
    
    ])
    

Answer:

You need a second $group stage to build the nested structure:

  • $match on your conditions
  • $sort by _id in descending order
  • $group by serviceIdRef and pubState; take the first serviceName, push the documents into the sample array, and count the documents
  • $group by serviceIdRef alone and build the state array
  • $slice to limit the number of documents in each sample array
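
db.collection.aggregate([
  // Keep only the requested subscribers and the two states of interest.
  {
    $match: {
      subName: { $in: ["c1", "c2"] },
      pubState: { $in: ["INVITED", "ACTIVE"] }
    }
  },
  // Sort first so the sample arrays start with the highest _id values.
  { $sort: { _id: -1 } },
  // First pass: one group per (serviceIdRef, pubState) pair.
  {
    $group: {
      _id: {
        serviceIdRef: "$serviceIdRef",
        pubState: "$pubState"
      },
      serviceName: { $first: "$serviceName" },
      sample: { $push: "$$ROOT" },
      count: { $sum: 1 }
    }
  },
  // Second pass: one group per serviceIdRef, nesting the per-state results.
  {
    $group: {
      _id: "$_id.serviceIdRef",
      serviceName: { $first: "$serviceName" },
      state: {
        $push: {
          pubState: "$_id.pubState",
          count: "$count",
          // Keep only Y sample documents per state (Y = 3 in the question).
          sample: { $slice: ["$sample", 3] }
        }
      }
    }
  }
])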
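
The pipeline above hard-codes the sample size and does not cap the number of services (the X in the question). As a rough sketch, the same stages could be wrapped in a helper that takes X and Y as parameters and appends a final $sort and $limit. The helper name buildPipeline and its parameter names are made up for illustration; the collection name db.sub comes from the question.

```javascript
// Hypothetical helper (not part of the original answer): builds the same
// two-stage $group pipeline, parameterised by X (maxServices) and
// Y (samplesPerState).
function buildPipeline(subNames, pubStates, maxServices, samplesPerState) {
  return [
    // Filter by subscriber name and publication state.
    { $match: { subName: { $in: subNames }, pubState: { $in: pubStates } } },
    // Sort so that each sample array starts with the highest _id values.
    { $sort: { _id: -1 } },
    // One group per (serviceIdRef, pubState) pair, with count and all documents.
    {
      $group: {
        _id: { serviceIdRef: "$serviceIdRef", pubState: "$pubState" },
        serviceName: { $first: "$serviceName" },
        sample: { $push: "$$ROOT" },
        count: { $sum: 1 }
      }
    },
    // One group per serviceIdRef; each state keeps at most Y sample documents.
    {
      $group: {
        _id: "$_id.serviceIdRef",
        serviceName: { $first: "$serviceName" },
        state: {
          $push: {
            pubState: "$_id.pubState",
            count: "$count",
            sample: { $slice: ["$sample", samplesPerState] }
          }
        }
      }
    },
    // Give the services a stable order, then keep at most X of them.
    { $sort: { _id: -1 } },
    { $limit: maxServices }
  ];
}

// X = 22 and Y = 3, the values used in the question.
const pipeline = buildPipeline(["c1", "c2"], ["INVITED", "ACTIVE"], 22, 3);

// In the mongo shell this would be run as:
// db.sub.aggregate(pipeline, { allowDiskUse: true })
```

Because the first $group pushes every matching document before $slice trims the array, large groups can use a lot of memory; passing allowDiskUse: true, as in the commented call, is one way to reduce the risk of hitting the aggregation memory limit.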
