How to perform group by operation in Pandas?

Time:01-20

Get the count of transactions by region for customers with sales greater than $10,000 and for customers with sales less than $10,000. (Hint: create two columns that count transaction IDs, one for sales greater than $10,000 and another for sales less than $10,000.)

Dataset

I am having trouble figuring out how to approach this problem, because transaction_id has all unique values. How do I group by region in Pandas?

df_3 = dataset.groupby(['region', 'transaction_id'], as_index=False)['sales'].sum()
df_3

The code above gives the following output:

From df_3 I then got the sales values > 10,000 and < 10,000, but I don't know how to get the count of transactions by region.

CodePudding user response:

I hope this is the solution you are looking for. Please upvote and accept it if it helps.

# Flag each transaction: 1 if sales >= 10000, otherwise 0
dataset.loc[dataset["sales"] < 10000, "10k_above"] = 0
dataset.loc[dataset["sales"] >= 10000, "10k_above"] = 1

# Per region: count all transactions, the flagged ones, and the unflagged ones
df_results = dataset.groupby(by=["region"], as_index=False).agg(
    transaction_count=("transaction_id", "count"),
    above_10k_count=("10k_above", "sum"),
    below_10k_count=("10k_above", lambda x: (x == 0).sum()),
)
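
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea using boolean flag columns. The sample data is made up, since the original dataset is not shown in the post, and the names above_10k and below_10k are only illustrative. It follows the question's wording strictly (greater than and less than 10000), so a sale of exactly 10000 is counted in neither column, whereas the answer above counts it as above.

```python
import pandas as pd

# Hypothetical sample data; the real dataset from the question is not available.
dataset = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4, 5, 6],
    "region": ["East", "East", "West", "West", "West", "North"],
    "sales": [12000, 8000, 15000, 10000, 500, 9999],
})

# Boolean flag columns: True is treated as 1 when summed.
flags = dataset.assign(
    above_10k=dataset["sales"] > 10000,
    below_10k=dataset["sales"] < 10000,
)

# Group by region only; transaction_id is just the thing being counted.
df_results = flags.groupby("region", as_index=False).agg(
    transaction_count=("transaction_id", "count"),
    above_10k_count=("above_10k", "sum"),
    below_10k_count=("below_10k", "sum"),
)

print(df_results)
```

Grouping by region alone is the key point: grouping by both region and transaction_id, as in the question, produces one group per row when every transaction_id is unique.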