Proportion of total in pd.NamedAgg

I have a df like this:

Foo	Bar
A	54
A	95
B	49
A	25
B	25
B	32
B	55
A	98
A	83
B	90

I am computing some statistics using [aggfunc][1]:

import pandas as pd

temp = pd.DataFrame.from_dict({'Foo': {0: 'A',
  1: 'A',
  2: 'B',
  3: 'A',
  4: 'B',
  5: 'B',
  6: 'B',
  7: 'A',
  8: 'A',
  9: 'B'},
 'Bar': {0: 54, 1: 95, 2: 49, 3: 25, 4: 25, 5: 32, 6: 55, 7: 98, 8: 83, 9: 90}})

res_df = temp.groupby('Foo').agg(
    max_val = pd.NamedAgg(column='Bar', aggfunc='max'),
    min_val = pd.NamedAgg(column='Bar', aggfunc='min'),
    std_val = pd.NamedAgg(column='Bar', aggfunc='std')
)

This (correctly) produces:

Foo	max_val	min_val	std_val
A	98	25	31.0403
B	90	25	25.37124

However, I want to add another column holding each group's proportion of the overall Bar total, for example:

Foo	max_val	min_val	std_val	pct_total
A	98	25	31.0403	0.59
B	90	25	25.37124	0.41

How can I accomplish this using aggfunc?

[1]: https://pandas.pydata.org/docs/reference/api/pandas.pivot_table.html

CodePudding user response：

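You could pass a lambda as the aggfunc. It receives each group's Bar values as a Series (x), so x.sum() is the group total, and dividing by temp.Bar.sum() (the total over the whole frame) gives the proportion: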

res_df = temp.groupby('Foo').agg(
    max_val = pd.NamedAgg(column='Bar', aggfunc='max'),
    min_val = pd.NamedAgg(column='Bar', aggfunc='min'),
    std_val = pd.NamedAgg(column='Bar', aggfunc='std'),
    pct_total = pd.NamedAgg(column='Bar', aggfunc=lambda x: x.sum()/temp.Bar.sum())
)

print(res_df)

     max_val  min_val    std_val  pct_total
Foo                                        
A         98       25  31.040296   0.585809
B         90       25  25.371244   0.414191

If you want the values rounded (0.59, 0.41), use round(x.sum()/temp.Bar.sum(), 2) inside the lambda instead.
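
One drawback of the lambda is that it reaches outside the groupby to the global temp. As an alternative sketch (not part of the original answer), you could aggregate the per-group sum with NamedAgg and normalise afterwards. The helper column name bar_sum is just an illustrative choice:

```python
import pandas as pd

# Same data as in the question, built from plain lists for brevity.
temp = pd.DataFrame({
    "Foo": ["A", "A", "B", "A", "B", "B", "B", "A", "A", "B"],
    "Bar": [54, 95, 49, 25, 25, 32, 55, 98, 83, 90],
})

res_df = temp.groupby("Foo").agg(
    max_val=pd.NamedAgg(column="Bar", aggfunc="max"),
    min_val=pd.NamedAgg(column="Bar", aggfunc="min"),
    std_val=pd.NamedAgg(column="Bar", aggfunc="std"),
    # Helper column (hypothetical name): per-group total of Bar.
    bar_sum=pd.NamedAgg(column="Bar", aggfunc="sum"),
)

# Divide each group's total by the sum of all group totals,
# then drop the helper column.
res_df["pct_total"] = res_df["bar_sum"] / res_df["bar_sum"].sum()
res_df = res_df.drop(columns="bar_sum")

# Optional: round to two decimals for display.
rounded = res_df["pct_total"].round(2)

print(res_df)
```

This should give the same pct_total values as the lambda version, since the group totals (355 for A, 251 for B) add up to the overall total of 606.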