Pandas: Groupby, Apply And Repeat

Time:01-23

I have a pandas DataFrame and would like to calculate weighted means of several columns within each group defined by the column 'Class'.

import pandas as pd

df_test = pd.DataFrame({
    "Class": ["A", "A", "A", "B", "B", "B"],
    "X": [0, 1, 2, 3, 4, 5],
    "Y": [0, 1, 2, 3, 4, 5],
    "Z": [0, 1, 2, 3, 4, 5],
    "W": [1, 1, 1, 2, 2, 2],
})

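def GetWMean(group):
    # Weighted mean of X, Y and Z within one group, using W as the weights.
    Q = group[["X", "Y", "Z"]]
    W = group["W"]
    return W.dot(Q) / W.sum()

# One row per class, with the weighted means in columns X, Y and Z.
WMs = df_test.groupby("Class").apply(GetWMean)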

I would like the result to behave like a transform: three new columns holding the weighted means, with each group's value repeated on every row that belongs to that group.

How can I achieve this?

CodePudding user response:

If I understand correctly, you can just join the per-group results back onto the original frame afterwards:

df_test.join(WMs, on="Class", rsuffix="_WMean")

Output:

  Class  X  Y  Z  W  X_WMean  Y_WMean  Z_WMean
0     A  0  0  0  1      1.0      1.0      1.0
1     A  1  1  1  1      1.0      1.0      1.0
2     A  2  2  2  1      1.0      1.0      1.0
3     B  3  3  3  2      4.0      4.0      4.0
4     B  4  4  4  2      4.0      4.0      4.0
5     B  5  5  5  2      4.0      4.0      4.0
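
If you would rather skip the apply and the join, a possible alternative is to build the weighted means from two grouped transform("sum") calls, which already return one value per original row. This is only a sketch on the same example data; the names cols, weighted, wmeans and result are illustrative and do not come from the question.

```python
import pandas as pd

df_test = pd.DataFrame({
    "Class": ["A", "A", "A", "B", "B", "B"],
    "X": [0, 1, 2, 3, 4, 5],
    "Y": [0, 1, 2, 3, 4, 5],
    "Z": [0, 1, 2, 3, 4, 5],
    "W": [1, 1, 1, 2, 2, 2],
})

cols = ["X", "Y", "Z"]  # columns to average (illustrative name)

# Multiply every value by its row weight.
weighted = df_test[cols].mul(df_test["W"], axis=0)

# Per-group sums, broadcast back to the original row index.
numerator = weighted.groupby(df_test["Class"]).transform("sum")
denominator = df_test.groupby("Class")["W"].transform("sum")

# Weighted mean = sum(w * x) / sum(w), repeated on each row of the group.
wmeans = numerator.div(denominator, axis=0).add_suffix("_WMean")

result = pd.concat([df_test, wmeans], axis=1)
print(result)
```

On this data it should give the same columns and values as the join above. Which version is preferable probably depends on whether you also need the one-row-per-class table (WMs) for something else.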