Home > database >  Python Pandas - how to add columns of filtered sum and calculate percentage weight
Python Pandas - how to add columns of filtered sum and calculate percentage weight

Time:02-06

I would like to add two columns to a pandas df to show daily totals and weights. Existing df:

Date Name Value
2022-2-1 Apple 5
2022-2-1 Pear 3
2022-2-2 Apple 4
2022-2-2 Pear 4
2022-2-2 Banana 2
2022-2-3 Apple 5
2022-2-3 Banana 3
2022-2-4 Apple 4
2022-2-4 Banana 2
2022-2-4 Papaya 9

Desired output:

Date Name Value Daily Total Percentage Weight
2022-2-1 Apple 5 8 62.5%
2022-2-1 Pear 3 8 37.5%
2022-2-2 Apple 4 10 40%
2022-2-2 Pear 4 10 40%
2022-2-2 Banana 2 10 20%
2022-2-3 Apple 5 8 62.5%
2022-2-3 Banana 3 8 37.5%
2022-2-4 Apple 4 15 26.7%
2022-2-4 Banana 2 15 13.3%
2022-2-4 Papaya 9 15 60%

Appreciate any ideas/suggestions.

CodePudding user response:

Use groupby transform('sum'):

df['Daily Total'] = df.groupby('Date')['Value'].transform('sum')
df['Percentage Weight'] = (df['Value'] / df['Daily Total'] * 100).round(1).map('{}%'.format)

Output:

>>> df
       Date    Name  Value  Daily Total Percentage Weight
0  2022-2-1   Apple      5            8             62.5%
1  2022-2-1    Pear      3            8             37.5%
2  2022-2-2   Apple      4           10             40.0%
3  2022-2-2    Pear      4           10             40.0%
4  2022-2-2  Banana      2           10             20.0%
5  2022-2-3   Apple      5            8             62.5%
6  2022-2-3  Banana      3            8             37.5%
7  2022-2-4   Apple      4           15             26.7%
8  2022-2-4  Banana      2           15             13.3%
9  2022-2-4  Papaya      9           15             60.0%

One-liner:

df = df.assign(DailyTotal=df.groupby('Date')['Value'].transform('sum')).pipe(lambda x: x.assign(PercentageWeight=df['Value'].div(df['Daily Total']).mul(100).round(1).map('{}%'.format)))
  •  Tags:  
  • Related