I'm new to Pandas. I have a data frame that looks something like this.
| Name | Storage Location | Total Quantity |
|---|---|---|
| a | S1 | 100 |
| a | S2 | 200 |
| a | S3 | 300 |
| a | S4 | 110 |
| a | S5 | 200 |
| b | S1 | 200 |
| b | S2 | 300 |
| b | S4 | 400 |
| b | S5 | 150 |
| c | S1 | 400 |
| c | S5 | 500 |
I wanna sum the "Total Quantity" group by the Name and also specific storage location which are only "S1,S2,S3".
| Name | Total Quantity |
|---|---|
| a | 600 |
| b | 500 |
| c | 400 |
My desired output would be something like the above. Kindly appreciate for you guys help. Thank you in advance!
CodePudding user response:
You could use where to replace the unwanted Locations with NaN and use groupby sum (since sum skips NaN by default):
out = df.where(df['Storage Location'].isin(['S1','S2','S3'])).groupby('Name', as_index=False)['Total Quantity'].sum()
Output:
Name Total Quantity
0 a 600.0
1 b 500.0
2 c 400.0
CodePudding user response:
Use:
In [2378]: out = df[df['Storage Location'].isin(['S1', 'S2', 'S3'])].groupby('Name')['Total Quantity'].sum().reset_index()
In [2379]: out
Out[2379]:
Name Total Quantity
0 a 600
1 b 500
2 c 400
