I have a dataframe of the following format
ROW Value1 Value2 Value3 Value4
1 10 10 -5 -2
2 50 20 -10 -7
3 10 5 0 -1
I am looking to calculate for each row the sum of positive totals and sum of negative totals. So essentially, the resulting frame should look like
ROW Post_Total Neg_Total
1 20 -7
2 70 -17
3 15 -1
One thing I have in my dataset, a column can have only positive or negative values.
Any ideas on how this can be done. I tried subsetting by >0 but was not successful. Thanks!
CodePudding user response:
Use:
df1 = (df.melt('ROW')
.assign(g = lambda x: np.where(x['value'].gt(0),'Pos_Total','Neg_Total'))
.pivot_table(index='ROW',columns='g', values='value', aggfunc='sum', fill_value=0)
.reset_index()
.rename_axis(None, axis=1))
print (df1)
ROW Neg_Total Pos_Total
0 1 -7 20
1 2 -17 70
2 3 -1 15
CodePudding user response:
You could use:
(df.melt(id_vars='ROW')
.assign(sign=lambda d: np.where(d['value'].gt(0), 'Pos_Total', 'Neg_Total'))
.groupby(['ROW', 'sign'])['value'].sum()
.unstack('sign')
)
Or alternatively, using masks:
d = df.set_index('ROW')
mask = d.gt(0)
pd.DataFrame({'Pos_Total': d.where(mask).sum(1),
'Neg_Total': d.mask(mask).sum(1)})
output:
Pos_Total Neg_Total
ROW
1 20.0 -7.0
2 70.0 -17.0
3 15.0 -1.0
CodePudding user response:
Since all columns can either have all positive or all negative, you can use all() to check for condition along the columns, then groupby:
df.groupby(df.gt(0).all(), axis=1).sum()
Output:
False True
ROW
1 -7 20
2 -17 70
3 -1 15
In general, I'll just subset/clip and sum:
out = pd.DataFrame({'pos': df.clip(lower=0).sum(1),
'neg': df.clip(upper=0).sum(1)
})
CodePudding user response:
Let us try apply
out = df.set_index('ROW').apply(lambda x : {'Pos':x[x>0].sum(),'Neg':x[x<0].sum()} ,
result_type = 'expand',
axis=1)
Out[33]:
Pos Neg
ROW
1 20 -7
2 70 -17
3 15 -1
