Home > Software design >  Pandas row sum for values > 0
Pandas row sum for values > 0

Time:02-02

I have a dataframe of the following format

ROW   Value1    Value2  Value3    Value4
1      10        10      -5        -2
2      50        20      -10       -7
3      10        5        0        -1

I am looking to calculate for each row the sum of positive totals and sum of negative totals. So essentially, the resulting frame should look like

ROW   Post_Total    Neg_Total
1      20            -7
2      70            -17
3      15            -1

One thing I have in my dataset, a column can have only positive or negative values.

Any ideas on how this can be done. I tried subsetting by >0 but was not successful. Thanks!

CodePudding user response:

Use:

df1 = (df.melt('ROW')
         .assign(g = lambda x: np.where(x['value'].gt(0),'Pos_Total','Neg_Total'))
         .pivot_table(index='ROW',columns='g', values='value', aggfunc='sum', fill_value=0)
         .reset_index()
         .rename_axis(None, axis=1))
print (df1)
   ROW  Neg_Total  Pos_Total
0    1         -7         20
1    2        -17         70
2    3         -1         15

CodePudding user response:

You could use:

(df.melt(id_vars='ROW')
   .assign(sign=lambda d: np.where(d['value'].gt(0), 'Pos_Total', 'Neg_Total'))
   .groupby(['ROW', 'sign'])['value'].sum()
   .unstack('sign')
)

Or alternatively, using masks:

d = df.set_index('ROW')
mask = d.gt(0)
pd.DataFrame({'Pos_Total': d.where(mask).sum(1),
              'Neg_Total': d.mask(mask).sum(1)})

output:

     Pos_Total  Neg_Total
ROW                      
1         20.0       -7.0
2         70.0      -17.0
3         15.0       -1.0

CodePudding user response:

Since all columns can either have all positive or all negative, you can use all() to check for condition along the columns, then groupby:

df.groupby(df.gt(0).all(), axis=1).sum()

Output:

      False  True 
ROW              
1       -7     20
2      -17     70
3       -1     15

In general, I'll just subset/clip and sum:

out = pd.DataFrame({'pos': df.clip(lower=0).sum(1),
                    'neg': df.clip(upper=0).sum(1)
                  })

CodePudding user response:

Let us try apply

out = df.set_index('ROW').apply(lambda x : {'Pos':x[x>0].sum(),'Neg':x[x<0].sum()} ,
                                           result_type = 'expand',
                                           axis=1)
Out[33]: 
     Pos  Neg
ROW          
1     20   -7
2     70  -17
3     15   -1
  •  Tags:  
  • Related