I have a pandas dataframe as follows:
Date Week Value1 Value2 Value3
2022-01-01 1 -10 20 30
2022-01-02 1 -5 25 20
2022-01-03 1 0 15 NaN
2022-01-04 1 5 7 10
2022-01-05 1 7 10 15
2022-01-06 1 10 5 NaN
I want to perform a cumulative sum such that the resulting DataFrame is as follows:
Date Week Value1 Value2 Value3
2022-01-03 1 -15 60 50
2022-01-06 1 22 22 25
Only Value3 contains NaN values; no other column does. I want to total all three Value columns over each run of rows that ends at a NaN in Value3. I also want to keep the Date and Week of the row where the NaN occurs unchanged, so the sum applies only to the Value columns.
So far I have tried the following (and some variations of it), without success:
df.groupby(['Date','Week'])['Value1', 'Value2','Value3'].apply(lambda x: x.isna().cumsum().reset_index(drop=True))
But I haven't got the desired result with this. Any ideas on how this can be achieved? Thanks!
CodePudding user response:
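We can use a groupby on the cumulative count of NaNs in Value3, shifted down by one row so that each NaN row closes its own group instead of opening the next one: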
df.groupby(df['Value3'].shift().isna().cumsum()).agg({'Date':'last', 'Week':'last', 'Value1':'sum', 'Value2':'sum', 'Value3':'sum'}).reset_index(drop = True)
output:
Date Week Value1 Value2 Value3
0 2022-01-03 1 -15 60 50.0
1 2022-01-06 1 22 22 25.0
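For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same grouping. The names `block` and `result` are only illustrative, the dates are kept as plain strings for simplicity, and the grouping key is renamed so it cannot be confused with the Value3 column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (dates kept as strings).
df = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03",
             "2022-01-04", "2022-01-05", "2022-01-06"],
    "Week": [1, 1, 1, 1, 1, 1],
    "Value1": [-10, -5, 0, 5, 7, 10],
    "Value2": [20, 25, 15, 7, 10, 5],
    "Value3": [30, 20, np.nan, 10, 15, np.nan],
})

# shift() moves Value3 down one row, so the row *after* each NaN starts
# a new group. The first row sees the NaN that shift() introduces and
# therefore opens group 1.
block = df["Value3"].shift().isna().cumsum().rename("block")

result = (
    df.groupby(block)
      .agg({"Date": "last", "Week": "last",
            "Value1": "sum", "Value2": "sum", "Value3": "sum"})
      .reset_index(drop=True)
)

print(block.tolist())
print(result)
```

Here `block` labels the first three rows 1 and the last three rows 2, and `sum` skips the NaN in Value3 by default. If the frame did not end on a NaN row, the trailing rows would still be summed as a final, unterminated group, so that case may need separate handling.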
