Home > Net >  How to calculate cumulative missing values group by an ID (in python)?
How to calculate cumulative missing values group by an ID (in python)?

Time:01-24

a) given the following "id" and "freq"

df = pd.DataFrame({'id':[1,1,1,1,1,1,2,2,2,2,2,3,3,3],'freq':[1,2,np.NaN, np.NaN, np.NaN, 6,7,8,9,10,np.NaN,np.NaN,13,14]})

df

b) how to calculate cumulative missing of "freq" group by "id"? with a reset to zero when freq > 0

so that the result 'cum_null' should look like - print(df(['cum_null']) 0 0 1 2 3 0 0 0 0 0 1 1 0 0

c) I've tried this. Very close, but cannot reset to zero when freq > 0

df['cum_null'] = id_grp['freq'].apply(lambda x:x.isnull().astype(int).cumsum())

df

CodePudding user response:

If you case you can do groupby with mask

df['cum_null'] = df.freq.isnull().groupby(df['id']).cumsum().where(df.freq.isnull(),0)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     0
8     0
9     0
10    1
11    1
12    0
13    0
Name: freq, dtype: int64
  •  Tags:  
  • Related