I have a DataFrame with repeating IDs, as shown below:
| ID | of_flag | os_flag |
|---|---|---|
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 2 | 0 | 1 |
| 2 | 0 | 0 |
I need to create two new columns, 'of_flag_up' and 'os_flag_up', based on the of_flag and os_flag columns respectively. Within each ID group, each new column should find the first occurrence of 1 and fill 1 for that row and all remaining rows of the group, as below:
| ID | of_flag | os_flag | of_flag_up | os_flag_up |
|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 1 | 1 |
| 2 | 1 | 0 | 1 | 0 |
| 2 | 0 | 1 | 1 | 1 |
| 2 | 0 | 0 | 1 | 1 |
Any help on this would be appreciated. Thanks
CodePudding user response:
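You can group by "ID" and take the cumulative maximum of each flag column within each group using cummax, then assign the result to the new columns. Because the flags are 0 or 1, the running maximum becomes 1 at the first 1 and stays 1 for the rest of the group: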
df[['of_flag_up','os_flag_up']] = df.groupby('ID')[['of_flag', 'os_flag']].cummax()
Output:
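|   | ID | of_flag | os_flag | of_flag_up | os_flag_up |
|---|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 0 | 1 |
| 1 | 1 | 1 | 0 | 1 | 1 |
| 2 | 2 | 1 | 0 | 1 | 0 |
| 3 | 2 | 0 | 1 | 1 | 1 |
| 4 | 2 | 0 | 0 | 1 | 1 |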
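For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same groupby/cummax idea. The helper names `flag_cols` and `up_cols` are only illustrative and not part of the original answer. It assumes the flags contain only 0 and 1 and that the rows are already in the order in which "first occurrence" should be judged.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2],
    "of_flag": [0, 1, 1, 0, 0],
    "os_flag": [1, 0, 0, 1, 0],
})

# Illustrative helper names (assumptions, not from the original answer).
flag_cols = ["of_flag", "os_flag"]
up_cols = [f"{col}_up" for col in flag_cols]

# Running maximum within each ID group: once a 1 is seen,
# every later row in that group stays at 1.
df[up_cols] = df.groupby("ID")[flag_cols].cummax()

print(df)
```

If the rows are not already in the desired order, sorting the DataFrame before grouping would probably be needed, since cummax follows the existing row order within each group.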
