I have a pandas dataframe with the columns username and phase. I want to create a separate column called count with incremental values.
The count should reflect how many times a username has appeared in a specific phase so far, as in the expected output below. How can I accomplish this efficiently? Any suggestion is appreciated.
  username  phase  count
0   andrew      1      1
1   andrew      1      2
2     alex      1      1
3     alex      2      1
4   andrew      1      3
5    cindy      3      1
6     alex      2      2
CodePudding user response:
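You can use cumcount after a groupby on username and phase. cumcount numbers the rows within each group starting from 0, so add 1 to make the count start at 1.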
df['count'] = df.groupby(['username', 'phase']).cumcount() + 1
print(df)
  username  phase  count
0   andrew      1      1
1   andrew      1      2
2     alex      1      1
3     alex      2      1
4   andrew      1      3
5    cindy      3      1
6     alex      2      2
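
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the sample rows in the question, which is an assumption about how the real data is loaded; the column names match the ones in the question.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed layout: one row per event).
df = pd.DataFrame({
    "username": ["andrew", "andrew", "alex", "alex", "andrew", "cindy", "alex"],
    "phase": [1, 1, 1, 2, 1, 3, 2],
})

# cumcount() numbers rows within each (username, phase) group from 0,
# in their original order, so adding 1 gives a running count starting at 1.
df["count"] = df.groupby(["username", "phase"]).cumcount() + 1

print(df)
```

Since cumcount follows the existing row order within each group, sort the dataframe first if the running count should follow some other order (a timestamp, for example).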
