I have the following dataframe:
Steps
0 False
1 False
2 True
3 True
4 True
5 False
6 True
7 True
8 False
9 False
10 False
11 True
I would like to number the True sequences in an additional column:
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
Filling the rows containing False is secondary. Do you have any ideas?
CodePudding user response:
You can use the values that are both True and different from the previous one (using diff) to initiate a cumsum:
df['Numbered'] = (df['Steps']&df['Steps'].diff()).cumsum().where(df['Steps'], 0)
output:
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
CodePudding user response:
Chain shifted values by Series.shift by & for bitwise AND for counter by first Trues and then set False rows to 0 by Series.where:
df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
.where(df['Steps'], 0))
print (df)
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
Solution working well if first value is True:
df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
.where(df['Steps'], 0))
print (df)
Steps Numbered
0 True 1
1 False 0
2 True 2
3 True 2
4 True 2
5 False 0
6 True 3
7 True 3
8 False 0
9 False 0
10 False 0
11 True 4
