I have a dataframe:
df =
| No. | Scenario | Exe Seq | Action |
|---|---|---|---|
| 1 | A | 1 | a |
| 2 | A | 2 | b |
| 3 | A | 3 | c |
| 4 | A | 1 | a |
| 5 | A | 2 | b |
| 6 | A | 1 | a |
These rows all belong to the same scenario, but some runs reach step 3 while others stop at step 2 or 1. I want to tell the runs apart.
The "Scenario" column may contain values other than "A".
So the expected result is:
| No. | Scenario | Exe Seq | Action | New_Scenario |
|---|---|---|---|---|
| 1 | A | 1 | a | A_1 |
| 2 | A | 2 | b | A_1 |
| 3 | A | 3 | c | A_1 |
| 4 | A | 1 | a | A_2 |
| 5 | A | 2 | b | A_2 |
| 6 | A | 1 | a | A_3 |
CodePudding user response:
IIUC use:
#start a new group wherever the difference from the previous row is not 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].diff().ne(1).cumsum().astype(str)
print (df)
Or:
#start a new group wherever Exe Seq equals 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].eq(1).cumsum().astype(str)
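Note that a plain cumsum() counts runs over the whole frame, so if a second scenario follows "A", its first run would be numbered 4 rather than 1. If the numbering should restart for each scenario, one possible sketch is to take the cumulative sum per group. The "B" rows below are hypothetical and added only for illustration, and the names starts and run_no are mine:

```python
import pandas as pd

# Sample data from the question, plus hypothetical "B" rows for illustration.
df = pd.DataFrame({
    "Scenario": ["A"] * 6 + ["B"] * 3,
    "Exe Seq": [1, 2, 3, 1, 2, 1, 1, 2, 1],
})

# Flag each row that starts a run (Exe Seq == 1) as 1, all other rows as 0.
starts = df["Exe Seq"].eq(1).astype(int)

# Count the flags separately within each Scenario, so numbering restarts per scenario.
run_no = starts.groupby(df["Scenario"]).cumsum()

df["New_Scenario"] = df["Scenario"] + "_" + run_no.astype(str)
print(df)
```

This assumes every run starts with Exe Seq equal to 1, as in the sample data.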
Or maybe:
#start a new group wherever the difference from the previous row is less than or equal to 0
df['New_Scenario'] = (df['Scenario'] + '_' +
                      df['Exe Seq'].diff().fillna(-1).le(0).cumsum().astype(str))
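All three variants give the same result on the sample data. They only differ when a sequence skips a value or does not restart at 1. A small sketch on made-up values (not from the question) shows where they diverge; the variable names are hypothetical:

```python
import pandas as pd

# Made-up sequence: skips from 2 to 4, then restarts at 2 instead of 1.
seq = pd.Series([1, 2, 4, 2, 3, 1])

# New group whenever the step from the previous row is not exactly +1.
by_diff_ne_1 = seq.diff().ne(1).cumsum()

# New group only where the value equals 1.
by_eq_1 = seq.eq(1).cumsum()

# New group whenever the value does not increase (first row counts as a start).
by_non_increase = seq.diff().fillna(-1).le(0).cumsum()

print(pd.DataFrame({
    "Exe Seq": seq,
    "diff_ne_1": by_diff_ne_1,
    "eq_1": by_eq_1,
    "non_increase": by_non_increase,
}))
```

Which one fits depends on the data: use eq(1) if every run reliably starts at 1, the diff().ne(1) form if any gap should also split a run, and the le(0) form if a run is just any stretch of increasing values.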
