I have a pandas dataframe (called result), which looks something like this:
| event_1 | event_2 | event_3 |
|---|---|---|
| 1 | 1 | 1 |
| 1 | 1 | 1 |
| 1 | Del | 1 |
| 1 | 1 | 1 |
I would like to remove all the rows before the one that contains the value Del, so that the result looks like this:
| event_1 | event_2 | event_3 |
|---|---|---|
| 1 | Del | 1 |
| 1 | 1 | 1 |
I tried adapting some code I found in other posts, but it doesn't do the trick (it runs for a long time and never finishes):
result.groupby('event_1').apply(lambda x: x.loc[(x.event_2 == "Del").idxmax():,:]).reset_index(drop=True)
CodePudding user response:
You can use boolean indexing (df here stands for your result dataframe):
df[df['event_2'].eq('Del').cummax()]
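To see why this works, here is a small sketch that rebuilds the dataframe from the question (the values are copied from the tables above; the names `mask` and `trimmed` are just for illustration). `eq('Del')` gives True only on the matching row, and `cummax()` carries that True forward to every later row, so the mask keeps the Del row and everything after it.

```python
import pandas as pd

# Sample data mirroring the question's table
result = pd.DataFrame({
    "event_1": [1, 1, 1, 1],
    "event_2": [1, 1, "Del", 1],
    "event_3": [1, 1, 1, 1],
})

# True on the row holding "Del"; cummax() keeps it True for all later rows
mask = result["event_2"].eq("Del").cummax()

# Keep only the rows where the mask is True and renumber the index
trimmed = result[mask].reset_index(drop=True)
print(trimmed)
```

Note that if event_2 never contains Del, the mask is False everywhere and the result is empty, so you may want to check `mask.any()` first if that case can happen in your data.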
CodePudding user response:
If there are two or more occurrences and you want to cut at the last one, you can try the following.
Example:
import pandas as pd
A=[1,2,3,'del',5,6,7]
B=[1,2,3,4,5,6,7]
C=[1,2,'del',4,5,6,7]
df=pd.DataFrame([B,A,C]).T
df.columns=list('ABC')
df
A B C
0 1 1 1
1 2 2 2
2 3 3 del
3 4 del 4
4 5 5 5
5 6 6 6
6 7 7 7
ind=df[df.eq('del').any(axis=1)].index.max()
df=df.iloc[ind:].reset_index(drop=True)
df
A B C
0 4 del 4
1 5 5 5
2 6 6 6
3 7 7 7
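The same idea can be wrapped in a small helper. This is only a sketch: the function name `drop_before_last` and its `marker` parameter are made up for illustration, and it assumes you want to keep the frame unchanged when the marker is missing. It looks up the row by position rather than by index label, so it should also behave when the index is not the default 0..n-1.

```python
import pandas as pd

# Same data as the example above, built column by column
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 2, 3, "del", 5, 6, 7],
    "C": [1, 2, "del", 4, 5, 6, 7],
})

def drop_before_last(frame, marker):
    """Drop every row before the last row that contains `marker` in any column."""
    # Boolean array: True for rows where at least one cell equals the marker
    hits = frame.eq(marker).any(axis=1).to_numpy()
    if not hits.any():
        # Assumption: with no marker present, return the data unchanged
        return frame.copy()
    # Position (not label) of the last matching row
    last_pos = hits.nonzero()[0][-1]
    return frame.iloc[last_pos:].reset_index(drop=True)

trimmed = drop_before_last(df, "del")
print(trimmed)
```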
