After converting a column into 1s and 0s based on some criteria using a list comprehension:
x = [0 if v > 0 else 1 for v in df["COLUMN_NAME"]]
df["PASS"] = x
I need to find the longest sequence of consecutive 1s in that column, but I can't figure out an easy way to do it. Can someone help me out here? Thanks.
CodePudding user response:
You could adapt the approach from "Longest sequence of consecutive duplicates in a python list" to your problem, since you already have the column in list form.
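For reference, here is a minimal sketch of a list-based approach, assuming the 0/1 values are in a plain Python list such as `x` from the question. The helper name `longest_run` and the sample data are made up for illustration; `itertools.groupby` is the standard-library function that groups consecutive equal items.

```python
from itertools import groupby

def longest_run(values, target=1):
    """Return the length of the longest run of consecutive `target` values."""
    # groupby yields one (key, group) pair per run of equal adjacent items
    return max(
        (sum(1 for _ in group) for key, group in groupby(values) if key == target),
        default=0,  # covers the case where `target` never occurs
    )

x = [1, 0, 1, 1, 1, 0, 1, 1]  # made-up sample data
print(longest_run(x))  # 3
```

With the DataFrame from the question, this could be called as `longest_run(df["PASS"].tolist())`.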
CodePudding user response:
Try the following (the partial result at the bottom shows how the method works):
import pandas as pd
import numpy as np
np.random.seed(2022)
df = pd.DataFrame({'COL': np.random.choice([0, 1], 20)})
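# Each 0 bumps the running count, so every run of 1s shares one group id;
# keep only the rows holding a 1 and take the size of the largest group.
ls1 = df.eq(0).cumsum().loc[df['COL'] == 1].value_counts().max()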
Output:
>>> ls1
6
>>> df
    COL
0     1
1     0
2     1
3     0
4     1
5     1
6     0
7     1
8     0
9     0
10    0
11    0
12    1  # <- 1
13    1  # <- 2
14    1  # <- 3
15    1  # <- 4
16    1  # <- 5
17    1  # <- 6
18    0
19    0
Partial result to understand the method:
>>> pd.concat([df, df.eq(0).cumsum().loc[df['COL'] == 1]], axis=1)
    COL  COL
0     1  0.0  # Group 0 contains 1 consecutive 1
1     0  NaN
2     1  1.0  # Group 1 contains 1 consecutive 1
3     0  NaN
4     1  2.0  # Group 2 contains 2 consecutive 1
5     1  2.0
6     0  NaN
7     1  3.0  # Group 3 contains 1 consecutive 1
8     0  NaN
9     0  NaN
10    0  NaN
11    0  NaN
12    1  7.0  # Group 7 contains 6 consecutive 1
13    1  7.0
14    1  7.0
15    1  7.0
16    1  7.0
17    1  7.0
18    0  NaN
19    0  NaN
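The same idea can also be written step by step on a single column, which may be easier to follow. This is only a sketch: it hard-codes the sample values printed above, assumes the column contains nothing but 0s and 1s, and the names `is_one`, `group_id` and `run_lengths` are illustrative.

```python
import pandas as pd

# Sample values copied from the DataFrame printed above
s = pd.Series(
    [1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0],
    name="COL",
)

is_one = s.eq(1)
# Each 0 increments the counter, so every run of 1s shares one group id
group_id = s.eq(0).cumsum()
# Keep only the rows holding a 1, then count the rows per group id
run_lengths = group_id[is_one].value_counts()
# An empty result means the column has no 1s at all
longest = int(run_lengths.max()) if not run_lengths.empty else 0
print(longest)  # 6
```

The explicit empty check is there because calling `.max()` on an empty result would give NaN rather than 0.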
