Count of 1s in a Pandas DataFrame

Time:01-06

After converting a column into 1s and 0s based on some criteria using a list comprehension and .apply():

x = [0 if df["COLUMN_NAME"][i] > 0 else 1 for i in range(len(df))]
df["PASS"] = x  # assign the list directly; wrapping it in df.apply() is unnecessary

I need to find the longest sequence of consecutive 1s in my column, but I can't figure out an easy way to do it. Can someone help me out here? Thanks.

CodePudding user response:

Since you already have the column in list form, you could adapt the approach from "Longest sequence of consecutive duplicates in a Python list" to your problem.
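
A minimal sketch of that list-based idea, assuming the 0/1 values are already in a plain Python list. The helper name longest_run_in_list is made up for illustration. It uses itertools.groupby to split the list into runs of equal adjacent values and keeps the length of the longest run of 1s:

```python
from itertools import groupby


def longest_run_in_list(values):
    """Return the length of the longest run of consecutive 1s (0 if none)."""
    # groupby yields one (key, run) pair per stretch of equal adjacent values.
    return max(
        (sum(1 for _ in run) for key, run in groupby(values) if key == 1),
        default=0,
    )


x = [1, 0, 1, 1, 0, 1, 1, 1, 0]
print(longest_run_in_list(x))  # 3
```

The default=0 argument covers a list that contains no 1s at all, where max() would otherwise raise on an empty sequence.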

CodePudding user response:

Try this (see the partial result at the bottom to understand the method):

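import pandas as pd
import numpy as np

# Reproducible sample column of random 0s and 1s
np.random.seed(2022)
df = pd.DataFrame({'COL': np.random.choice([0, 1], 20)})

# Each 0 increments the running count, so every run of 1s shares one group id;
# keep only the rows that are 1 and take the size of the largest group.
ls1 = df.eq(0).cumsum().loc[df['COL'] == 1].value_counts().max()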

Output:

>>> ls1
6

>>> df
    COL
0     1
1     0
2     1
3     0
4     1
5     1
6     0
7     1
8     0
9     0
10    0
11    0
12    1  # <- 1
13    1  # <- 2
14    1  # <- 3
15    1  # <- 4
16    1  # <- 5
17    1  # <- 6
18    0
19    0

Partial result to understand the method:

>>> pd.concat([df, df.eq(0).cumsum().loc[df['COL'] == 1]], axis=1)
    COL  COL
0     1  0.0  # Group 0 contains 1 consecutive 1
1     0  NaN
2     1  1.0  # Group 1 contains 1 consecutive 1
3     0  NaN
4     1  2.0  # Group 2 contains 2 consecutive 1s
5     1  2.0
6     0  NaN
7     1  3.0  # Group 3 contains 1 consecutive 1
8     0  NaN
9     0  NaN
10    0  NaN
11    0  NaN
12    1  7.0  # Group 7 contains 6 consecutive 1s
13    1  7.0
14    1  7.0
15    1  7.0
16    1  7.0
17    1  7.0
18    0  NaN
19    0  NaN
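
The same grouping trick can be wrapped in a small function that works on a single column. This is only a sketch: the name longest_run_in_series is hypothetical, and it assumes the column holds only 0s and 1s. It also returns 0 when the column has no 1s, a case where the one-liner above would give NaN:

```python
import pandas as pd


def longest_run_in_series(col: pd.Series) -> int:
    """Length of the longest run of consecutive 1s in a 0/1 Series."""
    # Every 0 bumps the counter, so all 1s in the same run share one group id.
    group_id = col.eq(0).cumsum()
    # Keep only the rows holding a 1 and count how many fall in each group.
    run_lengths = group_id[col.eq(1)].value_counts()
    # An empty result means the column has no 1s at all.
    return int(run_lengths.max()) if not run_lengths.empty else 0


# Same values as the example column shown above.
col = pd.Series([1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0])
print(longest_run_in_series(col))  # 6
```

With the question's data, it would be called as longest_run_in_series(df["PASS"]).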