How to efficiently update conditional set of columns in large dataframe

Time:01-25

Given a large pandas DataFrame with columns Column_1 ... Column_n and a threshold X, what is an efficient way to set all subsequent columns to X once X is breached in a row? For example, with a threshold of 1, the original DataFrame is:

[image: original DataFrame]

and the updated DataFrame should be:

[image: updated DataFrame]

Many thanks in advance!

CodePudding user response:

Try:

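import pandas as pd

df = pd.DataFrame({'A': [0.1, 1.1, 0.4], 'B': [0.3, 0.0, 0.3],
                   'C': [1.2, -2.0, -0.1], 'D': [5.0, 4.0, 0.0]})

# Flag values above 1, turn False into NaN, and forward-fill True along each row.
# True casts to 1.0 (equal to the threshold here); the remaining NaN come from df.
out = df.gt(1).mask(lambda x: x == False).ffill(axis=1).astype(float).fillna(df)
print(out)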

# Output
     A    B    C    D
0  0.1  0.3  1.0  1.0
1  1.0  1.0  1.0  1.0
2  0.4  0.3 -0.1  0.0
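
The one-liner above relies on True casting to 1.0, so it only gives the right values when the threshold is exactly 1. For an arbitrary threshold X, a possible sketch is to build a running "breached so far" mask per row and write X wherever it is set. The function name cap_after_breach is made up for this illustration, and, as in the output above, the sketch assumes the breaching column itself is also set to X.

```python
import numpy as np
import pandas as pd


def cap_after_breach(df, threshold):
    """Set each row to `threshold` from the first value above it onward.

    Hypothetical helper, not part of pandas.
    """
    # True where a value exceeds the threshold.
    above = df.to_numpy() > threshold
    # Running OR along each row: stays True once a breach has occurred.
    breached = np.logical_or.accumulate(above, axis=1)
    # Replace flagged cells with the threshold, keep the rest unchanged.
    return df.mask(breached, threshold)


df = pd.DataFrame({'A': [0.1, 1.1, 0.4], 'B': [0.3, 0.0, 0.3],
                   'C': [1.2, -2.0, -0.1], 'D': [5.0, 4.0, 0.0]})

out = cap_after_breach(df, 1)
print(out)
```

With threshold 1 this should reproduce the output shown above. The mask is computed on the underlying NumPy array in one vectorized pass, which avoids the intermediate object-dtype frame that the mask/ffill chain creates.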