Home > Software design >  Remove rows from dataframe based on condition on many different columns
Remove rows from dataframe based on condition on many different columns

Time:01-28

I have a DataFrame with 200 columns.
Some of the rows in columns between 10 and 180 have values between -1 and 0.

I need to remove all rows with these values, but only if they occur in columns between 100 and 180. If these values occur in columns 10 to 99 it is fine and I keep them.

I was thinking to use something like:

df[~df[['col100', 'col101',..., 'col180']].isin([-1, 0]).any(1)]

However, I cannot specify all the column names by hand. What is the right way to do this operation?

CodePudding user response:

Use loc accessor

If you need between -1 and 0, use;

df[~df.loc[:,'col100':'col180'].apply(lambda x: x.between(-1,0)).any(1)]

If you need just to check if they have -1 and 0 use

df[~df.loc[:,'col100':'col180'].isin([0,-1]).any(1)]

How it works

df.loc[:,'col100':'col180']- Selects the columns between 100 and 180

df.loc[:,'col100':'col180'].isin([0,-1]).any(1): Tests if there is 0 or -1 in the columns

CodePudding user response:

How about

cols_to_remove = [f'col{n}' for n in range(100,181)]
df[~df[cols_to_remove].isin([-1, 0]).any(1)]
  •  Tags:  
  • Related