Combine mutually exclusive arguments in filter condition

Time:02-01

I have a large Pandas DataFrame with >100 columns and I would like to select all columns where the substring einkst_l appears in the column name. In addition, I want to select the two columns name and year.

So far, I could only create two new data frames:

e = 'einkst_l'
df_1 = df.filter(like = e, axis=1).reset_index(drop=True)
df_2 = df.filter(items = ['name', 'year'], axis=1).reset_index(drop=True)

I would like to select all the columns in one shot, but unfortunately 'like' and 'items' are mutually exclusive and cannot be combined in one filter call. How can I select name, year, and all columns containing the specified substring at once?

CodePudding user response:

This is fuzzier, but you could use a regex match on the column names, like this:

df[df.columns[df.columns.str.contains('einkst_l|name|year')]]

You could also anchor the pattern with ^ and $ (for example ^name$ and ^year$) so that name and year are matched exactly.
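To illustrate the difference between the unanchored and the anchored pattern, here is a minimal sketch on a toy frame. The columns surname, other, einkst_l_1 and x_einkst_l are made up for the example and are not from the question.

```python
import pandas as pd

# Toy frame; apart from 'name' and 'year', the column names are hypothetical.
df = pd.DataFrame({
    "name": ["a", "b"],
    "year": [2019, 2020],
    "einkst_l_1": [1, 2],
    "x_einkst_l": [3, 4],
    "surname": ["c", "d"],
    "other": [5, 6],
})

# Unanchored pattern: 'name' also matches inside 'surname'.
fuzzy = df.columns[df.columns.str.contains("einkst_l|name|year")]

# Anchored pattern: ^name$ and ^year$ match only the exact column names,
# while 'einkst_l' may still appear anywhere in the name.
exact = df.columns[df.columns.str.contains(r"einkst_l|^name$|^year$")]

fuzzy_cols = list(fuzzy)
exact_cols = list(exact)
selected = df[exact]
```

With the anchored pattern, surname is no longer picked up, so selected holds only name, year and the two einkst_l columns.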

CodePudding user response:

Try it without filter, using the str accessor on the column index. Replace:

  • like by str.contains
  • items by isin

out = df[df.columns[df.columns.str.contains('einkst_l')
                    | df.columns.isin(['name', 'year'])]]
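The same idea can be wrapped in a small helper. This is only a sketch: select_columns is a hypothetical name, and the toy column names other than name and year are assumed for illustration.

```python
import pandas as pd

def select_columns(df, substring, exact_names):
    """Keep columns whose name contains `substring` or is listed in `exact_names`."""
    # regex=False treats the substring literally, similar to filter(like=...).
    mask = (df.columns.str.contains(substring, regex=False)
            | df.columns.isin(exact_names))
    return df.loc[:, mask]

# Toy frame with made-up columns.
df = pd.DataFrame({
    "name": ["a", "b"],
    "einkst_l_1": [1, 2],
    "other": [3, 4],
    "year": [2019, 2020],
    "x_einkst_l_2": [5, 6],
})

out = select_columns(df, "einkst_l", ["name", "year"]).reset_index(drop=True)
```

Because the selection is a boolean mask over df.columns, the columns come back in their original order rather than in the order given in exact_names.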

CodePudding user response:

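Note that "nested filtering" does not work here: each filter call only sees the columns kept by the previous one, so chaining like and items returns no columns. A single filter call with a regex selects everything at once:

df.filter(regex='einkst_l|^name$|^year$', axis=1)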
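To see what chained filter calls return, here is a quick check on a toy frame (the column names einkst_l_1 and other are assumed for illustration). It also sketches pd.concat as one possible way to combine the two filter results; that alternative is a suggestion of this example, not something taken from the answers above.

```python
import pandas as pd

# Toy frame with made-up columns.
df = pd.DataFrame({
    "name": ["a"],
    "year": [2020],
    "einkst_l_1": [1],
    "other": [2],
})
e = "einkst_l"

# The second filter only sees the columns kept by the first one,
# so 'name' and 'year' are already gone at that point.
chained = df.filter(like=e, axis=1).filter(items=["name", "year"], axis=1)

# Concatenating the two filtered frames side by side keeps both selections.
combined = pd.concat(
    [df.filter(items=["name", "year"], axis=1), df.filter(like=e, axis=1)],
    axis=1,
)

chained_cols = list(chained.columns)
combined_cols = list(combined.columns)
```

Here chained ends up with no columns at all, while combined holds name, year and every column containing the substring.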