How to efficiently filter df by multiple dictionary sets. The example will be as following:
df = pd.DataFrame({'A':[10,20,20,10,20], 'B':[0,1,0,1,1], 'C':['up','down','up','down','down'],'D':[100,200,200,100,100]})
filter_sets = [{'A':10, 'B':0, 'C':'up'}, {'A':20, 'B':1, 'C':'down'}]
I only know that I can filter df by single dictionary by:
df.loc[(df[list(filter_set)] == pd.Series(filter_set)).all(axis=1)]
But is it possible to filter several dict masks at once?
** The format of filter_sets is not necessary to be something like above. If it can provide filter for multiple columns, then it is fine.
CodePudding user response:
Use np.logical_or.reduce with list comprehension:
mask = np.logical_or.reduce([(df[list(x)]==pd.Series(x)).all(axis=1) for x in filter_sets])
#alternative solution
mask = (pd.concat([(df[list(x)]==pd.Series(x)).all(axis=1) for x in filter_sets], axis=1)
.any(axis=1))
df2 = df[mask]
print (df2)
A B C D
0 10 0 up 100
1 20 1 down 200
4 20 1 down 100
Or if all keys are same is possible create helper DataFrame with merge:
df2 = pd.DataFrame(filter_sets).merge(df)
print (df2)
A B C D
0 10 0 up 100
1 20 1 down 200
2 20 1 down 100
