I want to filter all rows where, within each group of Date and Id, Type contains or is a subset of 'NCO - ETD'.
I wrote this code:
cond = 'NCO - ETD'
mask = data.groupby(['Date','Id'])['Type'].agg(set).apply(lambda x: any(x.issubset(cond)))
but I get TypeError: 'bool' object is not iterable
CodePudding user response:
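If you need the subset test, wrap cond in a list and drop the apply with any. issubset already returns a single bool, so passing it to any() raises the TypeError, and a bare string would be compared character by character. This returns one boolean per (Date, Id) group: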
mask = data.groupby(['Date','Id'])['Type'].agg(lambda x: set(x).issubset([cond]))
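As a rough illustration of the subset test, here is a sketch on a small made-up frame (the sample values are assumptions, only the column names come from the question). It also shows one possible way to turn the per-group result into a row-level mask: a group is a subset of [cond] exactly when every Type in it equals cond, so eq plus transform('all') is used here as an equivalent formulation.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-01", "2022-01-01", "2022-01-01", "2022-01-02"],
    "Id": [1, 1, 2, 2, 1],
    "Type": ["NCO - ETD", "NCO - ETD", "NCO - ETD", "OTHER", "OTHER"],
})
cond = "NCO - ETD"

# One boolean per (Date, Id) group: True when the group's Types are a subset of [cond].
per_group = data.groupby(["Date", "Id"])["Type"].agg(lambda x: set(x).issubset([cond]))

# Row-level equivalent: every Type in the group equals cond.
row_mask = data["Type"].eq(cond).groupby([data["Date"], data["Id"]]).transform("all")

# Keep only the rows that belong to groups passing the test.
filtered = data[row_mask]
print(per_group)
print(filtered)
```

The agg version is convenient for inspecting groups, while the transform version has the same index as data and can be used directly for boolean indexing.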
Or, if you need a substring test, create a helper column and then check whether at least one value per group is True with any:
cond = 'NCO - ETD'
mask = (data.assign(new = data['Type'].str.contains(cond))
            .groupby(['Date','Id'])['new']
            .transform('any'))
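A minimal sketch of the substring variant on made-up data follows (the sample values are assumptions). The regex=False and na=False arguments are additions not present in the answer above: they make cond match literally and treat missing values as non-matches, which may or may not be what you want.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-01", "2022-01-01", "2022-01-02", "2022-01-02"],
    "Id": [1, 1, 2, 1, 1],
    "Type": ["NCO - ETD / X", "OTHER", "OTHER", "NCO - ETD", "NCO - ETD"],
})
cond = "NCO - ETD"

# Helper column: True where Type contains cond as a literal substring.
# Then flag every row of a group in which at least one row matched.
mask = (data.assign(new=data["Type"].str.contains(cond, regex=False, na=False))
            .groupby(["Date", "Id"])["new"]
            .transform("any"))

# Keep all rows of the matching groups, including rows that did not match themselves.
filtered = data[mask]
print(filtered)
```

Because transform returns a Series aligned with data, the second row ("OTHER") is kept: it shares a group with a matching row.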
