Home > Enterprise >  Filter group contains specific values
Filter group contains specific values

Time:01-25

I want to filter group with specific values. Take the sample data below:

df = pd.DataFrame({'comp': ['a','b','a','b','c','d', 'c', 'c', 'd'], 'grid': [1,2,2,4,1,2,1,2,2]})

The df:

  comp  grid  val
0    a     1   10
1    b     2   11
2    a     2   12
3    b     4   13
4    c     1   14
5    d     2   15
6    c     1   16
7    c     2   17
8    d     2   18

I want to groupby comp and check if in each group the grid has at least one value 1 and one value 2. The expect outcome is group a and group c

  comp  grid  val
0    a     1   10
2    a     2   12
4    c     1   14
6    c     1   16
7    c     2   17

I try with groupby and transform but not successful

CodePudding user response:

You can use GroupBy.transform with compare sets:

#if contains 1,2 and some another values
df = df[df.groupby('comp')['grid'].transform(lambda x: set(x) >= set([1,2]))]
print (df)
  comp  grid
0    a     1
2    a     2
4    c     1
6    c     1
7    c     2

#if contains only 1,2 
df = df[df.groupby('comp')['grid'].transform(lambda x: set(x) == set([1,2]))]
print (df)
  comp  grid
0    a     1
2    a     2
4    c     1
6    c     1
7    c     2

If need at least one 1 and one 2 is possible use:

intersect = np.intersect1d(df.loc[df['grid'].eq(1),'comp'], df.loc[df['grid'].eq(2),'comp'])
print (intersect)
['a' 'c']

df = df[df['comp'].isin(intersect)]
print (df)
  comp  grid
0    a     1
2    a     2
4    c     1
6    c     1
7    c     2
  •  Tags:  
  • Related