Home > Mobile >  Getting unique col values based on condition in pandas
Getting unique col values based on condition in pandas

Time:01-27

I have a dataframe like so:

Input:

IP    Flag
101   Yes
101   Yes
101   No
103   Yes
103   Maybe
103   Yes
104   No
104   Yes

How can I get a list of unique IPs that have a flag of Yes and No?

Expected Output:

IP
101
104

CodePudding user response:

You could reshape your dataset with pd.crosstab grab only 'No' and 'Yes' columns and check which rows have no 0's at all (which should be True to work for you as you require that each IP has both yes and no):

m = pd.crosstab(df['IP'],df['Flag'])[['No','Yes']].ne(0).all(1)
print(m)

Out[100]: 
IP
101     True
103    False
104     True
dtype: bool

print(m[m].index.tolist())
[101, 104]

CodePudding user response:

One way is to groupby and aggregate a custom function:

mask = df.groupby('IP')['Flag'].agg(lambda x: {'Yes','No'} <= set(x))
mask.index[mask]

Output:

Int64Index([101, 104], dtype='int64', name='IP')
  •  Tags:  
  • Related