Create a new column based on conditional count of other column grouped by id

Time:01-21

I have a dataframe like the one below. My goal is to count the rows per id, but only where the country is one of usa, canada, or mexico; rows with any other country should get 0.

id country desired_output
a usa 1
b canada 2
b canada 2
c china 0
d mexico 3
d mexico 3
d mexico 3

CodePudding user response:

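You can do that with the following, which counts all rows per id and then sets the count to 0 on rows whose country is not in the list: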

df["desired_output"] = df.groupby(["id"])["id"].transform("count")
df.loc[~df['country'].isin(["usa","canada","mexico"]), 'desired_output'] = 0
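
To check this approach, here is a minimal sketch that assumes pandas is installed and rebuilds the sample frame from the question. The second frame, `mixed`, is a hypothetical input that is not in the question: its id `e` has one listed and one unlisted country. It suggests a caveat: the count is taken over all rows of an id, so the listed row of `e` gets 2 rather than 1.

```python
import pandas as pd

allowed = ["usa", "canada", "mexico"]

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": ["a", "b", "b", "c", "d", "d", "d"],
    "country": ["usa", "canada", "canada", "china", "mexico", "mexico", "mexico"],
})

# Count every row per id, then zero out rows with an unlisted country.
df["desired_output"] = df.groupby(["id"])["id"].transform("count")
df.loc[~df["country"].isin(allowed), "desired_output"] = 0
print(df["desired_output"].tolist())  # [1, 2, 2, 0, 3, 3, 3]

# Hypothetical input (not in the question): id "e" mixes listed and unlisted countries.
mixed = pd.DataFrame({"id": ["e", "e"], "country": ["usa", "china"]})
mixed["desired_output"] = mixed.groupby(["id"])["id"].transform("count")
mixed.loc[~mixed["country"].isin(allowed), "desired_output"] = 0
print(mixed["desired_output"].tolist())  # [2, 0]
```

If every id maps to a single country, as in the question's data, the caveat does not apply.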

CodePudding user response:

Use Series.isin with GroupBy.transform and Series.fillna:

In [21]: df['desired_output'] = df[df.country.isin(["usa","canada","mexico"])].groupby('id')['country'].transform('size')

In [24]: df['desired_output'] = df['desired_output'].fillna(0).astype(int)

In [25]: df
Out[25]: 
  id country  desired_output
0  a     usa               1
1  b  canada               2
2  b  canada               2
3  c   china               0
4  d  mexico               3
5  d  mexico               3
6  d  mexico               3
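
The fillna step is needed because the filtered frame has no row for china, so index alignment on assignment leaves NaN there. A minimal sketch of that behaviour, assuming pandas is installed and the sample frame from the question (the names `mask` and `sizes` are illustrative, not from the answer):

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "b", "b", "c", "d", "d", "d"],
    "country": ["usa", "canada", "canada", "china", "mexico", "mexico", "mexico"],
})

# True for rows whose country is in the list.
mask = df["country"].isin(["usa", "canada", "mexico"])

# Group sizes computed on the filtered rows only; row 3 (china) is absent.
sizes = df[mask].groupby("id")["country"].transform("size")
print(sizes.index.tolist())  # [0, 1, 2, 4, 5, 6]

# Assignment aligns on the index, so the missing row becomes NaN.
df["desired_output"] = sizes
print(df["desired_output"].isna().tolist())
# [False, False, False, True, False, False, False]

# Replace NaN with 0 and restore an integer dtype.
df["desired_output"] = df["desired_output"].fillna(0).astype(int)
print(df["desired_output"].tolist())  # [1, 2, 2, 0, 3, 3, 3]
```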

CodePudding user response:

Use Series.where with Series.isin to replace id (or any other column) with NaN on rows whose country is not in the list, then count the non-NaN values per id with GroupBy.transform and GroupBy.count:

L = ["usa","canada","mexico"]
df['desired_output'] = df.id.where(df.country.isin(L)).groupby(df['id']).transform('count')
print(df)
  id country  desired_output
0  a     usa               1
1  b  canada               2
2  b  canada               2
3  c   china               0
4  d  mexico               3
5  d  mexico               3
6  d  mexico               3
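
The two steps can be sketched separately, assuming pandas is installed and the sample frame from the question. The names `masked`, `via_count` and `via_sum` are illustrative. The last line is a variant that is not part of the answer: it sums a 0/1 flag per id instead of counting non-NaN values, and should give the same result on this data.

```python
import pandas as pd

L = ["usa", "canada", "mexico"]
df = pd.DataFrame({
    "id": ["a", "b", "b", "c", "d", "d", "d"],
    "country": ["usa", "canada", "canada", "china", "mexico", "mexico", "mexico"],
})

# Step 1: keep id where the country is listed, missing elsewhere.
masked = df["id"].where(df["country"].isin(L))
print(masked.isna().tolist())
# [False, False, False, True, False, False, False]

# Step 2: count the non-missing values within each original id group.
via_count = masked.groupby(df["id"]).transform("count")
print(via_count.tolist())  # [1, 2, 2, 0, 3, 3, 3]

# Variant (not from the answer): sum a 0/1 membership flag per id.
via_sum = df["country"].isin(L).astype(int).groupby(df["id"]).transform("sum")
print(via_sum.tolist())  # [1, 2, 2, 0, 3, 3, 3]
```

Because the grouping key is the original id column, every row of an id receives that id's count, including rows whose own country is not in the list.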