I have a dataframse such as
Groups SP Letters
G1 SP1 A
G1 SP1 Z
G1 SP1 A
G1 SP2 X
G1 SP3 X
G1 SP4 B
G2 SP2 A
G2 SP5 B
G2 SP3 X
G2 SP23 A
and I would like to add a column (Nb_column) which would be the count of unique SP value within each Groups but where Letters are in a list the_list=['A','B']
here I should then get:
Groups SP Letters Nb_column
G1 SP1 A 2
G1 SP1 Z 2
G1 SP1 A 2
G1 SP2 X 2
G1 SP3 X 2
G1 SP4 B 2
G2 SP2 A 3
G2 SP5 B 3
G2 SP3 X 3
G2 SP23 A 3
Where
- there are 2
SP(SP1,SP4) with aLettersinthe_listfor the G1. And - there are 3
SP(SP2,SP5 and SP23) with aLettersinthe_listfor the G2.
Does someone have an idea in pandas please ?
CodePudding user response:
You can use groupby transform nunique on a masked version of SP (by default the NaNs are not counted by nunique):
df['Nb_columns'] = (df['SP'].where(df['Letters'].isin(the_list))
.groupby(df['Groups']).transform('nunique')
)
output:
Groups SP Letters Nb_columns
0 G1 SP1 A 2
1 G1 SP1 Z 2
2 G1 SP1 A 2
3 G1 SP2 X 2
4 G1 SP3 X 2
5 G1 SP4 B 2
6 G2 SP2 A 3
7 G2 SP5 B 3
8 G2 SP3 X 3
9 G2 SP23 A 3
CodePudding user response:
Use isin to check for the existence, then groupby().nunique() and map back:
nb = df[df['Letters'].isin(lst)].groupby('Groups')['SP'].nunique()
df['Nb_column'] = df['Groups'].map(nb)
CodePudding user response:
We can select from the SP column based on where Letters isin the_list using loc then use groupby nunique to get the corresponding number of unique values per group. Then rename and join back to the DataFrame to make the new column aligned with Groups:
df = df.join(
df.loc[df['Letters'].isin(the_list), 'SP']
.groupby(df['Groups']).nunique()
.rename('Nb_column'),
on='Groups'
)
Alternatively to join we can reindex to scale and create the new column from that:
df['Nb_column'] = (
df.loc[df['Letters'].isin(the_list), 'SP']
.groupby(df['Groups']).transform('nunique')
.reindex(index=df.index, method='ffill')
)
df:
Groups SP Letters Nb_column
0 G1 SP1 A 2
1 G1 SP1 Z 2
2 G1 SP1 A 2
3 G1 SP2 X 2
4 G1 SP3 X 2
5 G1 SP4 B 2
6 G2 SP2 A 3
7 G2 SP5 B 3
8 G2 SP3 X 3
9 G2 SP23 A 3
