I have a DataFrame like the following:
col1 col2
-----------
a 1b
a 1b
a 1a
b 2a
b 3f
I want to count how many unique (col1, col2) pairs each col1 value has. Expected output:
(a, 2)
(b, 2)
CodePudding user response:
You want to count the number of unique values in col2 after grouping on col1:
df.groupby(['col1']).nunique()['col2']
#col1
#a 2
#b 2
If you want it in the format you mentioned, you can pass the index and values into zip:
counts = df.groupby(['col1']).nunique()['col2']
list(zip(counts.index, counts.values))
[('a', 2), ('b', 2)]
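As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the question and selects col2 before calling nunique, so only that column is aggregated. The variable names `counts` and `pairs` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["1b", "1b", "1a", "2a", "3f"],
})

# Select col2 before aggregating so only that column is counted.
counts = df.groupby("col1")["col2"].nunique()

# Series.items() yields (index label, value) pairs; int() gives plain Python ints.
pairs = [(key, int(n)) for key, n in counts.items()]
print(pairs)  # [('a', 2), ('b', 2)]
```

Computing the grouped result once and reusing it avoids running the groupby twice.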
CodePudding user response:
As a DataFrame:
df.groupby("col1", as_index=False).nunique()
  col1  col2
0    a     2
1    b     2
In the format mentioned:
list(df.groupby("col1", as_index=False).nunique().to_records(index=False))
[('a', 2), ('b', 2)]
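Note that to_records returns NumPy record objects, which may print like tuples but are not plain Python tuples. If plain tuples are needed, one possible variant (a sketch, using reset_index in place of as_index=False; the names `summary` and `records` are illustrative) is itertuples with name=None:

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["1b", "1b", "1a", "2a", "3f"],
})

# Count distinct col2 values per col1, then turn the index back into a column.
summary = df.groupby("col1")["col2"].nunique().reset_index()

# name=None makes itertuples return regular tuples instead of namedtuples.
records = list(summary.itertuples(index=False, name=None))
print(records)
```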
CodePudding user response:
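Grouping by both columns gives the size of each (col1, col2) pair; counting those groups per col1 then gives the number of unique pairs:
df.groupby(['col1', 'col2']).size().groupby(level='col1').size()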
CodePudding user response:
df.drop_duplicates(['col1', 'col2'])[['col1']].value_counts()
or
list(map(tuple, df.groupby('col1', as_index=False).nunique().values))
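The drop_duplicates approach can be checked against groupby/nunique on the sample data. This is a minimal sketch; `via_dedup` and `via_nunique` are illustrative names. One caveat worth keeping in mind: nunique ignores NaN by default, while drop_duplicates keeps rows containing NaN, so the two may disagree on data with missing values.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["1b", "1b", "1a", "2a", "3f"],
})

# Keep one row per distinct (col1, col2) pair, then count rows per col1.
deduped = df.drop_duplicates(["col1", "col2"])
via_dedup = deduped["col1"].value_counts().sort_index()

# Reference result: distinct col2 values per col1.
via_nunique = df.groupby("col1")["col2"].nunique()

print(via_dedup.to_dict())                           # {'a': 2, 'b': 2}
print(via_dedup.to_dict() == via_nunique.to_dict())  # True
```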
