Home > Blockchain >  Include indices in Pandas groupby results
Include indices in Pandas groupby results

Time:02-03

With Pandas groupby, I can do things like this:

>>> df = pd.DataFrame(
...     {
...         "A": ["foo", "bar", "bar", "foo", "bar"],
...         "B": ["one", "two", "three", "four", "five"],
...     }
... )
>>> print(df)
     A      B
0  foo    one
1  bar    two
2  bar  three
3  foo   four
4  bar   five
>>> print(df.groupby('A')['B'].unique())
A
bar    [two, three, five]
foo           [one, four]
Name: B, dtype: object

What I am looking for is output that produces a list of indices instead of a list of column B:

A
bar    [1, 2, 4]
foo    [0, 3]

However, groupby('A').index.unique() doesn't work. What syntax would provide me the output I'm after? I'd be more than happy to do this in some other way than with groupby, although I do need to group by two columns in my real application.

CodePudding user response:

You do not necessarily need to have a label in groupby, you can use a grouping object.

This enables things like:

df.index.to_series().groupby(df['A']).unique()

output:

A
bar    [1, 2, 4]
foo       [0, 3]
dtype: object
getting the indices of the unique B values:
df[~df[['A', 'B']].duplicated()].index.to_series().groupby(df['A']).unique()

CodePudding user response:

Use enter image description here

  •  Tags:  
  • Related