Using pandas groupby to write new information into the original DataFrame?

Time:01-11

I have two columns in my dataframe that I want to group by and assign ids to.

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 3, 4],
                   'B': [1, 2, 3, 4, 5, 4]})

A   B
1   1
2   2
3   3
4   4
3   5
4   4


grouped = df.groupby(['A','B'])

gives the following groups:

A   B
1   1
2   2
3   3
    5
4   4
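
Strictly speaking, `df.groupby(['A','B'])` returns a lazy `DataFrameGroupBy` object rather than a table. A minimal sketch of how the distinct (A, B) groups listed above could be inspected, assuming the same `df`; `group_sizes` is an illustrative name, not from the original post:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 3, 4],
                   'B': [1, 2, 3, 4, 5, 4]})

grouped = df.groupby(['A', 'B'])

# size() yields one entry per distinct (A, B) pair, indexed by a MultiIndex
group_sizes = grouped.size()
print(group_sizes)

# ngroups is the number of distinct groups
print(grouped.ngroups)
```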

I am trying to assign a unique id to each grouping.

import uuid

def idx(x):  # x (the group's data) is ignored
    return str(uuid.uuid4())

grouped.agg(lambda x: idx(x))

which returns a pandas Series indexed by (A, B):

A  B
1  1    ab6ac10e-7dbc-43a4-9f93-cc0c83ec2d03
2  2    c26548ec-9002-4ad5-bad9-c84f8c594c9b
3  3    8daab68b-51aa-42b3-8546-3b64ee73f460
   5    cb8f7da1-81de-4bed-8ae9-790c64ac66e2
4  4    b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe
dtype: object

What I am trying to do is write this Series of unique ids back into the original DataFrame, so that rows in the same group share the same id. I expect something like this:

A   B   idx
1   1   ab6ac10e-7dbc-43a4-9f93-cc0c83ec2d03
2   2   c26548ec-9002-4ad5-bad9-c84f8c594c9b
3   3   8daab68b-51aa-42b3-8546-3b64ee73f460
4   4   b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe
3   5   cb8f7da1-81de-4bed-8ae9-790c64ac66e2
4   4   b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe

CodePudding user response:

Reindex the aggregated result by the (A, B) pairs of the original rows, then assign the values back:

df['new'] = grouped.agg(lambda x: idx(x)).reindex(pd.MultiIndex.from_frame(df)).values
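
For reference, a self-contained sketch of the same reindex idea. It builds the id Series directly from the distinct (A, B) pairs instead of going through `agg`; that substitution is an assumption made here for illustration, not the code above. `row_keys`, `group_keys` and `ids` are illustrative names.

```python
import uuid
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 3, 4],
                   'B': [1, 2, 3, 4, 5, 4]})

# One (A, B) key per row of the original frame (duplicates kept)
row_keys = pd.MultiIndex.from_frame(df[['A', 'B']])

# One random UUID string per distinct (A, B) pair
group_keys = row_keys.unique()
ids = pd.Series([str(uuid.uuid4()) for _ in range(len(group_keys))],
                index=group_keys)

# Look up each row's key in ids; .to_numpy() drops the MultiIndex so the
# values are assigned by position against df's own index
df['idx'] = ids.reindex(row_keys).to_numpy()
print(df)
```

Rows that share the same (A, B) pair, such as the two (4, 4) rows, end up with the same id. If consecutive integers are acceptable instead of UUIDs, `df.groupby(['A', 'B']).ngroup()` gives one integer per group, already aligned to the original rows.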