Create n groups from a csv file-CodePudding

I have a csv file with the following structure

Name	Position
Albert	Producer
John	Director
Anna	Volunteer
Gaston	Tech
other 51 names	other 51 positions

I need to create random groups where each group has one member from each position. There are only four positions across the dataset. The final result would be something like

Name	Position	Group
Albert	Producer	Group 1
John	Director	Group 1
Anna	Volunteer	Group 1
Gaston	Tech	Group 1
Martin	Producer	Group 2
Vanessa	Director	Group 2
Charles	Volunteer	Group 2
Milan	Tech	Group 2

What is the most efficient way to create these groups?

Thank you in advance!

CodePudding user response：

You can use two groupby. One to shuffle the Positions using sample(frac=1), and the other to assign the groups:

(df.groupby('Position').pipe(lambda d: d.sample(frac=1))
   .assign(Group=lambda d: d.groupby('Position').cumcount().add(1))
   .sort_values(by=['Group', 'Position']) # optional
)

example output:

      Name   Position  Group
1     John   Director      1
4   Martin   Producer      1
3   Gaston       Tech      1
6  Charles  Volunteer      1
5  Vanessa   Director      2
0   Albert   Producer      2
7    Milan       Tech      2
2     Anna  Volunteer      2