I have a pandas dataframe like:
**CATEGORY USER**
A Alfa
A Bravo
B Charlie
B Delta
C Delta
D Echo
C Foxtrot
E Foxtrot
D Golf
F Hotel
F India
G Juliett
H Kilo
I want to group by the first column and get second column as lists in rows:
CATEGORY USER RELATION Lv. 1 *RELATION Lv. 2***
A Alfa [Alfa, Bravo] [Alfa, Bravo]
A Bravo [Alfa, Bravo] [Alfa, Bravo]
B Charlie [Charlie, Delta] [Charlie, Delta, Foxtrot]
B Delta [Charlie, Delta] [Charlie, Delta, Foxtrot]
C Delta [Delta, Foxtrot] [Charlie, Delta, Foxtrot]
D Echo [Echo, Golf] [Echo, Golf]
C Foxtrot [Delta, Foxtrot] [Charlie, Delta, Foxtrot]
E Foxtrot [Foxtrot] [Charlie, Delta, Foxtrot]
D Golf [Echo, Golf] [Echo, Golf]
F Hotel [Hotel, India] [Hotel, India]
F India [Hotel, India] [Hotel, India]
G Juliett [Juliett] [Juliett]
H Kilo [Kilo] [Kilo]
Is it possible to do something like this using pandas groupby?
CodePudding user response:
We can agg to the list then reindex assign it back
df['new'] = df.groupby('CATEGORY')['USER'].agg(list).reindex(df.CATEGORY).tolist()
df
CATEGORY USER new
0 A Alfa [Alfa, Bravo]
1 A Bravo [Alfa, Bravo]
2 B Charlie [Charlie, Delta]
3 B Delta [Charlie, Delta]
4 C Delta [Delta, Foxtrot]
5 D Echo [Echo, Golf]
6 C Foxtrot [Delta, Foxtrot]
7 E Foxtrot [Foxtrot]
8 D Golf [Echo, Golf]
9 F Hotel [Hotel, India]
10 F India [Hotel, India]
11 G Juliett [Juliett]
12 H Kilo [Kilo]
CodePudding user response:
Here's one approach:
(i) groupby "CATEGORY" and create sets of "USER"s for each category
(ii) Using a double loop, create sets of second order relations using set.intersection and iteratively update dictionary d using dict.setdefault
(iii) Convert the sets to lists and map to "CATEGORY" column to match df rows.
sets = df.groupby('CATEGORY')['USER'].apply(set)
d = {}
for k, st1 in zip(sets.index.tolist(), sets):
for st2 in sets:
if st1.intersection(st2):
d.setdefault(k, set()).update(st2)
df['RELATION Lv. 1'] = [list(x) for x in df['CATEGORY'].map(sets)]
df['RELATION Lv. 2'] = [list(x) for x in df['CATEGORY'].map(d)]
Output:
CATEGORY USER RELATION Lv. 1 RELATION Lv. 2
0 A Alfa [Alfa, Bravo] [Alfa, Bravo]
1 A Bravo [Alfa, Bravo] [Alfa, Bravo]
2 B Charlie [Delta, Charlie] [Foxtrot, Delta, Charlie]
3 B Delta [Delta, Charlie] [Foxtrot, Delta, Charlie]
4 C Delta [Foxtrot, Delta] [Foxtrot, Delta, Charlie]
5 D Echo [Echo, Golf] [Echo, Golf]
6 C Foxtrot [Foxtrot, Delta] [Foxtrot, Delta, Charlie]
7 E Foxtrot [Foxtrot] [Foxtrot, Delta]
8 D Golf [Echo, Golf] [Echo, Golf]
9 F Hotel [Hotel, India] [Hotel, India]
10 F India [Hotel, India] [Hotel, India]
11 G Juliett [Juliett] [Juliett]
12 H Kilo [Kilo] [Kilo]
