This is my dataset:
| Dept | Cell culture | Bioinfo | Immunology | Trigonometry | Algebra | Biotech | Optics |
|---|---|---|---|---|---|---|---|
| Biotech | 1 | 1 | 1 | 0 | 0 | 1 | 0 |
| Math | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| Physics | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
How I want my result:
| Dept | 0 |
|---|---|
| Biotech | Cell culture |
| Biotech | Bioinfo |
| Biotech | Immunology |
| Math | Trigonometry |
| Math | Algebra |
| Physics | Optics |
I need to form pairs that have the value one, but I also need to rid of those values which are the same in both column and row index - such as biotech here. Is there an easy way to do this?
CodePudding user response:
Try this:
#df = df.set_index('Dept') if needed move dept into the index
df.dot(df.columns ',').str.strip(',').str.split(',').explode().reset_index()
Output:
Dept 0
0 Biotech Cell culture
1 Biotech Bioinfo
2 Biotech Immunology
3 Biotech Biotech
4 Math Trigonometry
5 Math Algebra
6 Physics Optics
