I have a table stored in a CSV file:
| A | B |
|---|---|
| 35480007 | 0695388 |
| 35480007 | 0695388 |
| 35407109 | 3324741 |
| 35407109 | 3324741 |
| 35250208 | 0695388 |
| 35250208 | 6104556 |
| 86730903 | 3360935 |
| 86730903 | 3360935 |
How can I aggregate the data with the pandas library to show which values from column A correspond to each value in column B?
As a result, I need to display the following information:
The value 0695388 from column B corresponds to the values 35480007 and 35250208 from column A, and so on for the other values in column B. Duplicates in column A should not be counted.
CodePudding user response:
Try with groupby:
>>> df.groupby("B")["A"].unique()
B
695388     [35480007, 35250208]
3324741              [35407109]
3360935              [86730903]
6104556              [35250208]
Name: A, dtype: object
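Note that the output above shows 695388 rather than 0695388, which suggests column B was parsed as an integer and lost its leading zero. Here is a minimal, self-contained sketch of the same groupby approach that reads both columns as strings to keep the zero. The inline `csv_text` and the names `df` and `result` are assumptions for illustration; in practice you would pass the path to your own file to `pd.read_csv`, and your file's delimiter may differ.

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the table in the question.
csv_text = """A,B
35480007,0695388
35480007,0695388
35407109,3324741
35407109,3324741
35250208,0695388
35250208,6104556
86730903,3360935
86730903,3360935
"""

# dtype=str keeps values such as "0695388" intact instead of
# converting them to the integer 695388.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# For each value in B, collect the distinct values of A
# (duplicates in A are dropped, first-seen order is kept).
result = df.groupby("B")["A"].unique()

# Print one sentence per value of B, in the shape asked for in the question.
for b_value, a_values in result.items():
    print(f"The value {b_value} from column B corresponds to the values "
          f"from column A: {', '.join(a_values)}")
```

If you only need the number of distinct A values per B value, `df.groupby("B")["A"].nunique()` should give that directly.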
