How do I perform the following dataframe operation going from Dataframe A to dataframe B in pandas for python? I have tried pivot and groupby but I keep getting errors. Any support is greatly appreciated.
DataFrame A
| Col A | Col B |
|---|---|
| 100 | 1 |
| 100 | 2 |
| 200 | 3 |
| 200 | 4 |
DataFrame B
| Col A & B |
|---|
| 1 |
| 2 |
| 100 |
| 3 |
| 4 |
| 200 |
CodePudding user response:
One option using groupby:
out = (df
.groupby('Col A', group_keys=False, sort=False)
.apply(lambda d: d.iloc[:, ::-1].unstack().drop_duplicates())
.reset_index(drop=True).to_frame(name='Col A&B')
)
Another with concat:
out = (pd
.concat([df['Col B'], df['Col A'].drop_duplicates(keep='last')])
.sort_index().reset_index(drop=True).to_frame(name='Col A&B')
)
output:
Col A&B
0 1
1 2
2 100
3 3
4 4
5 200
If order does not matter, you can stack:
out = df.stack().drop_duplicates().reset_index(drop=True).to_frame(name='Col A&B')
output:
Col A&B
0 100
1 1
2 2
3 200
4 3
5 4
CodePudding user response:
Another possible solution:
out = pd.DataFrame({'Col A&B': np.unique(df)})
out
Output:
Col A&B
0 1
1 2
2 3
3 4
4 100
5 200
