I have two dataframes like
df1:
1 2 3 4
5 6 7 8
1 2 3 4
df2:
1 2 3 4
5 6 7 8
7 8 9 1
Raw data:
df1 = pd.DataFrame({'col1': [1, 5, 1], 'col2': [2, 6, 2],
'col3': [3, 7, 3], 'col4': [4, 8, 4]})
df2 = pd.DataFrame({'col1': [1, 5, 7], 'col2': [2, 6, 8],
'col3': [3, 7, 9], 'col4': [4, 8, 1]})
the desired output is
final_df:
[1,1] [2,2] [3,3] [4,4]
[5,5] [6,6] [7,7] [8,8]
[1,7] [2,8] [3,9] [4,1]
CodePudding user response:
You can convert each dataframe to numpy arrays, use np.dstack to stack them along the third axis and pass it back to a pandas DataFrame:
out = pd.DataFrame(np.dstack([df1.to_numpy(), df2.to_numpy()]).tolist(), columns=df1.columns)
Output:
col1 col2 col3 col4
0 [1, 1] [2, 2] [3, 3] [4, 4]
1 [5, 5] [6, 6] [7, 7] [8, 8]
2 [1, 7] [2, 8] [3, 9] [4, 1]
CodePudding user response:
A simple solution is to wrap each number in a list and a simple addition will work. The advantage is that rows/columns don't need to be identical nor in the same order as pandas performs an alignement.
The numpy approach will be more efficient for large dataframes though.
f = lambda x: [x]
df1.applymap(f) df2.applymap(f)
Output:
col1 col2 col3 col4
0 [1, 1] [2, 2] [3, 3] [4, 4]
1 [5, 5] [6, 6] [7, 7] [8, 8]
2 [1, 7] [2, 8] [3, 9] [4, 1]
