Merging Column values of Different dataframe into one-CodePudding

I have two dataframes like

df1:

1 2 3 4
5 6 7 8
1 2 3 4

df2:

1 2 3 4
5 6 7 8
7 8 9 1

Raw data:

df1 = pd.DataFrame({'col1': [1, 5, 1], 'col2': [2, 6, 2], 
                    'col3': [3, 7, 3], 'col4': [4, 8, 4]})
df2 = pd.DataFrame({'col1': [1, 5, 7], 'col2': [2, 6, 8], 
                    'col3': [3, 7, 9], 'col4': [4, 8, 1]})

the desired output is

final_df:

[1,1] [2,2] [3,3] [4,4]
[5,5] [6,6] [7,7] [8,8]
[1,7] [2,8] [3,9] [4,1]

CodePudding user response：

You can convert each dataframe to numpy arrays, use np.dstack to stack them along the third axis and pass it back to a pandas DataFrame:

out = pd.DataFrame(np.dstack([df1.to_numpy(), df2.to_numpy()]).tolist(), columns=df1.columns)

Output:

     col1    col2    col3    col4
0  [1, 1]  [2, 2]  [3, 3]  [4, 4]
1  [5, 5]  [6, 6]  [7, 7]  [8, 8]
2  [1, 7]  [2, 8]  [3, 9]  [4, 1]

CodePudding user response：

A simple solution is to wrap each number in a list and a simple addition will work. The advantage is that rows/columns don't need to be identical nor in the same order as pandas performs an alignement.

The numpy approach will be more efficient for large dataframes though.

f = lambda x: [x]
df1.applymap(f) df2.applymap(f)

Output:

     col1    col2    col3    col4
0  [1, 1]  [2, 2]  [3, 3]  [4, 4]
1  [5, 5]  [6, 6]  [7, 7]  [8, 8]
2  [1, 7]  [2, 8]  [3, 9]  [4, 1]