Here is my dataset:
-
COL_1 COL_2 COL_2 COL_4 COL_4 COL_8 COL_9 A col2 col3 col4 col5 B col2 col3 col4 col5 C col2 col3 col4 col5
I need to move pairs of columns (COL_2 , COL_3 and then COL_4, COL_5) to COL_8 and COL_9 and copy the content of the 1st column (COL_1) as well as put the names of columns (COL_2, COL_4) to the other column (COL_10) this way:
-
COL_1 COL_8 COL_9 COL10 A col2 col3 COL2 B col2 col3 COL2 C col2 col3 COL2 A col4 col5 COL4 B col4 col5 COL4 C col4 col5 COL4
Please note that some of column names in initial dataset are identical. How can I do it using Python?
CodePudding user response:
To achieve the illustrated result you could use the function concat of pandas and create the new dataframe's column like this:
new_df = pd.DataFrame()
new_df['COL_1'] = pd.concat([df['COL_1'], df['COL_1']])
new_df['COL_8'] = pd.concat([df['COL_2'], df['COL_4']])
new_df['COL_9'] = pd.concat([df['COL_3'], df['COL_5']])
new_df['COL_10'] = pd.concat([df['COL_2'], df['COL_4']])
CodePudding user response:
This should work:
part1 = df[['COL_1', 'COL_2']].set_axis([0,1,2], axis=1)
part2 = df[['COL_1', 'COL_4']].set_axis([0,1,2], axis=1)
new_df = pd.concat([part1, part2], ignore_index=True)
new_df[3] = new_df[1]
new_df.columns = ['COL_1', 'COL_8', 'COL_9', 'COL_10']
Output:
>>> new_df
COL_1 COL_8 COL_9 COL_10
0 A col2 col3 col2
1 B col2 col3 col2
2 C col2 col3 col2
3 A col4 col5 col4
4 B col4 col5 col4
5 C col4 col5 col4
