Home > Mobile >  Data manipulation- Python
Data manipulation- Python

Time:01-30

Here is my dataset:

  •     COL_1    COL_2    COL_2   COL_4   COL_4  COL_8    COL_9
        A         col2    col3    col4    col5
        B         col2    col3    col4    col5
        C         col2    col3    col4    col5
    

I need to move pairs of columns (COL_2 , COL_3 and then COL_4, COL_5) to COL_8 and COL_9 and copy the content of the 1st column (COL_1) as well as put the names of columns (COL_2, COL_4) to the other column (COL_10) this way:

  •    COL_1  COL_8   COL_9   COL10
       A       col2   col3    COL2
       B       col2   col3    COL2
       C       col2   col3    COL2
       A       col4   col5    COL4
       B       col4   col5    COL4
       C       col4   col5    COL4
    

Please note that some of column names in initial dataset are identical. How can I do it using Python?

CodePudding user response:

To achieve the illustrated result you could use the function concat of pandas and create the new dataframe's column like this:

new_df = pd.DataFrame()

new_df['COL_1'] = pd.concat([df['COL_1'], df['COL_1']])
new_df['COL_8'] = pd.concat([df['COL_2'], df['COL_4']])
new_df['COL_9'] = pd.concat([df['COL_3'], df['COL_5']])
new_df['COL_10'] = pd.concat([df['COL_2'], df['COL_4']])

CodePudding user response:

This should work:

part1 = df[['COL_1', 'COL_2']].set_axis([0,1,2], axis=1)
part2 = df[['COL_1', 'COL_4']].set_axis([0,1,2], axis=1)
new_df = pd.concat([part1, part2], ignore_index=True)
new_df[3] = new_df[1]
new_df.columns = ['COL_1', 'COL_8', 'COL_9', 'COL_10']

Output:

>>> new_df
  COL_1 COL_8 COL_9 COL_10
0     A  col2  col3   col2
1     B  col2  col3   col2
2     C  col2  col3   col2
3     A  col4  col5   col4
4     B  col4  col5   col4
5     C  col4  col5   col4
  •  Tags:  
  • Related