Merge data sheets to same column

I have two sets of data on student performance:

dict1 --- performance for the 1st semester
dict2 --- performance for the 2nd semester

I need to concatenate the two DataFrames so that each discipline column gets one sub-column per semester.

import pandas as pd
dict1  = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['a', 'na', 'a'], 'Dicsipline2': ['a', 'na', 'a']}
dict2 = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['na', 'a', 'a'], 'Dicsipline2': ['a', 'a', 'a']}
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)

In the desired result, the columns gain one extra level (the semester) under each discipline.

CodePudding user response：

Use:

df = pd.concat({'1': df1.set_index('Students'),
                '2': df2.set_index('Students')},
               axis=1).swaplevel(axis=1).sort_index(axis=1)

Or:

df = pd.concat({'1': df1.set_index('Students'),
                '2': df2.set_index('Students')}).unstack(level=0)

             Dicsipline1     Dicsipline2   
                   1   2           1  2
Students                               
A                  a  na           a  a
B                 na   a          na  a
C                  a   a           a  a

Note: you can call reset_index() at the end to move Students from the index back into the columns.
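As a minimal sketch of that note, the snippet below rebuilds the frame from the first variant and then calls reset_index(). The names wide and flat are illustrative only and not part of the answer. With two column levels, the restored column is stored under the tuple ('Students', ''), because the second level is filled with an empty string.

```python
import pandas as pd

df1 = pd.DataFrame({'Students': ['A', 'B', 'C'],
                    'Dicsipline1': ['a', 'na', 'a'],
                    'Dicsipline2': ['a', 'na', 'a']})
df2 = pd.DataFrame({'Students': ['A', 'B', 'C'],
                    'Dicsipline1': ['na', 'a', 'a'],
                    'Dicsipline2': ['a', 'a', 'a']})

# Same construction as the first variant: semester keys become the inner column level.
wide = pd.concat({'1': df1.set_index('Students'),
                  '2': df2.set_index('Students')},
                 axis=1).swaplevel(axis=1).sort_index(axis=1)

# Move the 'Students' index back into the columns.
flat = wide.reset_index()

# The restored column sits under the two-level key ('Students', '').
print(flat.columns.tolist())
print(flat[('Students', '')].tolist())
```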

CodePudding user response：

Do you mean something like this?

linked_dict = {key: values + dict2[key] for key, values in dict1.items()}
df = pd.DataFrame(linked_dict)

If not, please be more specific about what you want to achieve (ideally, show the desired result).

Edit: you can do it like this (not exactly what you want, but close):

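# Key each discipline column by (discipline, semester); 'Students' keeps its plain key
dict_1_new = {(key, 'Semester1') if 'Dicsipline' in key else key: value for key, value in dict1.items()}
dict_2_new = {(key, 'Semester2') if 'Dicsipline' in key else key: value for key, value in dict2.items()}
# The | dict union operator requires Python 3.9+
linked_dict = dict_1_new | dict_2_new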
df = pd.DataFrame(linked_dict)
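One possible way to get from this "close" result to two real column levels is to make every key a tuple and pass the students as the index. The sketch below assumes both dicts list the students in the same order. The helper tag_semester and the labels 'Semester1' and 'Semester2' are hypothetical names chosen for illustration.

```python
import pandas as pd

dict1 = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['a', 'na', 'a'], 'Dicsipline2': ['a', 'na', 'a']}
dict2 = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['na', 'a', 'a'], 'Dicsipline2': ['a', 'a', 'a']}

def tag_semester(data, label):
    # Hypothetical helper: key each discipline by (discipline, semester), skipping 'Students'.
    return {(key, label): values for key, values in data.items() if key != 'Students'}

columns = {**tag_semester(dict1, 'Semester1'), **tag_semester(dict2, 'Semester2')}

# Assumption: dict1 and dict2 list the students in the same order.
students = pd.Index(dict1['Students'], name='Students')

# All keys are tuples, so pandas builds MultiIndex columns; sorting groups them by discipline.
df = pd.DataFrame(columns, index=students).sort_index(axis=1)
print(df)
```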

CodePudding user response：

The code below will work for your case:

import pandas as pd

dict1  = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['a', 'na', 'a'], 'Dicsipline2': ['a', 'na', 'a']}
dict2 = {'Students': ['A', 'B', 'C'], 'Dicsipline1': ['na', 'a', 'a'], 'Dicsipline2': ['a', 'a', 'a']}
df1 = pd.DataFrame(dict1)
df2 = pd.DataFrame(dict2)

df = (pd.concat([df1.set_index('Students'),
                 df2.set_index('Students')],
                axis=1,
                keys=['0', '1'])
        .swaplevel(0, 1, axis=1)
        .sort_index(axis=1, ascending=[True, False])
      )
print(df)

The output will be:

         Dicsipline1     Dicsipline2    
                   1   0           1   0
Students
A                 na   a           a   a
B                  a  na           a  na
C                  a   a           a   a
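For comparison, passing a dict to pd.concat and passing a list together with keys= should label the outer column level in the same way. The sketch below checks this on the sample data. The names s1, s2, from_dict and from_keys are illustrative only, and the semester labels '1' and '2' are an assumption.

```python
import pandas as pd

df1 = pd.DataFrame({'Students': ['A', 'B', 'C'],
                    'Dicsipline1': ['a', 'na', 'a'],
                    'Dicsipline2': ['a', 'na', 'a']})
df2 = pd.DataFrame({'Students': ['A', 'B', 'C'],
                    'Dicsipline1': ['na', 'a', 'a'],
                    'Dicsipline2': ['a', 'a', 'a']})

s1 = df1.set_index('Students')
s2 = df2.set_index('Students')

# Dict form: the dict keys become the outer column level.
from_dict = pd.concat({'1': s1, '2': s2}, axis=1)

# List form: the same labels are supplied through keys=.
from_keys = pd.concat([s1, s2], axis=1, keys=['1', '2'])

# Both frames carry the same labels and values.
print(from_dict.equals(from_keys))
```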