Pandas: read Excel rows as a dictionary of dictionaries

I have sample Excel data like this:

Col1    Col2    Col3        Col4    Col5 . . .
MIT1    AD      Profile     MM      MD
MIT1    AES     Document    MM      MD
MA3     ALL     Price       Retail  -
MA4     BEG     Group       MM      Eh

I want to read this Excel file into the following format:

{'MIT1': [{'AD':['Profile','MM', 'MD']},{'AES':['Document','MM', 'MD']},...]... }

Here Col1 becomes the key of the outer dictionary, Col2 becomes the key of each inner dictionary, and the values from Col3 onwards become that inner dictionary's list of values. (There are more columns; I have shown only five for brevity.)

I wrote this code but it doesn't create the desired format:

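    import pandas as pd

    # Read the sheet, using the first row as column names
    df1 = pd.read_excel('Example.xlsx', sheet_name='Datamodel', header=0)
    df1 = df1.dropna()        # drop rows that contain missing values
    df1_dict = df1.to_dict()  # keyed by column name, not by Col1/Col2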

How can I get the correct format?

CodePudding user response:

You can set the first two columns as a MultiIndex, collapse the remaining columns of each row into a list (which gives a Series), and then build the nested dictionaries from that Series:

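# Index by the first two columns, then turn the rest of each row into a list
s = df.set_index(['Col1', 'Col2']).agg(list, axis=1)
# For each Col1 value, take its sub-Series (indexed by Col2) and convert it to a dict
d = {level: s.xs(level).to_dict() for level in s.index.levels[0]}
print(d)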
{'MA3': {'ALL': ['Price', 'Retail', '-']}, 
 'MA4': {'BEG': ['Group', 'MM', 'Eh']}, 
 'MIT1': {'AD': ['Profile', 'MM', 'MD'], 
         'AES': ['Document', 'MM', 'MD']}}
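
For comparison, here is a small sketch of why the plain `to_dict()` call from the question does not produce this layout. It uses a cut-down, in-memory copy of the sample rows (an assumption, so that no Excel file is needed); the names `by_column` and `by_row` are only illustrative.

```python
import pandas as pd

# Two of the sample rows, rebuilt in memory (assumption: stands in for Example.xlsx)
df = pd.DataFrame(
    [["MIT1", "AD", "Profile"], ["MA3", "ALL", "Price"]],
    columns=["Col1", "Col2", "Col3"],
)

# Default orient: outer keys are column names, inner keys are row labels
by_column = df.to_dict()
print(by_column["Col1"])  # {0: 'MIT1', 1: 'MA3'}

# orient='records': one flat dict per row, still not nested by Col1/Col2
by_row = df.to_dict("records")
print(by_row[0])  # {'Col1': 'MIT1', 'Col2': 'AD', 'Col3': 'Profile'}
```

Neither orientation groups rows by Col1, which is why the index has to be set (or the rows grouped) first.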

CodePudding user response:

A dictionary cannot have duplicate keys, so a key such as 'MIT1' can appear only once in the outer dictionary; all of its rows have to be grouped under that single key.

An alternative might be:

# Group rows by Col1; within each group, map each Col2 value to the list of remaining values
out = {k: v.set_index('Col2').T.to_dict('list')
       for k, v in df.set_index('Col1').groupby(level=0)}

Output:

{'MA3': {'ALL': ['Price', 'Retail', '-']},
 'MA4': {'BEG': ['Group', 'MM', 'Eh']},
 'MIT1': {'AD': ['Profile', 'MM', 'MD'],
         'AES': ['Document', 'MM', 'MD']}}
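
If the layout from the question is needed literally (a list of single-key dictionaries under each Col1 value), a plain loop over the rows is one option. This is only a sketch: the sample rows are rebuilt in memory rather than read from Excel, and `rows_to_list_of_dicts` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Sample rows from the question, rebuilt in memory (assumption: stands in for Example.xlsx)
df = pd.DataFrame(
    [
        ["MIT1", "AD", "Profile", "MM", "MD"],
        ["MIT1", "AES", "Document", "MM", "MD"],
        ["MA3", "ALL", "Price", "Retail", "-"],
        ["MA4", "BEG", "Group", "MM", "Eh"],
    ],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5"],
)


def rows_to_list_of_dicts(frame):
    """Hypothetical helper: build {Col1: [{Col2: [Col3, Col4, ...]}, ...]}."""
    result = {}
    # name=None yields plain tuples; the first two fields are the keys
    for col1, col2, *rest in frame.itertuples(index=False, name=None):
        result.setdefault(col1, []).append({col2: rest})
    return result


nested = rows_to_list_of_dicts(df)
print(nested["MIT1"])
# [{'AD': ['Profile', 'MM', 'MD']}, {'AES': ['Document', 'MM', 'MD']}]
```

Because every row becomes its own single-key dictionary inside a list, a repeated Col2 value under the same Col1 is kept rather than overwritten, at the cost of losing direct lookup by Col2.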