Dict to df if value is a list of lists

Time:01-12

I have following dictionary:

my_dict = dict([(779825550, [[346583, 2, 305.98, 9]]), (779825605, [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15], [276184, 2, 169.5, 15], [331465, 2, 214.5, 15], [637210, 2, 368.5, 15], [249559, 2, 133.46, 15], [591652, 2, 132.0, 15], [216367, 2, 142.5, 14]]), (779825644, [[568025, 13, 494.5, 15]]), (779825657, [[75366, 18, 43.26, 9]])])

I need to convert this dict into a pandas DataFrame. Each row should contain a my_dict key (779825550, 779825605, etc.) followed by the values from one inner list. So the first row will be: 779825550, 346583, 2, 305.98, 9. If a key has more than one inner list (like 779825605), I need one row per inner list, each with the same key in the first column (that is 779825605, 276184, 2, 169.5, 15 and 779825605, 331465, 2, 214.5, 15 etc). How can I do this please?

I tried:

df = pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in my_dict.items() ]))

but it gives me a wrong result. Thanks

CodePudding user response:

You can flatten the nested lists in a generator expression: iterate over each inner list x, unpack it with * after the key k to build one tuple per row, and pass the result to the DataFrame constructor:

df = pd.DataFrame((k, *x) for k,v in my_dict.items() for x in v)
print (df)
            0       1   2       3   4
0   779825550  346583   2  305.98   9
1   779825605  276184   2  169.50  15
2   779825605  331465   2  214.50  15
3   779825605  276184   2  169.50  15
4   779825605  331465   2  214.50  15
5   779825605  637210   2  368.50  15
6   779825605  249559   2  133.46  15
7   779825605  591652   2  132.00  15
8   779825605  216367   2  142.50  14
9   779825644  568025  13  494.50  15
10  779825657   75366  18   43.26   9
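
If named columns are preferred over the default 0..4 labels, the same flattening can be sketched with an explicit columns argument. The column names below are hypothetical (the question does not name the fields), and the dict is a trimmed copy of my_dict:

```python
import pandas as pd

# Trimmed copy of my_dict from the question
my_dict = {
    779825550: [[346583, 2, 305.98, 9]],
    779825605: [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15]],
    779825644: [[568025, 13, 494.5, 15]],
}

# One tuple per inner list: the key followed by the unpacked values
rows = [(k, *x) for k, v in my_dict.items() for x in v]

# "key", "a", "b", "c", "d" are hypothetical names chosen for illustration
df = pd.DataFrame(rows, columns=["key", "a", "b", "c", "d"])
print(df)
```

Building the rows as a list first also makes it easy to inspect them before constructing the DataFrame.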

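Your solution can be fixed by building a DataFrame (instead of a Series) for each key and joining them with concat; the keys then become the outer level of a MultiIndex: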

df = pd.concat(dict((k,pd.DataFrame(v)) for k,v in my_dict.items()))
print (df)
                  0   1       2   3
779825550 0  346583   2  305.98   9
779825605 0  276184   2  169.50  15
          1  331465   2  214.50  15
          2  276184   2  169.50  15
          3  331465   2  214.50  15
          4  637210   2  368.50  15
          5  249559   2  133.46  15
          6  591652   2  132.00  15
          7  216367   2  142.50  14
779825644 0  568025  13  494.50  15
779825657 0   75366  18   43.26   9
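
If the key should be an ordinary first column rather than an index level, one possible sketch is to name the index levels in concat and then move the key level back into the columns. The level names "key" and "row" are hypothetical labels chosen here, and the dict is a trimmed copy of my_dict:

```python
import pandas as pd

# Trimmed copy of my_dict from the question
my_dict = {
    779825550: [[346583, 2, 305.98, 9]],
    779825605: [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15]],
    779825644: [[568025, 13, 494.5, 15]],
}

# "key" and "row" are hypothetical level names, not part of the original answer
stacked = pd.concat(
    {k: pd.DataFrame(v) for k, v in my_dict.items()},
    names=["key", "row"],
)

# Move the "key" level into a column, then discard the per-key row counter
flat = stacked.reset_index(level="key").reset_index(drop=True)
print(flat)
```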

CodePudding user response:

We can also use itertools.product to pair each key with every sublist in its value (wrapping the key in a list so the two can be concatenated) and pass the resulting rows to the DataFrame constructor:

import itertools
df = pd.DataFrame(i + j for k,v in my_dict.items() for i,j in itertools.product([[k]],v))

We can also iterate over my_dict.items() and prepend the key to each sublist:

df = pd.DataFrame([k] + lst for k, lsts in my_dict.items() for lst in lsts)

Output:

            0       1   2       3   4
0   779825550  346583   2  305.98   9
1   779825605  276184   2  169.50  15
2   779825605  331465   2  214.50  15
3   779825605  276184   2  169.50  15
4   779825605  331465   2  214.50  15
5   779825605  637210   2  368.50  15
6   779825605  249559   2  133.46  15
7   779825605  591652   2  132.00  15
8   779825605  216367   2  142.50  14
9   779825644  568025  13  494.50  15
10  779825657   75366  18   43.26   9
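
As a quick sanity check, the flattening approaches can be compared with DataFrame.equals. This is only a sketch on a trimmed copy of my_dict; the variable names are hypothetical:

```python
import itertools
import pandas as pd

# Trimmed copy of my_dict from the question
my_dict = {
    779825550: [[346583, 2, 305.98, 9]],
    779825605: [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15]],
    779825644: [[568025, 13, 494.5, 15]],
}

# Key followed by the unpacked sublist
df_unpack = pd.DataFrame([(k, *x) for k, v in my_dict.items() for x in v])

# Cartesian product of [[k]] with the sublists, concatenated pairwise
df_product = pd.DataFrame(
    [i + j for k, v in my_dict.items() for i, j in itertools.product([[k]], v)]
)

# Plain list concatenation
df_plus = pd.DataFrame([[k] + lst for k, lsts in my_dict.items() for lst in lsts])

# Report whether all three frames hold the same labels, dtypes and values
same = df_unpack.equals(df_product) and df_unpack.equals(df_plus)
print(same)
```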