I have the following dictionary:
my_dict = dict([(779825550, [[346583, 2, 305.98, 9]]), (779825605, [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15], [276184, 2, 169.5, 15], [331465, 2, 214.5, 15], [637210, 2, 368.5, 15], [249559, 2, 133.46, 15], [591652, 2, 132.0, 15], [216367, 2, 142.5, 14]]), (779825644, [[568025, 13, 494.5, 15]]), (779825657, [[75366, 18, 43.26, 9]])])
I need to convert this dict into a pandas DataFrame. Each row should contain a my_dict key (779825550, 779825605, etc.) followed by the values from one of its inner lists, so the first row would be: 779825550, 346583, 2, 305.98, 9. If a key has more than one inner list (as 779825605 does), I need one row per inner list, each with the same key in the first column (779825605, 276184, 2, 169.5, 15, then 779825605, 331465, 2, 214.5, 15, and so on). How can I do this?
I tried:
df = pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in my_dict.items() ]))
but it gives me a wrong result. Thanks
CodePudding user response:
You can flatten the nested lists in a generator expression, prepending the key k to each inner list by unpacking it with *, and then pass the result to the DataFrame constructor:
df = pd.DataFrame((k, *x) for k,v in my_dict.items() for x in v)
print (df)
            0       1   2       3   4
0   779825550  346583   2  305.98   9
1   779825605  276184   2  169.50  15
2   779825605  331465   2  214.50  15
3   779825605  276184   2  169.50  15
4   779825605  331465   2  214.50  15
5   779825605  637210   2  368.50  15
6   779825605  249559   2  133.46  15
7   779825605  591652   2  132.00  15
8   779825605  216367   2  142.50  14
9   779825644  568025  13  494.50  15
10  779825657   75366  18   43.26   9
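If named columns are wanted instead of 0 to 4, the same comprehension can be passed together with a columns argument. This is only a sketch: the names below are placeholders I made up, since the question does not say what the fields mean, and the dictionary is shortened to keep the example small.

```python
import pandas as pd

# Shortened version of the dictionary from the question.
my_dict = {
    779825550: [[346583, 2, 305.98, 9]],
    779825605: [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15]],
}

# Hypothetical column names; replace them with whatever the fields really are.
cols = ["key", "a", "b", "c", "d"]

# One tuple per inner list, with the dictionary key prepended.
df = pd.DataFrame(
    [(k, *x) for k, v in my_dict.items() for x in v],
    columns=cols,
)
print(df)
```

A list is built here rather than a generator, which makes no difference to the result.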
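Your attempt wraps each value in pd.Series, so every key becomes a column and every inner list stays in a single cell. It can be fixed by building a DataFrame for each key instead and joining them with concat; the keys then form the outer level of a MultiIndex: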
df = pd.concat(dict((k,pd.DataFrame(v)) for k,v in my_dict.items()))
print (df)
                  0   1       2   3
779825550 0  346583   2  305.98   9
779825605 0  276184   2  169.50  15
          1  331465   2  214.50  15
          2  276184   2  169.50  15
          3  331465   2  214.50  15
          4  637210   2  368.50  15
          5  249559   2  133.46  15
          6  591652   2  132.00  15
          7  216367   2  142.50  14
779825644 0  568025  13  494.50  15
779825657 0   75366  18   43.26   9
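The concat result keeps the key in the index rather than in the first column. If the flat layout from the question is needed, one possible follow-up is to drop the inner index level and move the key into a column. This is a sketch on a shortened dictionary, and the column name "key" is my own label, not something from the question.

```python
import pandas as pd

# Shortened version of the dictionary from the question.
my_dict = {
    779825550: [[346583, 2, 305.98, 9]],
    779825605: [[276184, 2, 169.5, 15], [331465, 2, 214.5, 15]],
}

# Keys become the outer index level, row positions the inner level.
nested = pd.concat({k: pd.DataFrame(v) for k, v in my_dict.items()})

# Drop the inner per-key counter, then turn the remaining index into a column.
flat = nested.droplevel(1).reset_index()

# reset_index calls an unnamed index "index"; "key" is a hypothetical name.
flat = flat.rename(columns={"index": "key"})
print(flat)
```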
CodePudding user response:
We can also use itertools.product to pair each key with every sublist in its value, concatenate each pair into a single list, and pass the result to the DataFrame constructor:
import itertools
df = pd.DataFrame(i + j for k, v in my_dict.items() for i, j in itertools.product([[k]], v))
We can also iterate over my_dict.items() directly and prepend the key to each sublist:
df = pd.DataFrame([k] + lst for k, lsts in my_dict.items() for lst in lsts)
Output:
            0       1   2       3   4
0   779825550  346583   2  305.98   9
1   779825605  276184   2  169.50  15
2   779825605  331465   2  214.50  15
3   779825605  276184   2  169.50  15
4   779825605  331465   2  214.50  15
5   779825605  637210   2  368.50  15
6   779825605  249559   2  133.46  15
7   779825605  591652   2  132.00  15
8   779825605  216367   2  142.50  14
9   779825644  568025  13  494.50  15
10  779825657   75366  18   43.26   9
