I have a Python dictionary in this format:
d = {1: {1, 2, 3},
2: {4, 5}}
I want to convert it into a pandas DataFrame in this format:
Expected Output:
Source Target
1 1
1 2
1 3
2 4
2 5
I have tried doing this with a list comprehension:
import pandas as pd
d = {1: {1, 2, 3}, 2: {4, 5}}
df = pd.DataFrame([[key, v] for key, value in d.items() for v in value], columns=["Source", "Target"])
print(df)
This works, but is there a better way of doing it?
CodePudding user response:
You can use DataFrame.explode:
import pandas as pd
d = {
1: {1, 2, 3},
2: {4, 5}
}
df = pd.DataFrame(d.items(), columns=['Source', 'Target'])
df = df.explode('Target')
This gives:
   Source Target
0       1      1
0       1      2
0       1      3
1       2      4
1       2      5
Here, we first create a DataFrame in which each Target cell holds the whole set of values for its Source; explode then creates a new row for each value in that set, repeating the Source value on every row.
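To make the two steps easier to see, here is a minimal sketch that keeps the intermediate frame around. The names wide and long are only illustrative, and the cast at the end assumes you want integers rather than the object dtype that explode leaves behind. Because sets are unordered, the order of rows within each Source is not guaranteed.

```python
import pandas as pd

d = {1: {1, 2, 3}, 2: {4, 5}}

# Before explode: one row per key, with the whole set stored in a single cell.
wide = pd.DataFrame(list(d.items()), columns=["Source", "Target"])
print(wide.shape)  # (2, 2)

# After explode: one row per set element, with the Source value repeated.
long = wide.explode("Target")
print(long.shape)  # (5, 2)

# Exploded values keep object dtype, so cast if integers are needed.
long["Target"] = long["Target"].astype(int)
```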
Notice that the index still refers to the rows of the original DataFrame (0, 0, 0, 1, 1), so we can use:
df = df.reset_index(drop=True)
to reset it to:
   Source Target
0       1      1
1       1      2
2       1      3
3       2      4
4       2      5
Combined into one step, this gives us:
df = df.explode('Target').reset_index(drop=True)
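As a possible shortcut, explode also accepts an ignore_index argument, which should make the separate reset_index call unnecessary. This sketch assumes a pandas version that supports it (1.1 or later, as far as I know):

```python
import pandas as pd

d = {1: {1, 2, 3}, 2: {4, 5}}
df = pd.DataFrame(list(d.items()), columns=["Source", "Target"])

# ignore_index=True (assumed: pandas >= 1.1) relabels the rows 0..n-1,
# replacing the explicit reset_index(drop=True) step.
df = df.explode("Target", ignore_index=True)
print(df.index.tolist())  # [0, 1, 2, 3, 4]
```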
CodePudding user response:
You could create a small DataFrame from each key:value pair in the dictionary and then concat them together. The scalar key is broadcast across all values of its set.
import pandas as pd
pd.concat([pd.DataFrame({'Source': k, 'Target': tuple(v)}) for k, v in d.items()],
          ignore_index=True)
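Or you can use the pd.DataFrame.from_dict constructor and stack, followed by some renaming. Because the sets have different lengths, the intermediate frame is padded with NaN, which is why the final cast back to int is needed: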
(pd.DataFrame.from_dict(d, orient='index')
   .stack()
   .reset_index(-1, drop=True)
   .rename('Target').rename_axis(index='Source')
   .reset_index()
   .astype(int))
   Source  Target
0       1       1
1       1       2
2       1       3
3       2       4
4       2       5
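The stack chain relies on stack discarding the NaN padding. I believe newer pandas releases change stack so that missing values are no longer dropped implicitly, so treat the following as a hedged variant rather than a required fix: it adds an explicit dropna, which should behave the same either way. Converting each set to a sorted list is my own choice here, made only so the row order is deterministic.

```python
import pandas as pd

d = {1: {1, 2, 3}, 2: {4, 5}}

# One row per key, one column per position; shorter rows are padded with NaN.
wide = pd.DataFrame.from_dict({k: sorted(v) for k, v in d.items()}, orient="index")
print(wide.shape)  # (2, 3)

long = (
    wide.stack()                       # Series indexed by (key, position)
    .dropna()                          # drop the padding explicitly
    .reset_index(level=-1, drop=True)  # discard the position level
    .rename("Target")
    .rename_axis(index="Source")
    .reset_index()
    .astype(int)                       # the NaN padding made the values float
)
print(long)
```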
