How to merge a list of dictionaries in Python in the shortest and fastest way possible?

Time:01-20

I want to merge a list of dictionaries in Python. The number of dictionaries in the list is not fixed, and the dictionaries may share some keys and differ on others. None of the dictionaries contains a nested dictionary. Values that share a key can be collected in a list.

My code is:

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3} ...... ]
output = {}

for i in list_of_dict:
    for k,v in i.items():
        if k in output:
            output[k].append(v)
        else:
            output[k] = [v]

Is there a shorter and faster way to implement this?

I am looking for the fastest approach, because the list of dictionaries is very large and there are many rows of such data.

CodePudding user response:

One way using collections.defaultdict:

from collections import defaultdict

res = defaultdict(list)

for d in list_of_dict:
    for k, v in d.items():
        res[k].append(v)

Output:

defaultdict(list,
            {'a': [1, 3, 3, 3],
             'b': [2, 5],
             'c': [3],
             'k': [5, 5],
             'j': [5],
             'd': [4]})
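If a plain dict is needed afterwards (for printing or serialising, say), the same loop can be wrapped in a small helper. This is only a minimal sketch: the name merge_dicts is made up for illustration, and the key order shown assumes Python 3.7+, where dicts keep insertion order.

```python
from collections import defaultdict


def merge_dicts(dicts):
    """Collect the values of every key across `dicts` into lists.

    Hypothetical helper name, not part of any library.
    """
    res = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            res[k].append(v)
    # Convert back to a plain dict so that looking up a missing key
    # raises KeyError instead of silently creating an empty list.
    return dict(res)


list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]
print(merge_dicts(list_of_dict))
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```

The conversion at the end costs one extra pass over the keys, not over the values, so it should be cheap compared with the merge itself.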

CodePudding user response:

One of the shortest ways is to

  • collect all the keys from all the dictionaries into a list/set
  • and look up each key in every dictionary in the list.

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

# prepare a list/set of all the keys from all the dictionaries

# method 1: use sum 
all_keys = sum([[a for a in x.keys()] for x in list_of_dict], [])

# method 2: use itertools 
import itertools
all_keys = list(itertools.chain.from_iterable(list_of_dict))

print(all_keys)
# ['a', 'b', 'c', 'a', 'b', 'k', 'j', 'a', 'k', 'd', 'a']

# convert the list to set to remove duplicates 
all_keys = set(all_keys)
print(all_keys)
# {'a', 'k', 'c', 'd', 'b', 'j'}

# now merge the dictionary
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}
print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}

In short:

all_keys = set(sum([[a for a in x.keys()] for x in list_of_dict], []))
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}

print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}
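Since the question is about speed, note that the comprehension walks the whole list once per distinct key, while a nested loop touches each item only once. Which one is faster on real data is best measured rather than guessed. A rough sketch with timeit follows; the synthetic data and the function names single_pass and per_key_scan are assumptions made up for this comparison, and no timings are quoted because they depend on the machine and the data.

```python
import random
import timeit
from collections import defaultdict

# Synthetic data, invented for this comparison:
# 2,000 dicts, each with 5 of 50 possible keys.
random.seed(0)
keys = [f"k{i}" for i in range(50)]
data = [{k: random.randint(0, 9) for k in random.sample(keys, 5)} for _ in range(2000)]


def single_pass(dicts):
    # Visits every (key, value) pair exactly once.
    res = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            res[k].append(v)
    return dict(res)


def per_key_scan(dicts):
    # Scans the whole list again for every distinct key.
    all_keys = set().union(*dicts)
    return {k: [d[k] for d in dicts if k in d] for k in all_keys}


# Both build the same mapping (dict equality ignores key order).
assert single_pass(data) == per_key_scan(data)

for fn in (single_pass, per_key_scan):
    seconds = timeit.timeit(lambda: fn(data), number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

Swapping in a sample of the real rows for data should give a more meaningful comparison than the random input used here.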

CodePudding user response:

items() is a dictionary method, but list_of_dict is a list. You need a nested loop so you can loop over the dictionaries and then loop over the items of each dictionary.

output = {}
for d in list_of_dict:
    for key, value in d.items():
        output.setdefault(key, []).append(value)

CodePudding user response:

Another short version:

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

output = {
    k: [d[k] for d in list_of_dict if k in d]
    for k in set().union(*list_of_dict)
}
print(output)
# {'d': [4], 'k': [5, 5], 'a': [1, 3, 3, 3], 'j': [5], 'c': [3], 'b': [2, 5]}
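Because the keys come from a set, the order of the keys in the result is arbitrary. If the first-seen order matters, one possible variation (a sketch, assuming Python 3.7+ where dicts keep insertion order) is to collect the keys with dict.fromkeys instead; the name ordered_keys is just for illustration.

```python
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

# dict.fromkeys drops duplicate keys but keeps the order of first occurrence.
ordered_keys = dict.fromkeys(k for d in list_of_dict for k in d)

output = {
    k: [d[k] for d in list_of_dict if k in d]
    for k in ordered_keys
}
print(output)
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```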