How to zip two lists with duplicate elements?

Time:02-04

I have two lists:

l1 = ['k1','k2','k3','k3','k4', 'k5']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9"]

I need these items as key:value pairs in a dictionary, keeping the highest value for each duplicate key. Since a dictionary holds unique keys, one of the duplicate entries gets overwritten:

print(dict(zip(l1, l2)))
{'k1': '1.2.3', 'k2': 'abc-2.3.4', 'k3': 'xyz-def-5.6.7', 'k4': 'ghjb-5.6.7', 'k5': '7.8.9'}

But in the output above, I need the highest value, xyz-def-5.6.8, instead of xyz-def-5.6.7.

I tried print(list(zip(l1, l2))), which gives:

[('k1', '1.2.3'), ('k2', 'abc-2.3.4'), ('k3', 'xyz-def-5.6.8'), ('k3', 'xyz-def-5.6.7'), ('k4', 'ghjb-5.6.7'), ('k5', '7.8.9')]

How do I achieve this?

Is it possible to post-process this list of tuples, or is there another way to get the desired result?

l1 = ['k1','k2','k3','k3','k4', 'k5', 'k6', 'k7', 'k6']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9", "1:2.3.4-3ubuntu0.1", "1.2.3-1.2build3", "1:2.3.4-3ubuntu0.2"]

The values do not share one format across all keys, but the format is the same among the values of a given duplicate key. For example, k6 has one format and k3 has another.

CodePudding user response:

You need some way to "tell" your dict which value to choose if the key already exists - and it has to know how to decide between two values.

Quoting the question: "i need highest value xyz-def-5.6.8 instead of xyz-def-5.6.7"

The prioritize function below implements that.

You could, for example, do this:

l1 = ['k1','k2','k3','k3','k4', 'k5']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9"]

def prioritize(a,b):
    """Split the data by -, take the last part, split it by . and convert it
    to an int tuple for comparison. Return a or b, whichever is bigger."""
    def extract(what):
        """Split into int tuples"""
        return tuple(map(int, (what.split("-")[-1]).split(".")))

    # 'xyz-def-5.6.8' => (5,6,8)
    a_num = extract(a)

    # 'xyz-def-5.6.7' => (5,6,7)
    b_num = extract(b)

    # int tuples compare element by element, so this "just works"
    return a if a_num > b_num else b

d = {}
for (k,v) in zip(l1,l2):
    # d.get(k,v) returns the stored value, or v itself the first time k is seen;
    # prioritize then keeps whichever of the two is higher
    d[k] = prioritize(d.get(k,v), v)

print(d)

Output:

{'k1': '1.2.3', 
 'k2': 'abc-2.3.4', 
 'k3': 'xyz-def-5.6.8', 
 'k4': 'ghjb-5.6.7', 
 'k5': '7.8.9'}
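
A variant of the same idea, sketched here as one possible alternative rather than a definitive solution: group the values per key first, then pick the maximum of each group with a key function. The names zip_max and version_tuple are made up for this illustration, and version_tuple assumes (like extract above) that the part after the last "-" consists only of dot-separated integers.

```python
from collections import defaultdict

def version_tuple(value):
    """'xyz-def-5.6.8' -> (5, 6, 8); assumes the last '-' field is dotted integers."""
    return tuple(int(part) for part in value.split("-")[-1].split("."))

def zip_max(keys, values, key=version_tuple):
    """Pair keys with values, keeping the highest value (per `key`) for each key."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)  # collect every value seen for this key
    return {k: max(vs, key=key) for k, vs in groups.items()}

l1 = ['k1', 'k2', 'k3', 'k3', 'k4', 'k5']
l2 = ["1.2.3", "abc-2.3.4", "xyz-def-5.6.8", "xyz-def-5.6.7", "ghjb-5.6.7", "7.8.9"]

result = zip_max(l1, l2)
print(result)
```

Passing the comparison as a key argument keeps the grouping logic independent of the value format, so a different key function can be swapped in for values that are formatted differently.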

CodePudding user response:

l1 = ['k1','k2','k3','k3','k4', 'k5', 'k6', 'k7', 'k6']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9", "1:2.3.4-3ubuntu0.1", "1.2.3-1.2build3", "1:2.3.4-3ubuntu0.2"]
l3 = {}

def get_duplicates_details(list_of_elems):
    test = {}
    for index, value in enumerate(list_of_elems):
        if value in test:
            test[value].append(index)
        else:
            test[value] = [index]

    dictOfElems = {key: value for key, value in test.items() if len(value) > 1}
    return dictOfElems

dictOfElems = get_duplicates_details(l1)
print(dictOfElems)

for index2, value2 in enumerate(l1):
    if value2 in dictOfElems:
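        # collect all values that belong to this duplicated key
        tmp = [l2[j] for j in dictOfElems[value2]]
        # note: this is a plain string sort, so the "highest" value is the
        # lexicographically last one (e.g. '5.6.9' sorts after '5.6.10')
        tmp.sort()
        l3[value2] = tmp[-1]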
    else:
        l3[value2] = l2[index2]

print(l3)

Output:

{'k3': [2, 3], 'k6': [6, 8]}
{'k1': '1.2.3', 'k2': 'abc-2.3.4', 'k3': 'xyz-def-5.6.8', 'k4': 'ghjb-5.6.7', 'k5': '7.8.9', 'k6': '1:2.3.4-3ubuntu0.2', 'k7': '1.2.3-1.2build3'}
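
One caveat with the plain string sort used above: it compares character by character, so a component with two digits can sort before one with a single digit. A minimal sketch of a numeric comparison follows. It assumes that all values of a given key share one format, as the question states, and that comparing every run of digits in order is good enough. The helper name numeric_key and the value "xyz-def-5.6.10" are made up for illustration, and this is not a full implementation of Debian-style version ordering.

```python
import re

def numeric_key(value):
    """Return every run of digits in `value` as a tuple of ints.

    Assumption: the values being compared share one format, so the
    digit runs line up position by position.
    """
    return tuple(int(n) for n in re.findall(r"\d+", value))

candidates = ["xyz-def-5.6.9", "xyz-def-5.6.10"]

lexicographic = sorted(candidates)[-1]          # string comparison
numeric = max(candidates, key=numeric_key)      # numeric comparison

print(lexicographic)  # xyz-def-5.6.9
print(numeric)        # xyz-def-5.6.10

k6_values = ["1:2.3.4-3ubuntu0.1", "1:2.3.4-3ubuntu0.2"]
k6_best = max(k6_values, key=numeric_key)
print(k6_best)        # 1:2.3.4-3ubuntu0.2
```

For the sample data in the question both approaches agree. The difference only shows up once a version component reaches two digits.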