Let's assume these are my lists:
oracle_files = [
(1, "__init__.py"),
(2, "price_calc.py"),
(3, "lang.py")]
predicted_files = [
(5, ["random.py","price_calc.py"]),
(2, ["__init__.py","price_calc.py"]),
(1, ["lang.py","__init__.py"])]
The first list is a list of tuples, each holding an identifier and a string. The second is a list of tuples, each holding an integer and a list of strings.
My intention is to create a third list that intersects these two by ID (the integer).
The output should look like this:
result = [(2, "price_calc.py", ["__init__.py","price_calc.py"]),
(1, "__init__.py", ["lang.py","__init__.py"])]
Do you know a way to get this output? I'm not getting it right.
CodePudding user response:
Here's an approach using dict:
oracle_files = [(1, "__init__.py"), (2, "price_calc.py"), (3, "lang.py")]
predicted_files = [(5, ["random.py","price_calc.py"]), (2, ["__init__.py","price_calc.py"]), (1, ["lang.py","__init__.py"])]
dct1 = dict(oracle_files)
dct2 = dict(predicted_files)
result = [(k, dct1[k], dct2[k]) for k in dct1.keys() & dct2.keys()]
print(result) # [(1, '__init__.py', ['lang.py', '__init__.py']), (2, 'price_calc.py', ['__init__.py', 'price_calc.py'])]
This uses the convenient fact that the key views returned by dict.keys() behave like sets. From the docs:
"Keys views are set-like since their entries are unique and hashable. [...] For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^)."
https://docs.python.org/3/library/stdtypes.html#dictionary-view-objects
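To make the set-like behaviour concrete, here is a small sketch with two throwaway dicts (a and b are made-up names, not part of the question). The result of & on key views is a plain set, so its iteration order should not be relied on; sort it if the order matters.

```python
# Two illustrative dicts; the names and values are made up for this sketch.
a = {1: "x", 2: "y", 3: "z"}
b = {5: "p", 2: "q", 1: "r"}

# Key views support set operators directly, no set() conversion needed.
common = a.keys() & b.keys()      # keys present in both dicts
only_in_a = a.keys() - b.keys()   # keys present only in a
in_one_only = a.keys() ^ b.keys() # keys present in exactly one dict

# The results are ordinary sets, so sort them for a stable order.
print(sorted(common))       # [1, 2]
print(sorted(only_in_a))    # [3]
print(sorted(in_one_only))  # [3, 5]
```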
CodePudding user response:
I think this does what you want.
oracle_files = [(1, "__init__.py"), (2, "price_calc.py"), (3, "lang.py")]
predicted_files = [(5, ["random.py","price_calc.py"]), (2, ["__init__.py","price_calc.py"]), (1, ["lang.py","__init__.py"])]
dct = dict(oracle_files)
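# Pair each oracle value with its predictions when the ID exists in both lists
for k, v in predicted_files:
    if k in dct:
        dct[k] = (dct[k], v)
print(dct)
# Keep only the IDs that were paired (their value is now a tuple) and flatten
outlist = [(k,) + v for k, v in dct.items() if isinstance(v, tuple)]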
print(outlist)
Output:
{1: ('__init__.py', ['lang.py', '__init__.py']), 2: ('price_calc.py', ['__init__.py', 'price_calc.py']), 3: 'lang.py'}
[(1, '__init__.py', ['lang.py', '__init__.py']), (2, 'price_calc.py', ['__init__.py', 'price_calc.py'])]
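If the result should follow the order of predicted_files, as in the expected output in the question (ID 2 before ID 1), one possible variant is to loop over predicted_files and look each ID up in a dict built from oracle_files. This is only a sketch; oracle_by_id is a name introduced here. It also avoids overwriting the dict values and the isinstance check.

```python
oracle_files = [(1, "__init__.py"), (2, "price_calc.py"), (3, "lang.py")]
predicted_files = [
    (5, ["random.py", "price_calc.py"]),
    (2, ["__init__.py", "price_calc.py"]),
    (1, ["lang.py", "__init__.py"]),
]

# Map each ID to its oracle file name for fast membership tests and lookups.
oracle_by_id = dict(oracle_files)

# Walk predicted_files in its original order and keep only IDs known to the oracle.
result = [
    (k, oracle_by_id[k], preds)
    for k, preds in predicted_files
    if k in oracle_by_id
]
print(result)
# [(2, 'price_calc.py', ['__init__.py', 'price_calc.py']), (1, '__init__.py', ['lang.py', '__init__.py'])]
```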
