Fastest way to split list into multiple sublists based on several conditions

Time:01-20

What is the fastest way to split a list into multiple sublists based on conditions? Each condition represents a separate sublist.

One way to split a listOfObjects into sublists (three sublists for demonstration, but more are possible):

listOfObjects = [.......]
l1, l2, l3 = [], [], []
for l in listOfObjects:
    if l.someAttribute == "l1":
        l1.append(l)
    elif l.someAttribute == "l2":
        l2.append(l)
    else:
        l3.append(l)

This approach does not seem Pythonic and is also fairly slow. Are there faster approaches, e.g. using map?

Similar question, but with only two conditions and no statement about speed.

CodePudding user response:

You could use collections.defaultdict here to map each attribute value to its own list.

from collections import defaultdict

d = defaultdict(list)

for l in listOfObjects:
    d[l.someAttribute].append(l)

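out = d.values()
# Note: d['l3'] only holds objects whose attribute equals 'l3';
# unlike the else branch in the question, it is not a catch-all.
l1, l2, l3 = d['l1'], d['l2'], d['l3']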

d would be of the form:

{ 
  attr1 : [...],
  attr2 : [...],
  ...
  attrn : [...]
}
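If the else catch-all from the question is needed, one option is to map every unlisted attribute value to a single shared key before appending. The following is only a sketch: the Item class, the sample data, and the "other" key are hypothetical stand-ins, not part of the original question.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical stand-in for the objects in the question.
    someAttribute: str


listOfObjects = [Item("l1"), Item("l2"), Item("x"), Item("l1"), Item("y")]

named = {"l1", "l2"}  # attribute values that get their own sublist
d = defaultdict(list)

for obj in listOfObjects:
    # Anything not explicitly named falls into the shared "other" bucket.
    key = obj.someAttribute if obj.someAttribute in named else "other"
    d[key].append(obj)

l1, l2, l3 = d["l1"], d["l2"], d["other"]
```

Because d is a defaultdict, reading a key that never occurred (for example d["l2"] on data without any "l2" objects) returns an empty list instead of raising KeyError.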

CodePudding user response:

Omg, that similar question's answer is amazing. I haven't thought about that for splitting... Anyway, you can do something similar but it would be less readable:

for l in listOfObjects:
    (l3, l2, l1)[(l.someAttribute == "l1")*2 or l.someAttribute == "l2"].append(l)

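This will work for any boolean conditions. or returns its first truthy operand, or the last operand if none is truthy (here False, which indexes as 0). True == 1, so the first condition is multiplied by 2 to produce index 2, while the second condition yields index 1 on its own.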

But as I said, it's not really readable. And not scalable.

As for speed: or short-circuits and returns the first truthy value, so the number of condition checks should be similar to your approach. You might want to define the lookup tuple outside of the loop so it is not rebuilt on every iteration.
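To make the index arithmetic visible, here is a small sketch that pulls the expression into a helper and builds the tuple once, outside the loop. The helper name bucket_index and the sample strings are made up for illustration; plain strings stand in for l.someAttribute.

```python
def bucket_index(attr):
    # "l1" -> True * 2 == 2; "l2" -> 0 or True == True (index 1);
    # anything else -> 0 or False == False (index 0).
    return (attr == "l1") * 2 or attr == "l2"


# Same order as (l3, l2, l1) in the loop above, created only once.
buckets = ([], [], [])

for attr in ["l1", "l2", "zzz", "l1"]:
    buckets[bucket_index(attr)].append(attr)

l3, l2, l1 = buckets
```

Indexing with True or False works because bool is a subclass of int in Python.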


A more readable option is a dict, which works here because your conditions are all equality checks (note: the attribute value also has to be hashable):

lookup = {"l1": l1, "l2": l2}
for l in listOfObjects:
    lookup.get(l.someAttribute, l3).append(l)

dict.get takes a default value as its second argument, so it's perfect for our else catch-all.

In terms of speed: a dictionary lookup needs only one hash lookup per object, as opposed to a chain of or conditions or a chain of ifs.
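Whether the dict version is actually faster depends on the data and the number of conditions, so it is worth measuring on your own input. Below is a rough timing sketch using timeit; the SimpleNamespace objects, the data size, and the function names split_if and split_dict are assumptions made for the example.

```python
import timeit
from types import SimpleNamespace

# Hypothetical test data: 30,000 objects cycling through three attribute values.
listOfObjects = [
    SimpleNamespace(someAttribute=a) for a in ("l1", "l2", "other") * 10_000
]


def split_if(objs):
    # The if/elif/else chain from the question.
    l1, l2, l3 = [], [], []
    for o in objs:
        if o.someAttribute == "l1":
            l1.append(o)
        elif o.someAttribute == "l2":
            l2.append(o)
        else:
            l3.append(o)
    return l1, l2, l3


def split_dict(objs):
    # The dict.get variant with l3 as the catch-all default.
    l1, l2, l3 = [], [], []
    lookup = {"l1": l1, "l2": l2}
    for o in objs:
        lookup.get(o.someAttribute, l3).append(o)
    return l1, l2, l3


# Both approaches should produce identical sublists.
assert split_if(listOfObjects) == split_dict(listOfObjects)

for fn in (split_if, split_dict):
    seconds = timeit.timeit(lambda: fn(listOfObjects), number=20)
    print(f"{fn.__name__}: {seconds:.3f} s for 20 runs")
```

With only two or three conditions the difference may be small; the dict approach is more likely to pay off as the number of conditions grows.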
