Removing a list from a nested list if a condition is true

I have a list of travel stops in chronological order:

travel = [[start, stop, mode_of_transport], ...]

Some of them are by boat (mode_of_transport == 'B').

Unfortunately, a boat may stop at several ports before I need to leave it; those intermediate stops are irrelevant to me.

Therefore I want to merge multiple consecutive boat segments into one, keeping the start port of the first 'B' segment and the stop port (and mode_of_transport) of the last 'B' segment. This applies only to consecutive 'B's in my travel data.

I have written a very large function that removes parts of the nested list when this condition is true, but it is too long and too slow:

def remove_boat_seg(data):
    # get boat runs to collapse boat segments
    log = []
    description = data
    print(description)
    b_inds=[i for i in range(len(description)) if description[i][2]=='B']
    print("bind",b_inds)
    b_sets=[]
    tested=set()
    for i in b_inds:
        if i in tested:
            continue
        else:
            tset=[i]
            for b in b_inds:
                if b-1 in tset:
                    tset.append(b)
            # record the run once, after all of its indices are collected
            b_sets.append(tset)
            tested=tested|set(tset)
    b_segs={}
    b_keys=[]
    for b in b_sets:
        bmin=min(b)
        bmax=max(b)
        b_segs[bmin]=[description[bmin][0],description[bmax][1],'B']
        b_keys.append(bmin)
    tdscr=[]
    for i in range(len(description)):
        if i not in b_inds:
            tdscr.append(description[i])
        elif i not in b_keys:
            continue
        else:
            tdscr.append(b_segs[i])
            msg=b_segs[i]
            if len(msg)>1:
                log.append(msg)
    description=tdscr
    return description

Explanation/Output

#output example 1
remove_boat_seg(data=[['40635', 'TRGEB', 'B'], ['TRGEB', 'CNSHG', 'B'], ['CNSHG', 'DNTS', 'B']])

[['40635', 'DNTS', 'B']]

#output example 2
remove_boat_seg(data=[['3786', 'AEJEA', 'B'], ['AEJEA', 'GBLGP', 'B'], ['GBLGP', 'USMSY', 'B'], ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'], ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], ['EJ00', '354B', 'B']])

[['3786', 'USMSY', 'B'], ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'], ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], ['EJ00', '354B', 'B']]

The last element of each inner list is the mode_of_transport. Each inner list is one segment of the complete route.

Explanation of output 1:

Find every segment (inner list) whose last element is 'B' and store its index (I keep these indices in b_inds).
If all segments are 'B', take the first element of the first segment and the second element of the last segment, add 'B' as the third element, and return that single segment (i.e. shorten the route).

Explanation of output 2:

The same rules apply here.
Compare the mode of each segment with the mode of the next segment.
Check whether mode 'B' (boat) repeats. Where it repeats, take the first element of the first segment of that run and the second and third elements of the last segment of the run, i.e. the last one before the mode changes.
In output 2, the segments at indices 0, 1 and 2 are all 'B' and the segment at index 3 is not. So the merged segment keeps the first element of segment 0, '3786', and the stop port of segment 2, 'USMSY', which is the last 'B' segment before the mode changes.
The last and third-to-last segments (indices 9 and 7) are also 'B', but nothing is done to them because the segment next to each of them is not 'B'.

I want to make the remove_boat_seg function much shorter, as I have a time constraint.
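
As an illustration of the rule above, the runs of consecutive equal modes can be listed with a short loop. This is only a sketch: the helper name mode_runs is hypothetical and not part of remove_boat_seg. Only runs of 'B' that span more than one segment need merging.

```python
def mode_runs(travel):
    """Return (mode, first_index, last_index) for each run of equal consecutive modes."""
    runs = []
    for i, seg in enumerate(travel):
        mode = seg[-1]  # the mode of transport is the last element
        if runs and runs[-1][0] == mode:
            # same mode as the previous segment: extend the current run
            runs[-1] = (mode, runs[-1][1], i)
        else:
            # mode changed (or first segment): start a new run
            runs.append((mode, i, i))
    return runs


example_2 = [['3786', 'AEJEA', 'B'], ['AEJEA', 'GBLGP', 'B'], ['GBLGP', 'USMSY', 'B'],
             ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'],
             ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'],
             ['EJ00', '354B', 'B']]

print(mode_runs(example_2))
# [('B', 0, 2), ('K', 3, 6), ('B', 7, 7), ('K', 8, 8), ('B', 9, 9)]
```

Here only the first run ('B', 0, 2) covers several segments, so it is the only one that gets collapsed.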

CodePudding user response:

You can group your data by mode of transport and merge each consecutive stretch of 'B' into a single segment:

from itertools import groupby

# testcase 2
data = [['3786', 'AEJEA', 'B'], ['AEJEA', 'GBLGP', 'B'], ['GBLGP', 'USMSY', 'B'], 
        ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'],
        ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], 
        ['EJ00', '354B', 'B']]
 
result = []

# group consecutive elements by their last item (the mode of transport)
# and combine each run of consecutive "B"oat segments into one
for grp, val in groupby(data, key=lambda x: x[-1]):
    if grp == "B":
        # merge the whole run into one element
        vals = list(val)
        # start of the first B segment, then everything from the
        # 2nd value onward (stop and mode) of the last B segment
        r = [vals[0][0], *vals[-1][1:]]
        result.append(r)
    else:
        # any other mode: keep all elements unchanged
        result.extend(val)

print(result)
# print your current result for testcase 2
print( [['3786', 'USMSY', 'B'], ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], 
        ['NEWORL_LA', 'LBEACH_CA', 'K'], ['LBEACH_CA', 'USLGB', 'K'],
        ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], ['EJ00', '354B', 'B']])

Output:

[['3786', 'USMSY', 'B'], ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'], ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], ['EJ00', '354B', 'B']]
[['3786', 'USMSY', 'B'], ['USMSY', 'DNTS', 'K'], ['DNTS', 'NEWORL_LA', 'K'], ['NEWORL_LA', 'LBEACH_CA', 'K'], ['LBEACH_CA', 'USLGB', 'K'], ['USLGB', 'CNSHG', 'B'], ['CNSHG', 'EJ00', 'K'], ['EJ00', '354B', 'B']]
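
If you want a drop-in replacement for remove_boat_seg, the same idea could be wrapped in a function. This is a minimal sketch: the name merge_boat_segments and the boat_mode parameter are my own choices and do not appear in the question. It assumes every segment is a list whose last element is the mode of transport.

```python
from itertools import groupby


def merge_boat_segments(travel, boat_mode="B"):
    """Collapse each run of consecutive boat segments into a single segment."""
    merged = []
    # groupby only groups *adjacent* items with the same key,
    # which is exactly what "consecutive" means here
    for mode, run in groupby(travel, key=lambda seg: seg[-1]):
        run = list(run)
        if mode == boat_mode:
            # start of the first segment + stop and mode of the last one
            merged.append([run[0][0], *run[-1][1:]])
        else:
            # other modes are kept segment by segment
            merged.extend(run)
    return merged


example_1 = [['40635', 'TRGEB', 'B'], ['TRGEB', 'CNSHG', 'B'], ['CNSHG', 'DNTS', 'B']]
print(merge_boat_segments(example_1))
# [['40635', 'DNTS', 'B']]
```

Because groupby makes a single pass over the list, the work grows linearly with the number of segments. A single 'B' segment is rebuilt as a new list with the same three values, so the output matches your expected results for lone boat segments as well.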