Rearrange array elements based on a list in Python

I have a 2D array a of shape (2, 1403) and a list b that contains two lists.

a.shape == (2, 1403) # a is a 2D array; each row has unique elements

len(b) == 2 # b is a list

(len(b[0]), len(b[1])) == (415, 452) # both lists inside b also have unique elements

All the elements of b[0] and b[1] are present in a[0] and a[1] respectively.

Now I want to rearrange the elements of a based on the elements of b: every element of a[0] that is also in b[0] should move to the end of a[0], so that in the new a, a[0][-len(b[0]):] == b[0], and similarly a[1][-len(b[1]):] == b[1].

Toy Example

a has elements like [[1,2,3,4,5,6,7,8,9,10,11,12],[1,2,3,4,5,6,7,8,9,10,11,12]]

b has elements like [[5, 9, 10], [2, 6, 8, 9, 11]]

new_a becomes [[1,2,3,4,6,7,8,11,12,5,9,10], [1,3,4,5,7,10,12,2,6,8,9,11]]

I have written code that loops over every element, which is very slow. It is shown below:

import torch

rows = []
for i, array in enumerate(a):
    a_temp_inner = []
    remove_temp_inner = []
    for element in array:
        if element not in b[i]:
            a_temp_inner.append(element) # elements that are not present in b[i] stay in front
        else:
            remove_temp_inner.append(element) # elements present in b[i] are moved to the end

    # join per row: the two parts have different lengths across rows,
    # so they cannot be stacked into separate tensors first
    rows.append(a_temp_inner + remove_temp_inner)

a = torch.tensor(rows)

Can anyone please help me with a faster implementation?

CodePudding user response：

Here is my approach:

import numpy as np
# True where an element of a row of a is NOT listed in the matching list of b
index_ = np.array([[i not in d for i in c] for c, d in zip(a, b)])
# first list: elements that stay in front; second list: elements moved to the end
arr_filtered = [[np.extract(ind, c) for c, ind in zip(a, index_)], [np.extract(np.logical_not(ind), c) for c, ind in zip(a, index_)]]
arr_final = np.array([np.concatenate((i, j)) for i, j in zip(*arr_filtered)])

CodePudding user response：

Assuming a is an np.array and b is a list, you can use:

np.array([np.concatenate((i[~np.in1d(i, j)], j)) for i, j in zip(a,b)])

Output

array([[ 1,  2,  3,  4,  6,  7,  8, 11, 12,  5,  9, 10],
       [ 1,  3,  4,  5,  7, 10, 12,  2,  6,  8,  9, 11]])
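
For readability, the one-liner above can be unpacked into a small helper. This is only a sketch: the name move_to_end is hypothetical, and it uses np.isin, which newer NumPy versions recommend in place of np.in1d. It assumes, as stated in the question, that every element of b[i] occurs in a[i] and that rows hold unique values, so all output rows keep the same length.

```python
import numpy as np

def move_to_end(a, b):
    """Hypothetical helper: move the values listed in b[i] to the end of row a[i]."""
    rows = []
    for row, tail in zip(a, b):
        tail = np.asarray(tail, dtype=row.dtype)
        # Boolean mask selects the values of the row that are NOT in tail,
        # preserving their original order.
        keep = row[~np.isin(row, tail)]
        rows.append(np.concatenate((keep, tail)))
    return np.array(rows)

a = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
])
b = [[5, 9, 10], [2, 6, 8, 9, 11]]

new_a = move_to_end(a, b)
print(new_a)
```

The tail of each row is taken from b itself, so it follows the order of b[i] rather than the order in which those values appeared in a[i].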

This can be micro-optimized if b may contain empty lists:

np.array([np.concatenate((i[~np.in1d(i, j)], j)) if j else i for i, j in zip(a,b)])

In my benchmarks, for np.arrays with fewer than ~100 elements, converting with .tolist() and joining the lists is faster than np.concatenate:

np.array([i[~np.in1d(i, j)].tolist() + j for i, j in zip(a,b)])

Data example and imports for this solution

import numpy as np

a = np.array([
        [1,2,3,4,5,6,7,8,9,10,11,12],
        [1,2,3,4,5,6,7,8,9,10,11,12]
    ])
b = [[5, 9, 10],
     [2, 6, 8, 9, 11]]
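
As a side note on why the loop in the question is slow: `element not in b[i]` scans the whole list for every element. A rough pure-Python sketch that avoids this by building a set per row is shown below. The name move_to_end_py is hypothetical, and the sketch assumes plain hashable Python numbers, so a tensor or array would first need converting with .tolist().

```python
def move_to_end_py(a, b):
    """Hypothetical helper: pure-Python variant using set lookups."""
    result = []
    for row, tail in zip(a, b):
        tail_set = set(tail)  # constant-time membership tests on average
        kept = [x for x in row if x not in tail_set]
        result.append(kept + list(tail))
    return result

a_list = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
]
b_list = [[5, 9, 10], [2, 6, 8, 9, 11]]

new_a_list = move_to_end_py(a_list, b_list)
print(new_a_list)
```

Whether this beats the NumPy version depends on the data size, so it is worth timing both on the real arrays.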