I have a 2D array a of shape (2, 1403) and a list b that contains 2 lists.
a.shape = (2, 1403) # a is a 2D array; each row has unique elements
len(b) = 2 # b is a list
len(b[0]), len(b[1]) = 415, 452 # both lists inside b also have unique elements
All the elements of b[0] and b[1] are present in a[0] and a[1], respectively.
Now I want to rearrange the elements of a based on the elements of b: all the elements of b[0] (which are also present in a[0]) should come at the end of a[0], so that the new a satisfies a[0][-len(b[0]):] == b[0], and similarly a[1][-len(b[1]):] == b[1].
Toy Example
a has elements like [[1,2,3,4,5,6,7,8,9,10,11,12],[1,2,3,4,5,6,7,8,9,10,11,12]]
b has elements like [[5, 9, 10], [2, 6, 8, 9, 11]]
new_a becomes [[1,2,3,4,6,7,8,11,12,5,9,10], [1,3,4,5,7,10,12,2,6,8,9,11]]
I have written code that loops over every element, which is very slow. It is shown below:
import torch

a_temp = []
remove_temp = []
for i, array in enumerate(a):
    a_temp_inner = []
    remove_temp_inner = []
    for element in array:
        if element not in b[i]:
            a_temp_inner.append(element)  # collect the elements that are not present in b[i]
        else:
            remove_temp_inner.append(element)  # elements present in b[i] are moved to the end
    a_temp.append(a_temp_inner)
    remove_temp.append(remove_temp_inner)
# the kept and moved parts differ in length from row to row, so join each row before building the tensor
a = torch.tensor([kept + moved for kept, moved in zip(a_temp, remove_temp)])
Can anyone please help me with a faster implementation than this?
CodePudding user response:
Here is my approach:
# True where the element of a is not present in the matching row of b
index_ = np.array([[i not in d for i in c] for c, d in zip(a, b)])
# first list: elements to keep in front; second list: elements to move to the end
arr_filtered = [[np.extract(ind, c) for c, ind in zip(a, index_)], [np.extract(np.logical_not(ind), c) for c, ind in zip(a, index_)]]
arr_final = np.array([np.concatenate((i, j)) for i, j in zip(*arr_filtered)])
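The membership test `i in d` scans the whole list `d` for every element, which is also what makes the loop in the question slow. A minimal pure-Python sketch, assuming the elements are hashable (plain ints here), converts each row of b to a set first. The helper name `move_to_end` is made up for illustration, and unlike the approach above it appends each row of b in the order given in b:

```python
# Toy data from the question, as plain Python lists.
a = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
]
b = [[5, 9, 10], [2, 6, 8, 9, 11]]


def move_to_end(rows, tails):
    """Hypothetical helper: move each tail's elements to the end of its row."""
    result = []
    for row, tail in zip(rows, tails):
        tail_set = set(tail)  # constant-time lookups instead of scanning the list
        kept = [x for x in row if x not in tail_set]
        result.append(kept + list(tail))  # the tail keeps the order given in b
    return result


new_a = move_to_end(a, b)
print(new_a)
# [[1, 2, 3, 4, 6, 7, 8, 11, 12, 5, 9, 10], [1, 3, 4, 5, 7, 10, 12, 2, 6, 8, 9, 11]]
```

The result is a list of lists, so it can be passed to np.array or torch.tensor afterwards as long as every row ends up with the same length.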
CodePudding user response:
Assuming a is an np.array and b is a list, you can use:
np.array([np.concatenate((i[~np.isin(i, j)], j)) for i, j in zip(a, b)])
Output:
array([[ 1,  2,  3,  4,  6,  7,  8, 11, 12,  5,  9, 10],
       [ 1,  3,  4,  5,  7, 10, 12,  2,  6,  8,  9, 11]])
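To confirm that this output meets the requirement from the question (each row ends with the matching row of b and no element is lost), here is a minimal sketch of a check. The helper name `check_rearranged` is an assumption made for illustration, and np.isin is used because it gives the same mask as np.in1d for 1-D rows:

```python
import numpy as np

a = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
])
b = [[5, 9, 10], [2, 6, 8, 9, 11]]

# Rearrangement from this answer: drop b's elements from each row, then append b's row.
new_a = np.array([np.concatenate((i[~np.isin(i, j)], j)) for i, j in zip(a, b)])


def check_rearranged(original, result, tails):
    """Hypothetical helper: True if every row of result ends with its tail
    and is a permutation of the matching row of original."""
    for row_in, row_out, tail in zip(original, result, tails):
        start = len(row_out) - len(tail)  # also correct for an empty tail
        if row_out[start:].tolist() != list(tail):
            return False
        if sorted(row_out.tolist()) != sorted(row_in.tolist()):
            return False
    return True


print(check_rearranged(a, new_a, b))  # True
```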
This can be micro-optimized if b contains empty lists, by skipping the mask and the concatenation for those rows:
np.array([np.concatenate((i[~np.isin(i, j)], j)) if j else i for i, j in zip(a, b)])
In my benchmarks, for np.arrays with fewer than ~100 elements, converting with .tolist() and joining the lists is faster than np.concatenate:
np.array([i[~np.isin(i, j)].tolist() + j for i, j in zip(a, b)])
Data example and imports for this solution:
import numpy as np
a = np.array([
    [1,2,3,4,5,6,7,8,9,10,11,12],
    [1,2,3,4,5,6,7,8,9,10,11,12]
])
b = [[5, 9, 10],
     [2, 6, 8, 9, 11]]
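The benchmark remark can be reproduced with a rough timeit sketch like the one below. The function names and the repeat count are arbitrary choices for illustration, and the timings depend on the machine, the NumPy version, and the array sizes, so no result is shown:

```python
import timeit

import numpy as np

a = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
])
b = [[5, 9, 10], [2, 6, 8, 9, 11]]


def with_concatenate():
    # Join the kept part and b's row as NumPy arrays.
    return np.array([np.concatenate((i[~np.isin(i, j)], j)) for i, j in zip(a, b)])


def with_tolist():
    # Convert the kept part to a Python list and join it with b's row.
    return np.array([i[~np.isin(i, j)].tolist() + j for i, j in zip(a, b)])


# Both variants should produce the same array before their timings are compared.
assert np.array_equal(with_concatenate(), with_tolist())

for fn in (with_concatenate, with_tolist):
    seconds = timeit.timeit(fn, number=2000)
    print(f"{fn.__name__}: {seconds:.4f} s for 2000 calls")
```

For the sizes in the question (1403 elements per row), it is worth rerunning this with the real data instead of relying on the toy example.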
