Joining arrays horizontally by a matching condition in NumPy

Time:02-01

I want to join two arrays so that each row of b is paired with the row of a whose first column equals b's last column. For example, row 0 of b (last column 17) matches row 0 of a (first column 17), so those two rows should be joined horizontally.

a=np.array([[17., 46.],
            [21., 46.],
            [46., 54.]])

b=np.array([[ 3., 17.],
            [12., 17.],
            [ 3., 21.],
            [17., 46.],
            [21., 46.],
            [26., 46.],
            [34., 46.],
            [39., 46.]])

This is what I have so far. It returns the indices, but I don't know what to do next:

ndx = np.searchsorted(b[:,1],a[:,0])
print(ndx) # [0 2 3]


The expected output is:

[[ 3., 17., 17., 46.],
 [12., 17., 17., 46.],
 [ 3., 21., 21., 46.],
 [17., 46., 46., 54.],
 [21., 46., 46., 54.],
 [26., 46., 46., 54.],
 [34., 46., 46., 54.],
 [39., 46., 46., 54.]]

CodePudding user response:

In [61]: x=b[:,1]
In [64]: u,i,c=np.unique(x,return_counts=True,return_inverse=True)
In [65]: u
Out[65]: array([17., 21., 46.])
In [66]: i
Out[66]: array([0, 0, 1, 2, 2, 2, 2, 2])
In [67]: c
Out[67]: array([2, 1, 5])

Use the inverse array i to replicate the rows of a:

In [69]: a[i]
Out[69]: 
array([[17., 46.],
       [17., 46.],
       [21., 46.],
       [46., 54.],
       [46., 54.],
       [46., 54.],
       [46., 54.],
       [46., 54.]])

Join them with hstack:

In [70]: np.hstack((b,a[i]))
Out[70]: 
array([[ 3., 17., 17., 46.],
       [12., 17., 17., 46.],
       [ 3., 21., 21., 46.],
       [17., 46., 46., 54.],
       [21., 46., 46., 54.],
       [26., 46., 46., 54.],
       [34., 46., 46., 54.],
       [39., 46., 46., 54.]])

We could also use the counts:

In [72]: np.repeat(a,c,axis=0)
Out[72]: 
array([[17., 46.],
       [17., 46.],
       [21., 46.],
       [46., 54.],
       [46., 54.],
       [46., 54.],
       [46., 54.],
       [46., 54.]])
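The same counts can also be derived from the `searchsorted` call in the question, without `np.unique`. The following is a minimal sketch, assuming `b` is sorted on its last column, `a` is sorted on its first column, and every key in `b[:, 1]` appears in `a[:, 0]`; the names `keys`, `left`, `right`, `counts`, and `joined` are illustrative only.

```python
import numpy as np

a = np.array([[17., 46.],
              [21., 46.],
              [46., 54.]])

b = np.array([[ 3., 17.],
              [12., 17.],
              [ 3., 21.],
              [17., 46.],
              [21., 46.],
              [26., 46.],
              [34., 46.],
              [39., 46.]])

keys = b[:, 1]  # assumed sorted ascending, as searchsorted requires

# First and one-past-last position of each a key within b's last column.
left = np.searchsorted(keys, a[:, 0], side="left")
right = np.searchsorted(keys, a[:, 0], side="right")

# Number of rows in b that match each row of a.
counts = right - left

# Repeat each row of a once per matching row of b, then join side by side.
joined = np.hstack((b, np.repeat(a, counts, axis=0)))
print(joined)
```

Here `left` is the `ndx` array from the question; taking the difference with the right-hand boundaries turns those start indices into repeat counts.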

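All of this assumes that the first column of a is sorted and contains exactly the unique values of b[:,1], so that u equals a[:,0] and the inverse indices i (or the counts c) line up with the rows of a. The np.repeat version additionally requires b to be sorted on its last column.

Applying np.unique to a as well may allow us to generalize this to the case where the order doesn't match.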
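One possible way to handle the case where the order doesn't match is to sort the keys of `a` and look each key of `b` up with `np.searchsorted`. This is a sketch rather than a definitive solution: it assumes the first column of `a` has no duplicate values, it drops rows of `b` that have no match, and the shuffled `a` and the names `order`, `sorted_keys`, `pos`, `matched`, `rows`, and `joined` are made up for illustration.

```python
import numpy as np

# Rows of a deliberately shuffled to show that order no longer matters.
a = np.array([[46., 54.],
              [17., 46.],
              [21., 46.]])

b = np.array([[ 3., 17.],
              [12., 17.],
              [ 3., 21.],
              [17., 46.],
              [21., 46.],
              [26., 46.],
              [34., 46.],
              [39., 46.]])

# Sort a's keys so that searchsorted can be used on them.
order = np.argsort(a[:, 0])
sorted_keys = a[order, 0]

# Position of each b key among the sorted a keys (clipped to stay in range).
pos = np.searchsorted(sorted_keys, b[:, 1])
pos = np.clip(pos, 0, len(sorted_keys) - 1)

# Keep only the rows of b whose key is actually present in a.
matched = sorted_keys[pos] == b[:, 1]

# Map positions in the sorted keys back to row indices of the original a.
rows = order[pos[matched]]

joined = np.hstack((b[matched], a[rows]))
print(joined)
```

Unlike the `np.repeat` variant, this does not require `b` to be sorted, because each row of `b` is looked up independently.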

CodePudding user response:

You can use pandas:

import numpy as np
import pandas as pd

a=np.array([[17., 46.],
            [21., 46.],
            [46., 54.]])

b=np.array([[ 3., 17.],
            [12., 17.],
            [ 3., 21.],
            [17., 46.],
            [21., 46.],
            [26., 46.],
            [34., 46.],
            [39., 46.]])

pd.merge(pd.DataFrame(b, columns=['a', 'b']),
         pd.DataFrame(a, columns=['c', 'd']),
         how='inner',
         left_on='b',
         right_on='c').values

It gives:

array([[ 3., 17., 17., 46.],
       [12., 17., 17., 46.],
       [ 3., 21., 21., 46.],
       [17., 46., 46., 54.],
       [21., 46., 46., 54.],
       [26., 46., 46., 54.],
       [34., 46., 46., 54.],
       [39., 46., 46., 54.]])