Filter out values from a given NumPy array

A program writes the following 2D array, which contains the column indices of another array that needs to be built:

import numpy as np

A = np.array([[   7,    0,    6,    0,    4,    0,    9,    0, 7215, 7215],
              [   1,    8,    1,    2,    1,    9,    1,    3, 7215, 7215],
              [   1,    5,    1,    8, 7215, 7215, 7215, 7215, 7215, 7215],
              [   1,    8,    1,    9, 7215, 7215, 7215, 7215, 7215, 7215],
              [   9,    0,    8,    9, 7215, 7215, 7215, 7215, 7215, 7215],
              [   2,    6,    8,    7, 7215, 7215, 7215, 7215, 7215, 7215],
              [   5,    0,    7,    0, 7215, 7215, 7215, 7215, 7215, 7215],
              [   8,    0,    5,    6, 7215, 7215, 7215, 7215, 7215, 7215],
              [   1,    7,    2,    5,    3,    9,    9,    4, 7215, 7215],
              [   1,    4,    3,    8,    8,    0,    4,    0, 7215, 7215]], dtype=int)
print(A)

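The filter must fulfill 2 conditions:

1. No column index for the main diagonal: within row i, any value equal to i (column index = row index) is dropped.
2. Within a row, repetition is allowed only for column indices larger than 1; the values 0 and 1 may each appear at most once per row.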

The large number in the matrix, 7215, is just the value I used to initialize the matrix. It has no other purpose in the project.

Therefore, what I need is code that produces something like this:

# row 0: 7, 6, 4, 9
# row 1: 8, 2, 9, 3
# row 2: 1, 5, 8
# row 3: 1, 8, 9
# row 4: 9, 0, 8, 9
# row 5: 2, 6, 8, 7
# row 6: 5, 0, 7
# row 7: 8, 0, 5, 6
# row 8: 1, 7, 2, 5, 3, 9, 9, 4
# row 9: 1, 4, 3, 8, 8, 0, 4

I believe the first requirement can be satisfied as follows:

l = A.shape[0]
B = np.full((l, l), 7215)
for i in range(l):
    for j in range(l):
        # copy every entry except the one that equals the row index
        if A[i, j] != i:
            B[i, j] = A[i, j]
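
For what it's worth, the double loop above can probably be replaced by a single broadcast comparison against a column of row indices. This is only a sketch: FILL, A_small and B_small are names made up here, and the 3x3 array is a toy input rather than the A from the question.

```python
import numpy as np

FILL = 7215  # placeholder value, same number the question uses

# Toy input (hypothetical), small enough to check by eye
A_small = np.array([[0, 2, 0],
                    [1, 1, 2],
                    [2, 0, 1]])

# Column vector [[0], [1], [2]]; broadcasting compares each row with its own index
rows = np.arange(A_small.shape[0])[:, None]

# Keep entries that differ from their row index, put FILL everywhere else
B_small = np.where(A_small != rows, A_small, FILL)
print(B_small)
```

Unlike the loop, this version does not assume that the array is square.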

but I do not know how to satisfy the next requirement, although the following code seems promising:

for i in range(l):
    C, counts = np.unique(B[i], return_counts=True)
    print(C)
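
One way the np.unique idea might be pushed further, for a single row: np.unique can also return the index of the first occurrence of each value (return_index) and a mapping from each element back to its unique value (return_inverse). The sketch below uses those two to keep only the first 0 and the first 1 while leaving other repeats alone. The row is the unpadded last row of A; the variable names are made up for illustration.

```python
import numpy as np

# Last row of A from the question, without the 7215 padding
row = np.array([1, 4, 3, 8, 8, 0, 4, 0])

# first_idx[k] is the position where values[k] first appears in row;
# inverse[p] is the index into values of the element at position p
values, first_idx, inverse = np.unique(row, return_index=True, return_inverse=True)

# True at every position that holds the first occurrence of its value
is_first = first_idx[inverse] == np.arange(row.size)

# Only 0 and 1 are restricted to a single occurrence; other values may repeat
restricted = np.isin(row, [0, 1])
keep = is_first | ~restricted

print(row[keep])
```

The diagonal condition and the 7215 padding are not handled here; this only covers the repetition rule.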

Answer:

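Edit: I missed that you don't want duplicates of 0 and 1 within a row. The function below fixes that in your A matrix: it keeps the first occurrence in each row and overwrites the later ones with 7215. Note that it modifies A in place.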

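def remove_after_first(arr, val):
    # Positions of val as (rows, cols), listed in row-major order
    idx = np.where(arr == val)
    # True where a hit is in the same row as the previous hit, i.e. a repeat
    drop_idx = idx[0][1:] == idx[0][:-1]
    # Overwrite the repeats in place with the fill value
    arr[idx[0][1:][drop_idx], idx[1][1:][drop_idx]] = 7215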

remove_after_first(A, 0)
remove_after_first(A, 1)
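
If overwriting A is not wanted, a non-mutating variant could look roughly like this. It is a sketch, not part of the original answer: remove_after_first_copy, the fill parameter and the demo array are all made up for illustration.

```python
import numpy as np

def remove_after_first_copy(arr, val, fill=7215):
    """Return a copy of arr where only the first `val` in each row is kept."""
    out = arr.copy()
    rows, cols = np.where(out == val)   # hits come back in row-major order
    same_row = rows[1:] == rows[:-1]    # True if the previous hit is in the same row
    out[rows[1:][same_row], cols[1:][same_row]] = fill
    return out

# Toy input (hypothetical)
demo = np.array([[0, 5, 0, 0],
                 [3, 0, 4, 0],
                 [1, 1, 2, 2]])

cleaned = remove_after_first_copy(demo, 0)
print(cleaned)  # repeats of 0 are replaced by 7215
print(demo)     # the input is left untouched
```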

Make a 2D array B whose first column is the row index and whose second column is the corresponding value from A, using np.where to skip the placeholder value 7215.

B = np.where(A != 7215)
B = np.hstack((B[0].reshape(-1, 1), A[B].reshape(-1, 1)))

Remove every entry whose value (a column index) equals its row index.

B = B[B[:, 0] != B[:, 1]]
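
To see what those two steps produce, here they are on a toy array. This sketch uses np.nonzero and np.column_stack, which should give the same (row, value) pairs as the np.where / np.hstack version above; FILL, A_small and pairs are illustrative names only.

```python
import numpy as np

FILL = 7215  # placeholder value, same number the question uses

# Toy input (hypothetical)
A_small = np.array([[1, 0, FILL],
                    [1, 2, 0],
                    [2, 2, FILL]])

# Row and column positions of every entry that is not the placeholder
r, c = np.nonzero(A_small != FILL)

# One (row index, value) pair per remaining entry
pairs = np.column_stack((r, A_small[r, c]))

# Drop the pairs where the value equals the row index
pairs = pairs[pairs[:, 0] != pairs[:, 1]]
print(pairs)
```

Row 2 contains only its own index, so it contributes no pairs at all.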

If you really want to tear them into a ragged list of arrays based on row you can do that pretty easily. The row/col indices are usually more useful though.

B_list = [B[B[:, 0] == i, 1] for i in range(A.shape[0])]

>>> B_list

[array([7, 6, 4, 9]),
 array([8, 2, 9, 3]),
 array([1, 5, 8]),
 array([1, 8, 9]),
 array([9, 0, 8, 9]),
 array([2, 6, 8, 7]),
 array([5, 0, 7]),
 array([8, 0, 5, 6]),
 array([1, 7, 2, 5, 3, 9, 9, 4]),
 array([1, 4, 3, 8, 8, 0, 4])]
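
As a cross-check of the output above, here is a plain-Python loop that applies both rules row by row. It is a slow reference sketch, not the vectorized approach, and filter_rows, F and the once_only parameter are names made up for this example.

```python
import numpy as np

F = 7215  # placeholder value from the question

A = np.array([[7, 0, 6, 0, 4, 0, 9, 0, F, F],
              [1, 8, 1, 2, 1, 9, 1, 3, F, F],
              [1, 5, 1, 8, F, F, F, F, F, F],
              [1, 8, 1, 9, F, F, F, F, F, F],
              [9, 0, 8, 9, F, F, F, F, F, F],
              [2, 6, 8, 7, F, F, F, F, F, F],
              [5, 0, 7, 0, F, F, F, F, F, F],
              [8, 0, 5, 6, F, F, F, F, F, F],
              [1, 7, 2, 5, 3, 9, 9, 4, F, F],
              [1, 4, 3, 8, 8, 0, 4, 0, F, F]], dtype=int)

def filter_rows(arr, fill=F, once_only=(0, 1)):
    """Apply both rules from the question to each row of arr."""
    result = []
    for i, row in enumerate(arr.tolist()):
        seen = set()
        kept = []
        for v in row:
            if v == fill or v == i:   # skip padding and the diagonal index
                continue
            if v in once_only:        # 0 and 1 may appear only once per row
                if v in seen:
                    continue
                seen.add(v)
            kept.append(v)
        result.append(kept)
    return result

rows_out = filter_rows(A)
for i, kept in enumerate(rows_out):
    print(f"row {i}: {kept}")
```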