Replacing chunks of numpy array on condition

Let's say I have the following numpy array, of 1's and 0's exclusively:

import numpy as np

example = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8)

I want to group all elements into chunks of 3, and replace the chunks by a single value, based on a condition. Let's say I want [0,1,1] to become 5, and [0,1,0] to become 10. Thus the desired output would be:

[5,10,5]

All possible combinations of 1's and 0's in a chunk have a corresponding unique value that should replace the chunk. What's the fastest way to do this?

CodePudding user response：

I suggest you reshape your array into an array with 3 columns. Each row can then be read as a binary number, and that number serves as an index into a lookup array holding the values you want. Convert each row to its number and index into the lookup array.

arr = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8).reshape(-1,3)

idx = 2**0*arr[:,0] + 2**1*arr[:,1] + 2**2*arr[:,2]

values = np.zeros(2**3)
values[0*2**0 + 1*2**1 + 1*2**2] = 5   # [0,1,1] -> 5
values[0*2**0 + 1*2**1 + 0*2**2] = 10  # [0,1,0] -> 10

values[idx]

This gives:

array([ 5., 10.,  5.])

Or, if you prefer to write the conversion more succinctly, though in a slightly less elementary way (thanks to @mozway for the idea):

def bin_vect_to_int(arr):
    bin_units = 2**np.arange(arr.shape[1])
    return np.dot(arr,bin_units)


arr = np.array([0,1,1,0,1,0,0,1,1,0,1,1], dtype=np.uint8).reshape(-1,3)
idx = bin_vect_to_int(arr)

values = np.zeros(2**3)
values[bin_vect_to_int(np.array([[0,1,1]]))] = 5
values[bin_vect_to_int(np.array([[0,1,0]]))] = 10

values[idx]
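
For reference, the steps above can be collected into one runnable sketch. The dtype=np.int64 on values is my own assumption (the snippets above use the float default of np.zeros); it is only there so the result comes back as integers. Patterns that were never assigned keep the placeholder 0.

```python
import numpy as np

def bin_vect_to_int(arr):
    # Column j carries weight 2**j, so the first column is the least significant bit.
    bin_units = 2 ** np.arange(arr.shape[1])
    return np.dot(arr, bin_units)

arr = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1], dtype=np.uint8).reshape(-1, 3)
idx = bin_vect_to_int(arr)  # one integer in 0..7 per row

# Lookup table with one slot per possible 3-bit pattern (integer dtype is an assumption).
values = np.zeros(2**3, dtype=np.int64)
values[bin_vect_to_int(np.array([[0, 1, 1]]))] = 5
values[bin_vect_to_int(np.array([[0, 1, 0]]))] = 10

result = values[idx]
print(result)  # [5, 10, 5, 5]
```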

CodePudding user response：

You can use a contiguous view of the array reshaped to (-1, 3), find the locations of the unique rows, and put the replacement values at those locations:

def view_ascontiguous(a): # a is array
    a = np.ascontiguousarray(a)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel()

def replace(x, args, subs, viewer):
    u, inv = np.unique(viewer(x), return_inverse=True)
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]

>>> replace(x=np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3),
        args=np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]),
        subs=np.array([ 5, 57, 58, 44, 67, 17, 77,  1]),
        viewer=view_ascontiguous)
array([17, 57, 17])

Now for the fun part: you can supply your own viewer. It only needs to map the array you pass as args to some ascending sequence of keys, like so:

viewer=lambda arr: np.ravel_multi_index(arr.T, (2,2,2)) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.sum(arr * [4, 2, 1], axis=1) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.dot(arr, [4, 2, 1]) #0, 1, 2, 3, 4, 5, 6, 7

Or even more fun:

viewer=lambda arr: 2*np.dot(arr, [4, 2, 1]) + 1 #1, 3, 5, 7, 9, 11, 13, 15
viewer=lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])) #a b c d e f g h

This works because you can map

[[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]

to any ascending sequence you can think of, such as [1, 3, 5, 7, 9, 11, 13, 15] or ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'], and the result stays the same.
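
As a quick check of that claim, here is a small sketch that runs the same replace function with several of these viewers and compares the outputs. The viewers dict and its key names are just mine for illustration, and I pass tuple(arr.T) to np.ravel_multi_index to give it an explicit tuple of index arrays.

```python
import numpy as np

def replace(x, args, subs, viewer):
    # Same function as in the answer, repeated so the sketch runs on its own.
    u, inv = np.unique(viewer(x), return_inverse=True)
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]

x = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3)
args = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
                 [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
subs = np.array([5, 57, 58, 44, 67, 17, 77, 1])

# Hypothetical collection of viewers; each one maps the rows of args to ascending keys.
viewers = {
    "ravel_multi_index": lambda arr: np.ravel_multi_index(tuple(arr.T), (2, 2, 2)),
    "dot": lambda arr: np.dot(arr, [4, 2, 1]),
    "odd numbers": lambda arr: 2 * np.dot(arr, [4, 2, 1]) + 1,
    "letters": lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])),
}

results = {name: replace(x, args, subs, viewer) for name, viewer in viewers.items()}
for name, out in results.items():
    print(name, out)  # each viewer should give [17, 57, 17]
```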

CodePudding user response：

As the other answers indicated, you can start by reshaping your array (actually, you should probably generate it with the correct shape to begin with, but that's another issue):

example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)
data = example.reshape(-1, 3)

Running a custom Python function over the array is going to be slow, but luckily numpy has your back here. You can use np.packbits to transform each row into a number directly:

data = np.packbits(data, axis=1, bitorder='little').ravel() # [6, 2, 6]
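
To see which integer each row gets under bitorder='little', you can enumerate all eight possible rows. This is only an illustrative sketch; the names rows and codes are mine and are not used elsewhere in this answer.

```python
import numpy as np

# All eight possible rows of three bits, from [0, 0, 0] to [1, 1, 1].
rows = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)],
                dtype=np.uint8)

# With bitorder='little', the first element of a row has weight 1,
# the second weight 2 and the third weight 4.
codes = np.packbits(rows, axis=1, bitorder='little').ravel()

for row, code in zip(rows, codes):
    print(row, "->", code)
```

So [0, 1, 1] becomes 6 and [0, 1, 0] becomes 2, which matches the [6, 2, 6] above.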

If you wanted [1, 0, 1] to map to 5 and [0, 1, 1] to map to 6 (with bitorder='little', the first element of each row is the least significant bit), your job is done. Otherwise, you will need to come up with a mapping. Since you have three bits, you only need 8 numbers in the mapping array:

mapping = np.array([7, 4, 3, 8, 124, 1, 5, 0])

You can use data as an index directly into mapping. The output will have the type of mapping but the shape of data:

result = mapping[data]  # [5, 3, 5]

You can do this in one line:

mapping[np.packbits(example.reshape(-1, 3), axis=1, bitorder='little').ravel()]
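
For the replacement asked about in the question ([0,1,1] to 5 and [0,1,0] to 10), the mapping array could be filled in from a small table of patterns. This is a minimal sketch: the patterns dict and the placeholder 0 for unlisted patterns are assumptions for illustration, since the question only gives two of the eight values.

```python
import numpy as np

# Hypothetical pattern -> value table; only the two patterns from the question are known.
patterns = {(0, 1, 1): 5, (0, 1, 0): 10}

# One slot per possible 3-bit pattern; unlisted patterns keep the placeholder 0.
mapping = np.zeros(8, dtype=np.int64)
for bits, value in patterns.items():
    # Pack the pattern the same way as the data so both use the same bit order.
    index = np.packbits(np.array(bits, dtype=np.uint8), bitorder='little')[0]
    mapping[index] = value

example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)
codes = np.packbits(example.reshape(-1, 3), axis=1, bitorder='little').ravel()
result = mapping[codes]
print(result)  # [5, 10, 5]
```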