Replacing chunks of numpy array on condition

Time:01-20

Let's say I have the following numpy array, of 1's and 0's exclusively:

import numpy as np

example = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8)

I want to group all elements into chunks of 3, and replace the chunks by a single value, based on a condition. Let's say I want [0,1,1] to become 5, and [0,1,0] to become 10. Thus the desired output would be:

[5,10,5]

All possible combinations of 1's and 0's in a chunk have a corresponding unique value that should replace the chunk. What's the fastest way to do this?

CodePudding user response:

I suggest you reshape your array into an N-by-3 array. Each row can then be read as a binary number that serves as an index into a lookup table of the values you want. Convert each row to that number, then index into the table.

arr = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8).reshape(-1,3)

idx = 2**0*arr[:,0] + 2**1*arr[:,1] + 2**2*arr[:,2]

values = np.zeros(2**3)
values[0 *2**0 + 1 *2**1 + 1 *2**2] = 5
values[0 *2**0 + 1 *2**1 + 0 *2**2] = 10

values[idx]

This gives:

array([ 5., 10.,  5.])

Or, if you prefer a more succinct (if slightly less elementary) way to write the conversion (thanks to @mozway for the idea):

def bin_vect_to_int(arr):
    bin_units = 2**np.arange(arr.shape[1])
    return np.dot(arr,bin_units)


arr = np.array([0,1,1,0,1,0,0,1,1,0,1,1], dtype=np.uint8).reshape(-1,3)
idx = bin_vect_to_int(arr)

values = np.zeros(2**3)
values[bin_vect_to_int(np.array([[0,1,1]]))] = 5
values[bin_vect_to_int(np.array([[0,1,0]]))] = 10

values[idx]
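If you have more than a couple of rules, filling the table by hand gets tedious. Here is a minimal sketch of one way to build it from a dict of chunk -> value. Note that build_table, its default argument and the integer dtype are my own additions for illustration, not part of the answer above:

```python
import numpy as np

def bin_vect_to_int(arr):
    # Row [b0, b1, b2] -> b0*1 + b1*2 + b2*4 (first column is the least significant bit)
    bin_units = 2 ** np.arange(arr.shape[1])
    return np.dot(arr, bin_units)

def build_table(rules, width=3, default=0):
    # Hypothetical helper: one table slot per possible chunk,
    # chunks without a rule fall back to `default`
    table = np.full(2 ** width, default, dtype=np.int64)
    for pattern, value in rules.items():
        table[bin_vect_to_int(np.array([pattern]))] = value
    return table

table = build_table({(0, 1, 1): 5, (0, 1, 0): 10})

arr = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1], dtype=np.uint8).reshape(-1, 3)
result = table[bin_vect_to_int(arr)]
print(result.tolist())  # [5, 10, 5, 5]
```

The table is built once, so the per-array cost is just one dot product and one fancy-indexing step.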

CodePudding user response:

You can use a contiguous array view of shape (-1, 3), find the locations of the unique rows, and substitute the replacement values at those locations:

def view_ascontiguous(a): # a is array
    a = np.ascontiguousarray(a)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel()

def replace(x, args, subs, viewer):
    u, inv = np.unique(viewer(x), return_inverse=True)
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]

>>> replace(x=np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3),
        args=np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]),
        subs=np.array([ 5, 57, 58, 44, 67, 17, 77,  1]),
        viewer=view_ascontiguous)
array([17, 57, 17])

Now the fun part: you can supply your own viewer. It only has to map the rows of the array you pass in args to some ascending sequence of keys, like so:

viewer=lambda arr: np.ravel_multi_index(arr.T, (2,2,2)) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.sum(arr * [4, 2, 1], axis=1) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.dot(arr, [4, 2, 1]) #0, 1, 2, 3, 4, 5, 6, 7

Or even more fun:

viewer=lambda arr: 2*np.dot(arr, [4, 2, 1]) + 1 #1, 3, 5, 7, 9, 11, 13, 15
viewer=lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])) #a b c d e f g h

since you could also map

[[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]

to any ascending sequence you could think of like [1, 3, 5, 7, 9, 11, 13, 15] or ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'] and the result remains the same.
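A quick way to convince yourself of this is to run the same replacement with a few different viewers and compare the results. The sketch below repeats replace so that it runs on its own; the viewers dict and its key names are just for illustration:

```python
import numpy as np

def replace(x, args, subs, viewer):
    # Unique keys present in x, plus for every row its position within u
    u, inv = np.unique(viewer(x), return_inverse=True)
    # Position of each unique key among the ascending keys of args
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]

args = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
                 [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
subs = np.array([5, 57, 58, 44, 67, 17, 77, 1])
x = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3)

# Three viewers that all map the rows of args to an ascending sequence
viewers = {
    "dot": lambda arr: np.dot(arr, [4, 2, 1]),
    "odd": lambda arr: 2 * np.dot(arr, [4, 2, 1]) + 1,
    "letters": lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])),
}

results = {name: replace(x, args, subs, v).tolist() for name, v in viewers.items()}
print(results)  # every viewer yields [17, 57, 17]
```

The only thing replace relies on is that viewer(args) is sorted, because np.searchsorted assumes a sorted first argument.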

CodePudding user response:

As the other answers indicated, you can start by reshaping your array (actually, you should probably generate it with the correct shape to begin with, but that's another issue):

example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)
data = example.reshape(-1, 3)

Running a custom Python function over the array would be slow, but luckily NumPy has your back here. You can use np.packbits to transform each row into a number directly:

data = np.packbits(data, axis=1, bitorder='little').ravel() # [6, 2, 6]
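The bitorder argument matters here because np.packbits pads each row with zeros up to 8 bits. A small sketch of the difference (the rows array and the big_shifted name are mine, just for illustration):

```python
import numpy as np

rows = np.array([[0, 1, 1], [0, 1, 0], [1, 0, 1]], dtype=np.uint8)

# 'little': the first element of each row is the least significant bit
little = np.packbits(rows, axis=1, bitorder='little').ravel()

# 'big' (the default): the first element lands in bit 7 and the padding
# zeros fill the low bits, so the codes are not in the range 0..7
big = np.packbits(rows, axis=1, bitorder='big').ravel()

# Shifting right by 8 - 3 = 5 bits brings them back into 0..7,
# now reading each row left to right as a binary number
big_shifted = big >> 5

print(little.tolist())       # [6, 2, 5]
print(big.tolist())          # [96, 64, 160]
print(big_shifted.tolist())  # [3, 2, 5]
```

Either convention works as an index, as long as the mapping array is laid out to match it.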

If you wanted [1, 0, 1] to map to 5 and [0, 1, 1] to map to 6, your job is done (with bitorder='little', the first element of each chunk is the least significant bit). Otherwise, you will need to come up with a mapping. Since you have three bits, you only need 8 numbers in the mapping array:

mapping = np.array([7, 4, 3, 8, 124, 1, 5, 0])

You can use data as an index directly into mapping. The output will have the type of mapping but the shape of data:

result = mapping[data]  # [5, 3, 5]

You can do this in one line:

mapping[np.packbits(example.reshape(-1, 3), axis=1, bitorder='little').ravel()]
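For the values asked about in the question ([0,1,1] -> 5 and [0,1,0] -> 10), the mapping could be filled in like this. It's only a sketch: the question doesn't say what the other six chunks should become, so I'm assuming 0 as a placeholder for them:

```python
import numpy as np

example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)

# Placeholder: chunks without an explicit rule map to 0 (my assumption)
mapping = np.zeros(8, dtype=np.int64)
mapping[0b110] = 5    # chunk [0, 1, 1]; the first element is the lowest bit
mapping[0b010] = 10   # chunk [0, 1, 0]

codes = np.packbits(example.reshape(-1, 3), axis=1, bitorder='little').ravel()
result = mapping[codes]
print(result.tolist())  # [5, 10, 5]
```

Writing the indices as binary literals keeps the bit order visible: read the literal right to left to get the chunk.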