Let's say I have the following numpy array, of 1's and 0's exclusively:
import numpy as np
example = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8)
I want to group all elements into chunks of 3, and replace the chunks by a single value, based on a condition. Let's say I want [0,1,1] to become 5, and [0,1,0] to become 10. Thus the desired output would be:
[5,10,5]
All possible combinations of 1's and 0's in a chunk have a corresponding unique value that should replace the chunk. What's the fastest way to do this?
CodePudding user response:
I suggest you reshape your array into an N-by-3 array. Each row can then be read as a binary number that serves as an index into a list of the values you want. Convert each row to that number and use it to index into the values.
arr = np.array([0,1,1,0,1,0,0,1,1], dtype=np.uint8).reshape(-1,3)
idx = 2**0*arr[:,0] + 2**1*arr[:,1] + 2**2*arr[:,2]
values = np.zeros(2**3)
values[0*2**0 + 1*2**1 + 1*2**2] = 5
values[0*2**0 + 1*2**1 + 0*2**2] = 10
values[idx]
This gives:
array([ 5., 10., 5.])
Or, if you prefer to write the conversion more succinctly, though a bit less elementary (thanks to @mozway for the idea):
def bin_vect_to_int(arr):
    # Column j carries the weight 2**j
    bin_units = 2**np.arange(arr.shape[1])
    return np.dot(arr, bin_units)
arr = np.array([0,1,1,0,1,0,0,1,1,0,1,1], dtype=np.uint8).reshape(-1,3)
idx = bin_vect_to_int(arr)
values = np.zeros(2**3)
values[bin_vect_to_int(np.array([[0,1,1]]))] = 5
values[bin_vect_to_int(np.array([[0,1,0]]))] = 10
values[idx]
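Since the question says every combination has its own replacement value, the whole lookup table can be filled in one assignment instead of one entry at a time. The following is only a sketch: the `table` dictionary is a hypothetical name, and apart from [0,1,1] -> 5 and [0,1,0] -> 10 (taken from the question) the replacement values are made-up placeholders.

```python
import numpy as np

def bin_vect_to_int(arr):
    # Column j carries the weight 2**j, so the first element is the least significant bit.
    bin_units = 2 ** np.arange(arr.shape[1])
    return np.dot(arr, bin_units)

# Hypothetical table: chunk pattern -> replacement value.
# Only (0,1,1) -> 5 and (0,1,0) -> 10 come from the question; the rest are placeholders.
table = {
    (0, 0, 0): 0,
    (0, 0, 1): 1,
    (0, 1, 0): 10,
    (0, 1, 1): 5,
    (1, 0, 0): 2,
    (1, 0, 1): 3,
    (1, 1, 0): 4,
    (1, 1, 1): 6,
}

patterns = np.array(list(table.keys()))
values = np.zeros(2 ** 3, dtype=np.int64)
# Fill all eight slots at once: slot index = integer encoding of the pattern.
values[bin_vect_to_int(patterns)] = list(table.values())

arr = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8).reshape(-1, 3)
result = values[bin_vect_to_int(arr)]
print(result)
```

Using an integer dtype for `values` keeps the output as integers rather than the floats that `np.zeros` produces by default.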
CodePudding user response:
You can view a contiguous array of shape (-1, 3) so that each row becomes a single item, find the unique rows, and look up their replacements:
def view_ascontiguous(a): # a is a 2D array
    a = np.ascontiguousarray(a)
    # One void item per row, covering the bytes of the whole row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel()
def replace(x, args, subs, viewer):
    u, inv = np.unique(viewer(x), return_inverse=True)
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]
>>> replace(x=np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3),
...         args=np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]),
...         subs=np.array([ 5, 57, 58, 44, 67, 17, 77, 1]),
...         viewer=view_ascontiguous)
array([17, 57, 17])
Now for the fun part: you can supply your own viewer. It only has to map the rows you pass in args to values in ascending order, like so:
viewer=lambda arr: np.ravel_multi_index(arr.T, (2,2,2)) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.sum(arr * [4, 2, 1], axis=1) #0, 1, 2, 3, 4, 5, 6, 7
viewer=lambda arr: np.dot(arr, [4, 2, 1]) #0, 1, 2, 3, 4, 5, 6, 7
Or even more fun:
viewer=lambda arr: 2*np.dot(arr, [4, 2, 1]) + 1 #1, 3, 5, 7, 9, 11, 13, 15
viewer=lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])) #a b c d e f g h
since you could also map
[[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
to any ascending sequence you could think of like [1, 3, 5, 7, 9, 11, 13, 15] or ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
and the result remains the same.
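To check that claim, here is a small sketch that runs the same replace function with several of the viewers above and collects the results. The `viewers` and `results` dictionaries and their keys are names introduced only for this illustration.

```python
import numpy as np

def replace(x, args, subs, viewer):
    # Encode rows, find the unique codes, then look each one up in the encoded args.
    u, inv = np.unique(viewer(x), return_inverse=True)
    idx = np.searchsorted(viewer(args), u)
    return subs[idx][inv]

x = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1]).reshape(-1, 3)
args = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
                 [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
subs = np.array([5, 57, 58, 44, 67, 17, 77, 1])

# Each viewer maps the rows of args to a different ascending sequence.
viewers = {
    "ravel_multi_index": lambda arr: np.ravel_multi_index(arr.T, (2, 2, 2)),
    "dot": lambda arr: np.dot(arr, [4, 2, 1]),
    "odd_numbers": lambda arr: 2 * np.dot(arr, [4, 2, 1]) + 1,
    "letters": lambda arr: np.vectorize(chr)(97 + np.dot(arr, [4, 2, 1])),
}

results = {name: replace(x, args, subs, v) for name, v in viewers.items()}
for name, res in results.items():
    print(name, res)
```

The viewer only needs to preserve the ordering of the rows in args, because np.searchsorted assumes its first argument is sorted.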
CodePudding user response:
As the other answers indicated, you can start by reshaping your array (actually, you should probably generate it with the correct shape to begin with, but that's another issue):
example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)
data = example.reshape(-1, 3)
Running a custom Python function over the array is going to be slow, but luckily NumPy has your back here. You can use np.packbits to transform each row into a number directly:
data = np.packbits(data, axis=1, bitorder='little').ravel() # [6, 2, 6]
If you wanted [1, 0, 1] to map to 5 and [0, 1, 1] to map to 6 (with bitorder='little', the first element of each row is the least significant bit), your job is done. Otherwise, you will need to come up with a mapping. Since you have three bits, you only need 8 numbers in the mapping array:
mapping = np.array([7, 4, 3, 8, 124, 1, 5, 0])
You can use data as an index directly into mapping. The output will have the type of mapping but the shape of data:
result = mapping[data] # [5, 3, 5]
You can do this in one line:
mapping[np.packbits(example.reshape(-1, 3), axis=1, bitorder='little').ravel()]
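If you need this in more than one place, the one-liner could be wrapped in a small helper. This is a sketch under a few assumptions: `replace_chunks` and its `width` parameter are hypothetical names, the input length is required to be a multiple of the chunk width, and the width is limited to 8 because np.packbits packs each row into uint8 bytes.

```python
import numpy as np

def replace_chunks(bits, mapping, width=3):
    """Hypothetical helper: replace each chunk of `width` bits by mapping[code]."""
    bits = np.asarray(bits, dtype=np.uint8)
    if not 1 <= width <= 8:
        # With more than 8 bits per row, packbits would emit several bytes per row.
        raise ValueError("width must be between 1 and 8")
    if bits.size % width:
        raise ValueError("input length must be a multiple of width")
    # bitorder='little' makes the first element of each chunk the least significant bit.
    codes = np.packbits(bits.reshape(-1, width), axis=1, bitorder='little').ravel()
    return np.asarray(mapping)[codes]

example = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1], dtype=np.uint8)
mapping = np.array([7, 4, 3, 8, 124, 1, 5, 0])
result = replace_chunks(example, mapping)
print(result)
```

The mapping array needs 2**width entries so that every possible code is a valid index.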
