Some answers on Stack Overflow suggest using an ndarray of ndarrays when working with data in which the number of elements per row is not constant (see "How to make a multidimension numpy array with a varying row size?").
Is NumPy optimized to work on a structure like that?
Here's a simplified example of such a structure:
import numpy as np
x = np.array([1,2,3])
y = np.array([4,5])
data = np.array([x,y],dtype=object)
It's possible to do operations like:
print(data + 1)
print(data + data)
But some operations fail, such as:
print(np.sum(data))
What's happening behind the scenes with this type of structure?
Answer:
Like a list, an object dtype array can contain objects of any kind. For example:
In [6]: arr = np.array([1,"two",[1,2,3],np.array([4,5,6])], object)
In [7]: arr
Out[7]: array([1, 'two', list([1, 2, 3]), array([4, 5, 6])], dtype=object)
Look what happens when we do addition:
In [8]: arr + arr
Out[8]:
array([2, 'twotwo', list([1, 2, 3, 1, 2, 3]), array([ 8, 10, 12])],
      dtype=object)
In [10]: arr*2
Out[10]:
array([2, 'twotwo', list([1, 2, 3, 1, 2, 3]), array([ 8, 10, 12])],
      dtype=object)
For lists and strings, these operations are defined as concatenation (for +) and replication (for *). In effect, arr + arr is doing [x.__add__(x) for x in arr], where __add__ is the class-specific operation; arr*2 does the same with __mul__.
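A minimal sketch of that equivalence, using operator.add as a stand-in for the per-element + call (NumPy's own loop runs in C, so this only mirrors the results, not the implementation). It also suggests why np.sum(data) from the question fails: the reduction ends up adding the two inner arrays, whose shapes (3,) and (2,) cannot be broadcast together. The names via_numpy, via_loop and sum_error are just illustrative.

```python
import operator
import numpy as np

# Object array holding an int, a str, and a list.
arr = np.array([1, "two", [1, 2, 3]], dtype=object)

# What NumPy computes for an object array ...
via_numpy = (arr + arr).tolist()
# ... and the equivalent per-element Python loop.
via_loop = [operator.add(x, x) for x in arr]
print(via_numpy == via_loop)

# The ragged array from the question: np.sum reduces with +,
# so it ends up evaluating data[0] + data[1].
data = np.array([np.array([1, 2, 3]), np.array([4, 5])], dtype=object)
try:
    np.sum(data)
    sum_error = None
except ValueError as exc:
    sum_error = exc  # the two inner arrays have incompatible shapes
print(type(sum_error).__name__)
```

Because every element is handled by its own Python-level operator, an object array behaves much more like a list than like a numeric ndarray.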
np.exp doesn't work because it tries to do [x.exp() for x in arr], and almost no one defines an exp method.
In [11]: np.exp(arr)
AttributeError: 'int' object has no attribute 'exp'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-11-16c1c90aa297>", line 1, in <module>
np.exp(arr)
TypeError: loop of ufunc does not support argument 0 of type int which has no callable exp method
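One possible workaround, sketched under the assumption that every element is itself numeric (a number or a numeric ndarray, as in the question's data): call np.exp on each element in an explicit Python loop. exp_each is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def exp_each(obj_arr):
    """Hypothetical helper: apply np.exp to every element of a 1-D object array."""
    out = np.empty(len(obj_arr), dtype=object)
    for i, item in enumerate(obj_arr):
        # Each call is vectorized only within a single element.
        out[i] = np.exp(item)
    return out

# Ragged data like the question's, stored as an object array of float arrays.
data = np.array([np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])], dtype=object)
result = exp_each(data)
print(result[0].shape, result[1].shape)
```

The outer loop runs at Python speed, so this gains nothing over a plain list of arrays; only the work inside each inner array uses NumPy's compiled code.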
