The documentation at https://numpy.org/doc/stable/reference/generated/numpy.einsum.html includes this example:
Broadcasting and scalar multiplication:
np.einsum('..., ...', 3, c)
array([[ 0, 3, 6],
       [ 9, 12, 15]])
It seems einsum can mimic the alpha/beta prefactors in DGEMM (http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html).
Does this imply that including the scalar multiplication inside einsum as a single step will be faster than doing it in two steps: (1) A,B->C and (2) C*prefactor?
I tried to extend an example from https://ajcr.net/Basic-guide-to-einsum/ as follows:
import numpy as np
A = np.array([0, 1, 2])
B = np.array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])
C = np.einsum('i,ij->i', 2., A, B)
print(C)
but got ValueError: einstein sum subscripts string contains too many subscripts for operand.
So my question is: is there a way to include a scalar factor inside einsum, and does doing so speed up the calculation?
CodePudding user response:
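I haven't used this scalar feature before, but here's how it works. The scalar is an operand like any other, so it needs its own (empty) term in the subscripts string: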
In [422]: np.einsum('i,ij->i',A,B)
Out[422]: array([ 0, 22, 76])
In [423]: np.einsum(',i,ij->i',2,A,B)
Out[423]: array([ 0, 44, 152])
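To make the difference from the question's call explicit, here is a minimal sketch. It assumes the same A and B as in the question (B is rebuilt with np.arange for brevity); the names failed, fused and two_step are only illustrative. In the failing call there are two subscript terms for three operands, so 'i' gets matched to the 0-d scalar. Adding an empty term before the first comma gives the scalar its own slot.

```python
import numpy as np

A = np.array([0, 1, 2])
B = np.arange(12).reshape(3, 4)  # same values as B in the question

# Two subscript terms for three operands: 'i' is matched to the 0-d scalar,
# which has no axes to label, so einsum raises ValueError.
try:
    np.einsum('i,ij->i', 2.0, A, B)
    failed = False
except ValueError:
    failed = True

# Three terms for three operands; the scalar's term is the empty string.
fused = np.einsum(',i,ij->i', 2.0, A, B)
two_step = 2.0 * np.einsum('i,ij->i', A, B)

print(failed)
print(fused)
print(np.array_equal(fused, two_step))
```

Both forms compute the same values; the only change is where the multiplication by 2 happens.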
The time savings appear to be minor for these small arrays:
In [424]: timeit np.einsum(',i,ij->i',2,A,B)
11.5 µs ± 271 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [425]: timeit 2*np.einsum('i,ij->i',A,B)
12.3 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
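The arrays above are tiny, so call overhead probably dominates those timings. The following is a rough sketch for repeating the comparison on larger inputs using the standard timeit module. The size n = 500 and the random test data are assumptions chosen for illustration, and the numbers will vary by machine and NumPy version, so no expected timings are shown.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical problem size; adjust to match your workload
A = rng.random(n)
B = rng.random((n, n))


def fused():
    # Scalar passed as an extra operand with an empty subscript term.
    return np.einsum(',i,ij->i', 2.0, A, B)


def two_step():
    # Contraction first, then a separate elementwise scaling.
    return 2.0 * np.einsum('i,ij->i', A, B)


# Sanity check: both variants must agree before timing them.
assert np.allclose(fused(), two_step())

calls = 20
for name, fn in (("fused", fused), ("two-step", two_step)):
    best = min(timeit.repeat(fn, number=calls, repeat=3))
    print(f"{name}: {best / calls * 1e6:.1f} µs per call")
```

For this particular contraction the result has only n elements, while the contraction itself reads all n*n entries of B, so the separate scaling step is a small part of the total work and a large speedup from fusing would be surprising.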
Another example, with two scalar operands (each gets its own empty term):
In [427]: np.einsum(',i,,ij->i',3,A,2,B)
Out[427]: array([ 0, 132, 456])
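Coming back to the DGEMM comparison in the question: DGEMM computes C := alpha*op(A)*op(B) + beta*C, and einsum only expresses products and sums over indices, so the scalar operand can stand in for alpha but the beta*C term still has to be added separately. A minimal sketch, where the shapes, the values of alpha and beta, and the random test matrices are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 2.0, 0.5  # illustrative prefactors
A = rng.random((3, 4))
B = rng.random((4, 5))
C = rng.random((3, 5))

# alpha enters einsum as a scalar operand; beta * C is a separate addition.
updated = np.einsum(',ik,kj->ij', alpha, A, B) + beta * C

# Reference using the matrix-multiplication operator.
reference = alpha * (A @ B) + beta * C

print(np.allclose(updated, reference))
```

So einsum covers the alpha part of the DGEMM update in one call, but not the full update.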
