Different broadcasting rules between R and Python when I use elementwise multiplication (*)

Time:01-27

I tried elementwise multiplication (*) of a matrix and a vector in R and in Python, and got different results, as shown below.

This is R:

#R input
a=matrix(c(1,2,3,4,5,6,7,8),byrow=TRUE,nrow=4)
b=c(9,10)
print(a)
print(b)
print(a*b)

#R output
# a
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
# b
[1]  9 10
# a*b
     [,1] [,2]
[1,]    9   18
[2,]   30   40
[3,]   45   54
[4,]   70   80

This is Python:

#Python input
import numpy as np
a=np.array([[1,2],[3,4],[5,6],[7,8]])
b=np.array([9,10])
print(a)
print(b)
print(a*b)

#Python output
# a
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
# b
[ 9 10]
# a*b
[[ 9 20]
 [27 40]
 [45 60]
 [63 80]]

It seems that R and Python expand vector b differently when carrying out *. R uses the following matrix to multiply a element-by-element:

     [,1] [,2]
[1,]    9    9
[2,]   10   10
[3,]    9    9
[4,]   10   10

While Python uses the following matrix:

[[ 9 10]
 [ 9 10]
 [ 9 10]
 [ 9 10]]

Could anyone explain why R and Python broadcast differently for *? Does Python have a function or operator that gives the same result as R?

Thank you!

CodePudding user response:

I don't know R, so won't try to explain its behavior.

In [130]: a=np.array([[1,2],[3,4],[5,6],[7,8]])
     ...: b=np.array([9,10])
In [131]: a.shape
Out[131]: (4, 2)
In [132]: b.shape
Out[132]: (2,)
In [133]: a*b
Out[133]: 
array([[ 9, 20],
       [27, 40],
       [45, 60],
       [63, 80]])

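The broadcasting rules for numpy are:

  • add leading size 1 dimensions to the array with fewer dimensions, until both arrays have the same number of dimensions
  • stretch size 1 dimensions to match the other array's size along that axis (any other size mismatch raises an error).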

In your case this amounts to:

(4,2) * (2,) => (4,2) * (1,2) => (4,2) * (4,2) => (4,2)

So b is treated as a (4,2) array with 9 in the first column and 10 in the second. Reshaping b to (1,2) explicitly gives the same result:

In [134]: a*b.reshape(1,2)
Out[134]: 
array([[ 9, 20],
       [27, 40],
       [45, 60],
       [63, 80]])
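
To inspect the array that broadcasting effectively builds from b, np.broadcast_to can make the expansion explicit. This is only a minimal sketch; the variable name `expanded` is illustrative and not part of the original answer.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([9, 10])

# Read-only view of b stretched to a's shape, which is what
# broadcasting does implicitly in a * b
expanded = np.broadcast_to(b, a.shape)
print(expanded)

# Multiplying by the explicit expansion matches the broadcast product
print(np.array_equal(a * expanded, a * b))
```

Every row of `expanded` is a copy of b, which is the second matrix shown in the question.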

CodePudding user response:

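The R code uses the recycling rule: if a vector is too short, it is repeated as many times as needed to match the length of the other operand. Because R stores matrices in column-major order, b is recycled down the columns of a, so each column of the expanded matrix reads 9, 10, 9, 10. NumPy instead uses broadcasting rules, which describe what happens when an operation involves arrays of different shapes and which align shapes starting from the last dimension. The two sets of rules are not the same, so you get different results.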

There are various ways in which you can get the same result as in R using numpy. For example:

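a * np.tile(b, (1,2)).T

Here np.tile(b, (1,2)) produces the (1,4) array [[9, 10, 9, 10]]; transposing it gives a (4,1) column, which then broadcasts across both columns of a. It gives: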

array([[ 9, 18],
       [30, 40],
       [45, 54],
       [70, 80]])
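
For shapes other than this one, a more general way to imitate R's recycling is to repeat the vector to the matrix's total size and fill it in column-major order. The helper name `recycle_like` below is hypothetical, and this is a sketch under the assumption that R-style column-major recycling is what you want, not a complete reimplementation of R's semantics.

```python
import numpy as np

def recycle_like(vec, target):
    """Hypothetical helper: repeat vec cyclically and lay it out
    column by column in an array with target's shape."""
    # np.resize repeats vec as often as needed to reach target.size elements
    flat = np.resize(vec, target.size)
    # order="F" fills columns first, mirroring R's column-major storage
    return flat.reshape(target.shape, order="F")

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([9, 10])

recycled = recycle_like(b, a)
print(recycled)

result = a * recycled
print(result)
```

Note that np.resize silently truncates the last repetition when the vector length does not divide the matrix size evenly, so add your own length check if that case matters.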