Broadcasting NumPy Arrays for Arithmetic Operations

Let's examine the rules of broadcasting, how it works, and examples of broadcasting with various arrays.

By Ravikiran Srinivasulu

Oct 29, 2018 • 7 Minute Read

Introduction

Before understanding what broadcasting is and how it works, let's understand how arithmetic operations are performed on NumPy arrays. An arithmetic operation between two arrays is performed element by element. That is, if you add two arrays, A and B, the ith element of A is added to the ith element of B to produce the array C.
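As a minimal sketch of the element-by-element behaviour described above (the array values are illustrative and not taken from the article):

```python
import numpy as np

# Two arrays with the same shape (illustrative values)
A = np.array([1, 2, 3])
B = np.array([10, 20, 30])

# Each element of A is added to the element of B at the same position
C = A + B
print(C)  # [11 22 33]
```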

Everything works fine if both arrays have the same shape. If the arrays have different shapes, a plain element-by-element operation is not possible. But in real-world applications, you will often need to combine arrays of different shapes. So NumPy also provides the ability to do arithmetic operations on arrays with different shapes. That ability is called broadcasting.

The Rules of Broadcasting

Although you can do arithmetic operations on arrays with wide-ranging shapes, there are a few limitations. So, it helps to know the broadcasting rules before we look at a few examples. In general, you can do arithmetic operations between two arrays of different shapes if, for each dimension:

The sizes of the two arrays along that dimension are the same, or
The size of one of the arrays along that dimension is one

And you can use these rules to perform operations even between a ten-dimensional array and a two-dimensional array. The dimensionality of the arrays does not matter.
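The two rules can be sketched as a small pure-Python check. The helper name `shapes_compatible` is hypothetical and not part of NumPy. It assumes shapes are compared from the last dimension backwards and that a missing dimension counts as size one, which is how NumPy lines up shapes of different lengths.

```python
from itertools import zip_longest

def shapes_compatible(shape_a, shape_b):
    """Hypothetical helper: True if two shapes satisfy the broadcasting rules."""
    # Walk both shapes from the trailing dimension backwards; a dimension
    # that is missing from the shorter shape is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(shapes_compatible((3, 4), (4,)))          # True
print(shapes_compatible((2, 3, 4), (2, 3, 3)))  # False
```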

Let’s look at an example to see how this works. Consider two arrays, A and B:

      import numpy as np

      A = np.arange(12).reshape(3,4)
      B = np.arange(4)
      A

Output:

      array([[ 0,  1,  2,  3],
             [ 4,  5,  6,  7],
             [ 8,  9, 10, 11]])

      B

Output:

      array([0, 1, 2, 3])

And their shapes:

      A.shape
      B.shape

Output:

      (3, 4)
      (4,)

Note that array A is a two-dimensional array and array B is a one-dimensional array. When you add these two arrays, NumPy broadcasts the smaller array across the larger array and the operation is successful.

How Broadcasting Works

Broadcasting starts the comparison with the trailing dimension and moves toward the leading dimension.

In the previous example, the sizes of the trailing dimensions match (both are 4), so NumPy proceeds to check the next dimension. Since array B is one-dimensional, it has no leading dimension, so NumPy treats the missing dimension as having size one. After this step, A has shape (3,4) and B is treated as having shape (1,4).

Both dimensions now satisfy the broadcasting rules, so NumPy allows the arithmetic operation between the two arrays. So, if you add them:

      A+B

Output:

      array([[ 0,  2,  4,  6],
             [ 4,  6,  8, 10],
             [ 8, 10, 12, 14]])
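
The padding step described above can be made explicit. This is only a sketch: `np.newaxis` and `np.broadcast_to` are used here to show what the one-dimensional array conceptually looks like at each stage, and the names `B_row` and `B_stretched` are introduced for illustration.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.arange(4)

# Give B an explicit leading dimension of size 1: shape (4,) -> (1, 4)
B_row = B[np.newaxis, :]
print(B_row.shape)  # (1, 4)

# Conceptually, that single row is then repeated to match A's shape
B_stretched = np.broadcast_to(B, A.shape)
print(B_stretched)

# All three forms produce the same sum
same = np.array_equal(A + B, A + B_row) and np.array_equal(A + B, A + B_stretched)
print(same)  # True
```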
    

Broadcasting with Two-dimensional Arrays

Instead of a one-dimensional array, let’s make B a two-dimensional array.

      B = np.arange(3).reshape(3,1)
      B.shape
    

Output:

      (3, 1)

And if you print B:

Output:

      array([[0],
             [1],
             [2]])
    

Arrays A and B have shapes (3,4) and (3,1) respectively. Each pair of dimensions satisfies the broadcasting rules (the leading sizes are both 3, and one of the trailing sizes is one), so the arithmetic operation is possible.

      A+B

Output:

      array([[ 0,  1,  2,  3],
             [ 5,  6,  7,  8],
             [10, 11, 12, 13]])
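
The reshape to (3,1) matters here. As a hedged sketch (the names `flat` and `column` are illustrative), a one-dimensional array of length 3 would be lined up against A's trailing dimension of size 4 and fail, while the column form broadcasts cleanly:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
flat = np.arange(3)            # shape (3,)

try:
    A + flat                   # trailing sizes 4 and 3 do not match
except ValueError as err:
    print("failed:", err)

column = flat[:, np.newaxis]   # shape (3, 1), same data as reshape(3, 1)
result = A + column
print(result.shape)            # (3, 4)
```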
    

Broadcasting with a Three-dimensional Array

I have a three-dimensional NumPy array, A:

      A = np.arange(24).reshape(2,3,4)
      A

Output:

      array([[[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11]],

             [[12, 13, 14, 15],
              [16, 17, 18, 19],
              [20, 21, 22, 23]]])

      A.shape

Output:

      (2, 3, 4)

And a one-dimensional array, B:

      B = np.arange(4)
      B
    

Output:

      array([0, 1, 2, 3])

      B.shape

Output:

      (4,)

Here again, the arithmetic operation is possible because NumPy broadcasts the smaller array B across the larger array A. For the comparison, A keeps its shape (2,3,4) and B is treated as having shape (1,1,4). These shapes follow the conditions for broadcasting, so the arithmetic operation is successful.

      A+B

Output:

      array([[[ 0,  2,  4,  6],
              [ 4,  6,  8, 10],
              [ 8, 10, 12, 14]],

             [[12, 14, 16, 18],
              [16, 18, 20, 22],
              [20, 22, 24, 26]]])
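
One way to convince yourself of what happened is to add B to every length-4 row of A by hand and compare. This is a sketch for checking the result, not how NumPy performs the operation internally; the name `looped` is illustrative.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)
B = np.arange(4)

# Add B to each length-4 row of A explicitly
looped = np.empty_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        looped[i, j] = A[i, j] + B

print(np.array_equal(A + B, looped))                  # True
print(np.array_equal(A + B, A + B.reshape(1, 1, 4)))  # True
```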
    

Broadcasting with Two Three-dimensional Arrays

I have two three-dimensional arrays, A and B. Array A is the same as the earlier one, with shape (2,3,4). Array B is defined as:

      B = np.arange(8).reshape(2,1,4)

And has the shape:

      B.shape

Output:

      (2,1,4)

And has the value:

Output:

      array([[[0, 1, 2, 3]],

             [[4, 5, 6, 7]]])
    

To find out if an arithmetic operation is possible with these two arrays, just revisit the rules. The leading and the trailing dimensions match. And, for the middle dimension, one of the sizes is one. So an arithmetic operation is possible.

      A+B

Output:

      array([[[ 0,  2,  4,  6],
              [ 4,  6,  8, 10],
              [ 8, 10, 12, 14]],

             [[16, 18, 20, 22],
              [20, 22, 24, 26],
              [24, 26, 28, 30]]])
    

But if you try to add two arrays that cannot be broadcast together due to incompatible shapes, you will get an error.

      B = np.arange(18).reshape(2,3,3)
      A+B

Output:

      ValueError: operands could not be broadcast together with shapes (2,3,4) (2,3,3)
    

Here, the issue is with the trailing dimension: the sizes are 4 and 3, which are neither equal nor one.

In general, you can look at the shapes of the input arrays and decide whether broadcasting will allow the operation to be performed. You can also predict the shape of the result: along each dimension, it is the larger of the sizes in the input arrays.

For example, the following array pairs can be used in an arithmetic operation:

(3,2) and (1,) produce an output array of shape (3,2)
(3,1,4,1) and (1,7,4,3) produce an output array of shape (3,7,4,3)

The following array pairs cannot be used in an arithmetic operation due to size mismatch in the trailing dimension:

(3,2) and (1,3)
(3,1,4,2) and (1,7,4,3)
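
The shape pairs listed above can be checked without doing any arithmetic. The sketch below uses `np.broadcast` on empty placeholder arrays to compute the result shape, recording `None` when the shapes are incompatible; the `pairs` and `results` names are illustrative.

```python
import numpy as np

pairs = [
    ((3, 2), (1,)),
    ((3, 1, 4, 1), (1, 7, 4, 3)),
    ((3, 2), (1, 3)),
    ((3, 1, 4, 2), (1, 7, 4, 3)),
]

results = []
for shape_a, shape_b in pairs:
    try:
        # np.broadcast works out the broadcast shape of its arguments
        out = np.broadcast(np.empty(shape_a), np.empty(shape_b)).shape
    except ValueError:
        out = None  # the shapes cannot be broadcast together
    results.append(out)
    print(shape_a, shape_b, "->", out)
```

The first two pairs report shapes (3, 2) and (3, 7, 4, 3); the last two report `None` because their trailing sizes, 2 and 3, are neither equal nor one.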

Conclusion

In the examples, we observed that a smaller array is broadcast (or stretched) along the length of the larger array. However, note that this picture is only conceptual: NumPy does not actually create stretched copies of the smaller array's data. So broadcasting operations are memory efficient.
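
The no-copy behaviour can be observed with `np.broadcast_to`, which returns a read-only view of the original array. This is a sketch for inspection only; it is an assumption that it mirrors what happens inside an arithmetic operation, although the stride of zero shows how a stretched array can exist without duplicated data.

```python
import numpy as np

B = np.arange(4)

# A read-only view of B presented with shape (3, 4)
stretched = np.broadcast_to(B, (3, 4))

print(stretched.shape)                 # (3, 4)
print(stretched.strides[0])            # 0: every row reads the same memory
print(np.shares_memory(B, stretched))  # True
```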

Ravikiran S.

Ravikiran is an independent cloud consultant and author focused on developing solutions in Microsoft Azure. His interests include everything in the cloud space, DevOps and Machine Learning with contributions in domains like Healthcare, Banking and Web Analytics. He is very passionate about the latest and futuristic technologies and constantly updates himself with the current technology trends. He works at the intersection of education and technology. In his spare time, he likes going on long road trips with family and friends.
