

Numerically Python

Broadcasting With NumPy

09/27/2000

Introduction

This month's challenge is to look at one of the most obvious but often confusing aspects of NumPy operations: broadcasting. When you perform operations between multiarrays with differing numbers of dimensions, one or both of the arrays may be broadcast so that their dimensions match. In its simplest form, this feature is very straightforward, but its use in complex expressions can be difficult to master. This month we'll take a look at this fundamental capability of NumPy and how to put it to use.

NumPy does not inherently support the mathematics of linear algebra; that traditional functionality was left to library functions. Instead, NumPy's implementers focused on the capabilities of N-dimensional objects (the multiarray), and broadcasting provides a key part of that support. This combined approach led to a very powerful and useful package that satisfies a large development community, but it also led to some confusing clashes in terminology.

Rank

In the NumPy documentation, the word "rank" refers to the number of dimensions of the multiarray. It has nothing to do with the condition of the object itself (e.g., the number of linearly independent rows / columns, etc). A column vector is rank one, a matrix is always rank two, and an N-dimensioned multiarray is always rank N. In NumPy, rank refers to the shape of the multiarray. This will be confusing for any serious math types reading along. In traditional linear algebra, rank refers to related numbers within a matrix column or row.

Also in the NumPy documentation, the word "axis" refers to a dimension. Thus a rank 5 multiarray has five dimensions that can be referred to as the first through fifth axes. The length of an axis is the same as the length of a dimension.

These definitions can be confusing, especially when many of the terms are overloaded uses of traditional mathematical notions. Stick with it, however; soon you will be slinging the slang of NumPy with the best of them.

Broadcasting

Broadcasting refers to lower-ranked multiarrays being replicated or extended in a natural way to work with higher-ranked multiarrays. By default, all binary operations, both mathematical (addition, subtraction, multiplication, and division) and logical (and, or, and xor), work element by element between two multiarrays. The first thing a new NumPy user learns is that for binary operations to complete, the number and size of all dimensions must match. Well, that is not the whole truth. There are in fact several cases for which this is not true -- all are instances of broadcasting.

Before we dig into some examples, remember that you need to import the Numeric (NumPy) package to follow along. All examples can be run in the interactive Python command window once NumPy is installed. After starting up Python, execute the following command (this only needs to be done once per session).

>>> from Numeric import * 

If you don't get any errors, you are all set to proceed. If an exception was raised, it probably means NumPy was not installed correctly; see the NumPy documentation for help.

Multiarray and scalar operations

For our first look at broadcasting, let's try to add a scalar to a multiarray. Given a 2-by-2 array a and a scalar b, the result is shown below.

>>> a = array([[1,2],[3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> b=1
>>> a+b
array([[2, 3],
       [4, 5]])

The result is as expected. But wait: the ranks of the two operands did not agree! The scalar was broadcast (replicated) in a natural way to allow the operation to complete. Essentially, the scalar was turned into a 2-by-2 multiarray in which each element had the value of b; then the operation continued. The operator in this example was addition; since all operators exhibit the same element-by-element behavior, one is as good as another for demonstration purposes. Go ahead and try the other operators to convince yourself. What would happen if you multiplied a by b rather than adding them?
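
If you want to check your guess, here is a quick sketch (reusing the a and b defined above). Since b is 1, multiplying by it is not very exciting, so scaling by 2 is shown as well:

>>> a*b
array([[1, 2],
       [3, 4]])
>>> a*2
array([[2, 4],
       [6, 8]])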

Multiarray and multiarray

Scalars are simple enough. Let's move on to more complex operations where we have two multiarrays.

In the next example the two multiarrays have equal rank, but the lengths of their dimensions do not agree.

>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> b
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> a+b
Traceback (innermost last):
  File "<stdin>", line 1, in ?
ValueError: frames are not aligned

We get the dreaded "frames not aligned" error. This was caused by a being a 2-by-3 multiarray and b being a 3-by-2 multiarray.
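
One way to make the frames line up (just a sketch, not part of the broadcasting rules) is to transpose b so that both operands are 2-by-3; Numeric's transpose() function does the trick:

>>> a+transpose(b)
array([[ 2,  5,  8],
       [ 6,  9, 12]])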

Let's try another multiarray example.

>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> c
array([7, 8, 9])
>>> a+c
array([[ 8, 10, 12],
       [11, 13, 15]])

OK, so why did this one work when the shapes did not match? This is another example of broadcasting. The a array is of rank two and c is of rank one. In this case the size of each individual dimension (or axis) is compared, and under certain conditions the lower-rank array is replicated. To gain insight, let's look at the shape tuple for both a and c. Remember that the .shape attribute returns the tuple defining the rank (the length of the tuple) and the length of each axis (the values in the tuple).

>>> a.shape
(2, 3)
>>> c.shape
(3,)

When two multiarrays of differing rank are operated on, the length of each axis is compared starting from the right-most. Since the right-most axis of both a and c has length three, we are in business. When c is broadcast, it is essentially replicated as many times as necessary to match the leading dimension of a, in this case two. This methodology also holds for objects of higher rank.
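
To convince yourself that broadcasting is just implicit replication, you can build the replicated version of c by hand (a quick sketch using the same a and c as above) and verify that the sums agree:

>>> a+array([c, c])
array([[ 8, 10, 12],
       [11, 13, 15]])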

In the next and following examples, we will use the zeros() function to create multiarrays of a given shape. Yes, you guessed it, filled with zeros! For the purpose of our discussions, we are not particular about the values in the multiarray -- rather, we are interested in the shape of the object. The zeros() function provides a simple, convenient way for generating multiarrays of the desired shape.

>>> x=zeros((3,4,5,6,7))
>>> y=zeros((7,))
>>> z=x+y

In this example we created a rank five multiarray x and a rank one multiarray y. When the addition operator tried to add these together, it needed to broadcast the y multiarray across the x multiarray, essentially replicating it as many times as necessary. In this case the four leading axes were "created" for the y multiarray.
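
A quick check of the shapes (using the x, y, and z just defined) confirms that the result picked up the four leading axes from x:

>>> y.shape
(7,)
>>> z.shape
(3, 4, 5, 6, 7)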

Just to throw in a twist, there is an exception to the above discussion. When comparing the sizes of corresponding axes, if either of the compared axes has a length of one, broadcasting can also occur. As an example:

>>> z
array([1, 2])
>>> z.shape
(2,)
>>> v
array([[3],
       [4],
       [5]])
>>> v.shape
(3, 1)
>>> z+v
array([[4, 5],
       [5, 6],
       [6, 7]])

In this form, the first multiarray z was extended to a (3,2) multiarray and the second multiarray v was also extended to a (3,2) multiarray. Essentially, broadcasting occurred on both operands! This happens only when one of the compared axes has a length of one.
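
Here, too, you can mimic the broadcast by hand to see what each operand was extended to (a sketch reusing z and v from above; concatenate() stacks copies of v along its second axis):

>>> array([z, z, z]) + concatenate((v, v), 1)
array([[4, 5],
       [5, 6],
       [6, 7]])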

NewAxis

Let's take a look at a similar example. Create two multiarrays as follows:

>>> z
array([1, 2])
>>> z.shape
(2,)
>>> w
array([3, 4, 5])
>>> w.shape
(3,)

By all the rules we know so far, if we were to add these two objects, an exception would be raised. To accomplish the operation and get broadcasting to kick in, the w multiarray needs to be altered. We can change the shape of w with the reshape function to generate the v array.

>>> v=reshape(w,(3,1))
>>> v
array([[3],
       [4],
       [5]])
>>> v.shape
(3, 1)

Now the operation can proceed with the aid of broadcasting.
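
With v in hand, the addition broadcasts just as in the earlier example:

>>> z+v
array([[4, 5],
       [5, 6],
       [6, 7]])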

To support operations of this type without relying on the reshape function, which can clutter up the code, NumPy's implementers invented the NewAxis index.

NewAxis is a pseudo-index that allows the temporary addition of an axis into a multiarray. If you were to try the previous example as a single line of code, you might perform the following:

>>> z = array([1,2])
>>> w = array([3,4,5])
>>> z+reshape(w,(3,1))
array([[4, 5],
       [5, 6],
       [6, 7]])

Using NewAxis, the reshape function need not be called and the code is a bit more streamlined.

>>> z+w[:,NewAxis]
array([[4, 5],
       [5, 6],
       [6, 7]])

These two examples do the same thing but differ in implementation. The first reshapes the array to the desired layout; the second uses the slicing operator, together with NewAxis, to produce an array with the extra axis. For more examples of slicing, refer to last month's article.
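
Where you put NewAxis in the slice determines where the temporary axis goes, which is easy to check with the shape attribute (a quick sketch using the w defined above):

>>> w[:,NewAxis].shape
(3, 1)
>>> w[NewAxis,:].shape
(1, 3)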

Here is a more interesting example of slicing with NewAxis:

>>> a=zeros((3,4,5,6))
>>> b=zeros((4,6))
>>> c=a+b[:,NewAxis,:]

In this case we inserted a temporary axis (supporting broadcasting) between the first and second axes of b. Note that since the resulting b multiarray was only of rank three and a was of rank four, broadcasting also occurred at the left-most axis!
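
Checking the shapes (of the a, b, and c just defined) shows both effects at once: the inserted axis of length one and the missing left-most axis:

>>> b[:,NewAxis,:].shape
(4, 1, 6)
>>> a.shape
(3, 4, 5, 6)
>>> c.shape
(3, 4, 5, 6)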

Do you see the pattern in the examples? Broadcasting happens when the shapes of the two multiarrays in question are not equal, whether because their ranks differ or because some axis has length one. When this happens, a set of rules comes into play whereby corresponding axes are compared for length and adjustments are made. A missing axis can be filled in to make the operation work, and an axis of length one can be stretched to match its counterpart.

So what good is broadcasting?

The use of broadcasting is so inherent in NumPy operations that living without it would be hard. Consider the operation of adding 1 to each value in a multiarray, or scaling all the values by 2. Both of these operations rely on broadcasting to succeed. It is hard to imagine accomplishing them as cleanly without broadcasting.

Consider using broadcasting to create a rank three multiarray with each element set to 5:

>>> a = ones((2,3,4)) * 5

Without broadcasting this becomes UGLY!

>>> a = ones((2,3,4))
>>> tmp = a.shape
>>> for i in range(tmp[0]):
...     for j in range(tmp[1]):
...             for k in range(tmp[2]):
...                     a[i,j,k] = a[i,j,k] * 5

In more complex cases, having to multiply each row of a matrix by a vector without broadcasting would require that the vector be replicated first into a matrix; then the operation could be performed. With broadcasting, the replication stage can be avoided. In cases where large multiarrays are in play, this can be a significant memory- and time-saving feature.
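
As a sketch of that idea, here is a matrix whose rows are each scaled, element by element, by a vector of per-column weights (the names m and weights are just for illustration); no explicit replication of the vector is needed:

>>> m = array([[1, 2, 3], [4, 5, 6]])
>>> weights = array([10, 0, 1])
>>> m*weights
array([[10,  0,  3],
       [40,  0,  6]])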

Bottom line, broadcasting allows the programmer to avoid the step of creating the intermediate and dimension-matched arrays. But be careful with your knowledge; the next time you use a complex form of broadcasting, help yourself and those who follow by giving them an insight into your thinking and the operation at hand -- yes, comments are helpful!

End game

This month we looked at examples of NumPy's broadcasting support. The broadcasting rules provide ways for multiarrays to interact when their shapes are not equal. The rules can be a bit confusing, but exploiting them will help you make full use of the capabilities of NumPy.

Next month we will put together a larger scale application, bringing into play many of the features we have learned in this series of articles. I'll show how you can integrate NumPy with the multimedia capabilities of your computer to generate a hands-on application that employs NumPy (including broadcasting ;) ), the FFT module, and the DISLIN plotting package! See you next month!

Eric Hagemann specializes in fast algorithms for crunching numbers on all varieties of computers from embedded to mainframe.

