An overview of the fiddly bits of matrix arithmetic which NumPy protects us from.
If you need to carry out any serious work with matrices in Python then your best option will usually be to use the NumPy package. NumPy provides an industrial strength array class called ndarray (n-dimensional array) which can be used to represent a matrix, and the usual arithmetic operators can be used. Additionally, you can use the @ operator to multiply two matrices together.
However, if you are trying to learn matrix arithmetic then NumPy isn’t going to help much as matrix operations aren’t very intuitive, especially matrix multiplication which is rather convoluted. The best way to learn is probably to start with a sheet of paper and a pen, working through some examples, before moving on to writing code from scratch to perform the various arithmetic functions.
In this post I will give a whistle-stop tour of matrix arithmetic using NumPy before moving on to the main purpose of the post: writing a simple matrix class from scratch.
Before getting into writing code I’ll give a brief overview of the various arithmetic operations that can be applied to matrices.
Matrix Addition and Subtraction
To add or subtract two matrices both must have the same shape (i.e. the same number of rows and columns), and the resulting matrix will also have that shape. Each value is simply the corresponding value in the first matrix plus or minus the corresponding value in the second.
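As a quick illustration of the rule above, here is a minimal sketch using plain lists of rows. The function name add_matrices and its sign parameter are made up for this example and are not part of the project code.

```python
def add_matrices(a, b, sign=1):
    """Return a + b (sign=1) or a - b (sign=-1) for two lists of rows."""
    # Both matrices must have the same number of rows and the same row lengths.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    # Combine corresponding entries one row at a time.
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


a = [[1, 2], [3, 4]]
b = [[10, 20], [30, 40]]
print(add_matrices(a, b))      # [[11, 22], [33, 44]]
print(add_matrices(b, a, -1))  # [[9, 18], [27, 36]]
```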
Scalar Addition and Subtraction
Any matrix can have a scalar (single value) added to or subtracted from it. The result has the same shape as the original, with the scalar added to or subtracted from each value.
Scalar Multiplication
Scalar multiplication works in the same way as addition and subtraction, with each value in the result being the corresponding value in the original matrix multiplied by the scalar.
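Scalar operations all follow the same pattern: apply one operation to every entry and keep the shape. A minimal sketch with plain lists of rows might look like this; scalar_op is a hypothetical helper name used only here.

```python
import operator


def scalar_op(matrix, scalar, op):
    """Apply op(entry, scalar) to every entry, keeping the shape."""
    return [[op(x, scalar) for x in row] for row in matrix]


m = [[1, 2, 3], [4, 5, 6]]
print(scalar_op(m, 10, operator.add))  # [[11, 12, 13], [14, 15, 16]]
print(scalar_op(m, 1, operator.sub))   # [[0, 1, 2], [3, 4, 5]]
print(scalar_op(m, 3, operator.mul))   # [[3, 6, 9], [12, 15, 18]]
```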
Matrix Multiplication
This is where things get tricky. To be able to multiply two matrices together the second must have the same number of rows as the first has columns. The resulting matrix will have the same number of rows as the first, and the same number of columns as the second.
A textual description of matrix multiplication is going to be wordy and confusing so let’s go with a diagram instead:
Even that might be a bit confusing so let’s split it into four stages, each highlighted in turn.
- The value in row 0*, column 0 of the result is the dot product (sum of the products of corresponding values) of row 0 of the first matrix and column 0 of the second.
- The value in row 1, column 0 of the result is the dot product of row 1 of the first matrix and column 0 of the second.
- The value in row 0, column 1 of the result is the dot product of row 0 of the first matrix and column 1 of the second.
- The value in row 1, column 1 of the result is the dot product of row 1 of the first matrix and column 1 of the second.
*Matrix row and column indexes have traditionally started at 1 but when implemented in software we’ll go with zero-based indexes.
The pattern is clear: for a given row and column in the result, calculate the dot product of the same row in the first matrix and the same column in the second.
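To make the pattern concrete, here is a small sketch with plain lists of rows. The values and the function names dot_product and multiply are chosen for illustration and are not taken from the project code. A matrix with 2 rows and 3 columns multiplied by one with 3 rows and 2 columns gives a result with 2 rows and 2 columns, so there are four dot products to calculate.

```python
def dot_product(first, second, row, col):
    """Dot product of one row of `first` with one column of `second`."""
    return sum(first[row][k] * second[k][col] for k in range(len(second)))


def multiply(first, second):
    """Multiply two matrices stored as lists of rows."""
    # The second matrix needs as many rows as the first has columns.
    if len(first[0]) != len(second):
        raise ValueError("columns of first must equal rows of second")
    rows, cols = len(first), len(second[0])
    return [[dot_product(first, second, r, c) for c in range(cols)]
            for r in range(rows)]


a = [[1, 2, 3], [4, 5, 6]]        # 2 rows, 3 columns
b = [[7, 8], [9, 10], [11, 12]]   # 3 rows, 2 columns

# Row 0, column 0: 1*7 + 2*9 + 3*11 = 58
print(dot_product(a, b, 0, 0))  # 58
print(multiply(a, b))           # [[58, 64], [139, 154]]
```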
This project consists of three Python source code files which you can grab from the GitHub repository.
Let’s start with a very brief introduction to using matrices with NumPy.
Firstly we import numpy (you’ll need to install it with pip if you haven’t already) and then within the single main function create three matrices: the arguments are lists of lists, the inner lists being the rows of the matrices. These are then printed just to show their contents.
Then we create four more matrices by performing various arithmetic operations on the original three:
- Matrix addition, adding one matrix to another
- Scalar addition, adding a single value to a matrix
- Scalar multiplication, multiplying a matrix by a single number
- Matrix multiplication, multiplying two matrices
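The steps above could be sketched roughly as follows. This is not the file from the repository: the matrix values and variable names are assumptions chosen to keep the shapes compatible, and only standard NumPy calls (np.array and the +, * and @ operators) are used.

```python
import numpy as np


def main():
    # Three matrices built from lists of lists; each inner list is a row.
    m1 = np.array([[1, 2], [3, 4]])
    m2 = np.array([[5, 6], [7, 8]])
    m3 = np.array([[1, 2, 3], [4, 5, 6]])
    for m in (m1, m2, m3):
        print(m, "\n")

    results = {
        "added": m1 + m2,         # matrix addition (same shapes)
        "scalar_added": m1 + 10,  # scalar addition
        "scaled": m1 * 3,         # scalar multiplication
        "product": m1 @ m3,       # matrix multiplication: (2x2) @ (2x3) -> (2x3)
    }
    for name, value in results.items():
        print(name)
        print(value, "\n")
    return results


results = main()
```

Note that with NumPy arrays the * operator between two arrays multiplies element by element; @ is the one that performs matrix multiplication.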
Let’s now run this code.
The output is:
Here we see seven matrices: the original three followed by the results of the various arithmetic operations. NumPy is both efficient and easy to use but as I mentioned above it is worthwhile coding an implementation of the basic matrix operations from scratch just to get your head round the topic, so let’s do just that.
The code above is a simple implementation of a “toy” matrix class purely for educational purposes. Don’t even think about using it in the real world!
The __init__ method allows us to create a matrix in two ways, either with a list of values (or entries in matrix jargon), or with row and column counts. If you use the former, the row count and columncount properties are set from the supplied list, and if you use the latter you’ll get a matrix with all entries initialised to 0.
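A constructor along these lines could look something like the sketch below. It is a stand-in, not the class from the repository: the class name ToyMatrix, the keyword arguments and the rowcount and entries attribute names are assumptions; only columncount is named in this post.

```python
class ToyMatrix:
    """Hypothetical stand-in showing two ways to construct a matrix."""

    def __init__(self, entries=None, rowcount=0, columncount=0):
        if entries is not None:
            # Built from a list of rows: take the shape from the data.
            self.entries = [list(row) for row in entries]
            self.rowcount = len(self.entries)
            self.columncount = len(self.entries[0]) if self.entries else 0
        else:
            # Built from a shape: every entry starts at 0.
            self.rowcount = rowcount
            self.columncount = columncount
            self.entries = [[0] * columncount for _ in range(rowcount)]


from_list = ToyMatrix(entries=[[1, 2, 3], [4, 5, 6]])
zeros = ToyMatrix(rowcount=2, columncount=2)
print(from_list.rowcount, from_list.columncount)  # 2 3
print(zeros.entries)                              # [[0, 0], [0, 0]]
```

Building the zero matrix with a list comprehension rather than [[0] * columncount] * rowcount matters: the latter would make every row the same list object, so changing one entry would change a whole column.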
__str__ has only one point of (minor) interest — it uses a few Unicode character values to print out segments of square brackets, as you’ll see in a moment.
Next we have a few special methods (aka “magic” or “dunder” methods). Python maps these to the appropriate operator symbol, and the ones I’ll use are:
- __add__ maps to + (matrix addition)
- __sub__ maps to - (matrix subtraction)
- __mul__ maps to * (scalar multiplication)
- __matmul__ maps to @ (matrix multiplication)
I haven’t implemented scalar addition or subtraction but you could do this if you wish. You’ll need to check the type of the other argument, and if it is a numeric value carry out scalar operations.
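If you want to try this, one possible approach is sketched below, assuming a cut-down class called ToyMatrix (a hypothetical name, not the class from the repository). It uses numbers.Number from the standard library to recognise numeric values and returns NotImplemented for anything else, so Python raises a TypeError.

```python
from numbers import Number


class ToyMatrix:
    """Hypothetical cut-down matrix class supporting + with a scalar or a matrix."""

    def __init__(self, entries):
        self.entries = [list(row) for row in entries]

    def __add__(self, other):
        if isinstance(other, Number):
            # Scalar addition: add the value to every entry.
            return ToyMatrix([[x + other for x in row] for row in self.entries])
        if isinstance(other, ToyMatrix):
            # Matrix addition: shapes must match.
            if [len(r) for r in self.entries] != [len(r) for r in other.entries]:
                raise ValueError("matrices must have the same shape")
            return ToyMatrix([[x + y for x, y in zip(r1, r2)]
                              for r1, r2 in zip(self.entries, other.entries)])
        # Unsupported operand type: let Python raise TypeError.
        return NotImplemented


m = ToyMatrix([[1, 2], [3, 4]])
print((m + 5).entries)  # [[6, 7], [8, 9]]
print((m + m).entries)  # [[2, 4], [6, 8]]
```

As written only matrix + scalar works. Supporting scalar + matrix would also need __radd__, and subtraction could follow the same pattern in __sub__.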
__add__ & __sub__
Firstly we need to check the two matrices are addable/subtractable. If so we create a new matrix of the same shape before using nested loops to calculate the entries in the result.
__mul__
This is for scalar multiplication, so we create a new matrix with the same shape as the original and then calculate the values in nested loops.
__matmul__
Again we use the same approach: create a matrix for the result and then populate it within nested loops. The extra complexity in matrix multiplication is farmed out to the __dot_product function, which I’ll describe in a moment.
addable, subtractable and multipliable
These methods simply check whether the various operations can be carried out between the two specific matrices.
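Going by the shape rules described earlier, these checks might be sketched as follows. The method names come from this post, but the class name ToyMatrix, the signatures and the bodies are assumptions rather than the repository code.

```python
class ToyMatrix:
    """Hypothetical cut-down matrix class with only the shape checks."""

    def __init__(self, entries):
        self.entries = [list(row) for row in entries]
        self.rowcount = len(self.entries)
        self.columncount = len(self.entries[0])

    def addable(self, other):
        # Addition needs identical shapes.
        return (self.rowcount == other.rowcount
                and self.columncount == other.columncount)

    def subtractable(self, other):
        # Subtraction has the same requirement as addition.
        return self.addable(other)

    def multipliable(self, other):
        # self @ other needs as many rows in other as columns in self.
        return self.columncount == other.rowcount


a = ToyMatrix([[1, 2, 3], [4, 5, 6]])       # 2 x 3
b = ToyMatrix([[1, 2], [3, 4], [5, 6]])     # 3 x 2
print(a.addable(b), a.multipliable(b))      # False True
```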
__dot_product
This method takes two matrices (if you include self), and the indexes of the first’s row and the second’s column. It then calculates the dot product of these by initialising the result to 0 and then adding the products of the corresponding entries within a loop.
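Putting that description into code, a sketch of the dot product helper and the @ operator that relies on it could look like this. The class name ToyMatrix, the attribute names and the argument order are assumptions; the repository version may differ.

```python
class ToyMatrix:
    """Hypothetical cut-down matrix class supporting only @."""

    def __init__(self, entries):
        self.entries = [list(row) for row in entries]
        self.rowcount = len(self.entries)
        self.columncount = len(self.entries[0])

    def __dot_product(self, other, row, column):
        # Start at 0 and add the products of corresponding entries.
        result = 0
        for k in range(self.columncount):
            result += self.entries[row][k] * other.entries[k][column]
        return result

    def __matmul__(self, other):
        if self.columncount != other.rowcount:
            raise ValueError("matrices cannot be multiplied")
        # The result has self's row count and other's column count.
        result = ToyMatrix([[0] * other.columncount
                            for _ in range(self.rowcount)])
        for r in range(self.rowcount):
            for c in range(other.columncount):
                result.entries[r][c] = self.__dot_product(other, r, c)
        return result


p = ToyMatrix([[1, 2], [3, 4]]) @ ToyMatrix([[5, 6], [7, 8]])
print(p.entries)  # [[19, 22], [43, 50]]
```

Because the helper name starts with two underscores and does not end with them, Python mangles it to _ToyMatrix__dot_product, which discourages calling it from outside the class.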
With the Matrix class completed, we now need to try it out.
The main.py file contains four functions to create Matrix objects, call the various methods and print the results. The calls to the first two in main are uncommented so let’s run them.
The output is:
In each case two of the matrices are addable or subtractable, and the others are not. We then see the original matrices and the results of the addition and subtraction. Here you get to see the various Unicode square bracket symbols.
Comment out the first two function calls in main, uncomment scalar_multiplication() and run the program again.
As you can see each of the values has been multiplied by the scalar 3.
Finally, uncomment the matrix multiplication function call in main and run again.
The matrices being multiplied are those shown in the example near the top of the page, so you can follow the multiplication process through if you wish.