Linear Algebra for Machine Learning: An Introduction

Talha Quddoos · Published in Artificialis · Nov 21, 2021 · 7 min read

If you’ve started looking behind the scenes of popular machine learning algorithms, you might have come across the term “linear algebra”.

The term may sound scary, but it really isn’t. Many machine learning algorithms rely on linear algebra because it makes it possible to “vectorize” them, making them fast and computationally efficient.

Linear algebra is a vast branch of Mathematics, but you don’t need all of it to understand and build machine learning algorithms, so our focus will be on the basic topics relevant to machine learning.

This article covers the fundamentals of linear algebra required for machine learning, including:

  • Vectors and Matrices
  • Matrix Operations (Multiplication, Addition, and Subtraction)
  • Vector Operations (Addition, Subtraction, and Dot Product)

NumPy implementations for each of the operations are also included at the end of each topic.

Matrices

A matrix is a rectangular array of numbers arranged in rows and columns. In other words, a matrix is a 2-dimensional array of numbers.

The dimensions of a matrix are denoted by m × n, where m is the number of rows and n is the number of columns. A matrix with 3 rows and 3 columns, for example, is a 3 × 3 matrix.

Python, by default, doesn’t come with arrays. Lists are great, but they aren’t very efficient when performing millions of numeric operations. To solve this problem, we use a numerical computing library that supports arrays and fast numeric computations. NumPy is one such library, and it offers several other benefits as well.

This is how we can create a 3x3 matrix using NumPy:
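import numpy as np

# A 3x3 matrix as a 2-D NumPy array (the values here are just illustrative)
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(A.shape)  # (3, 3) -> 3 rows, 3 columns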

Row and Column Matrices

Based on the arrangement of rows and columns, matrices are divided into several types. Two of them are row and column matrices.

A matrix having only one column is called a column matrix while a matrix having only one row is called a row matrix.
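For example, both can be created as 2-D NumPy arrays (the values here are arbitrary):

import numpy as np

row = np.array([[1, 2, 3]])      # a 1x3 row matrix
col = np.array([[1], [2], [3]])  # a 3x1 column matrix

print(row.shape)  # (1, 3)
print(col.shape)  # (3, 1)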

Operations on Matrices

Matrices support several basic arithmetic operations: addition, subtraction, multiplication, and multiplication or division by a scalar.

Addition

Adding two matrices is very simple. The corresponding elements of the matrices are added together to form a new matrix. The resulting matrix also has the same number of rows and columns as the two matrices to be added.

Note that both of the matrices should have the same number of rows and columns, otherwise the result will be undefined.
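In NumPy, matrix addition is just the + operator applied to two arrays of the same shape (the values below are arbitrary):

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A + B)  # element-wise sum:
# [[ 6  8]
#  [10 12]]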

Subtraction

Subtracting two matrices is just as simple as the addition. The corresponding terms of the two matrices are subtracted to form a new result matrix. Just like in addition, both of the matrices should have the same dimensions.
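The NumPy version uses the - operator, again on two arrays of the same shape (arbitrary example values):

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A - B)  # element-wise difference:
# [[-4 -4]
#  [-4 -4]]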

Scalar Multiplication

Multiplying a matrix by a number is called scalar multiplication. This kind of multiplication is very easy, as we just have to multiply each of the elements of the matrix by that number.
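With NumPy, multiplying an array by a plain Python number does exactly this (example values are arbitrary):

import numpy as np

A = np.array([[1, 2],
              [3, 4]])

print(3 * A)  # every element multiplied by 3:
# [[ 3  6]
#  [ 9 12]]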

Scalar Division

Dividing a matrix by a scalar is also very simple. Each of the elements in the matrix is divided by the number to form a new result matrix.
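In NumPy this is the / operator; note that the result becomes a float array (example values are arbitrary):

import numpy as np

A = np.array([[2, 4],
              [6, 8]])

print(A / 2)  # every element divided by 2:
# [[1. 2.]
#  [3. 4.]]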

Matrix-Matrix Multiplication

Multiplying a matrix by another matrix is called “matrix multiplication”, and the result is called the matrix product. It isn’t difficult, but it can be a bit tricky for beginners to understand at first.

Let’s take two 3 × 3 matrices, A and B.

To begin with, we multiply each of the numbers in the first row of A by the corresponding numbers in the first column of B and add the products together. The sum becomes the entry in the first row and first column of the result matrix.

We then repeat this with the first row of matrix A and the second column of matrix B, which gives the entry in the first row and second column.

The same process will be repeated for the third column of matrix B.

Let’s repeat this process for the second and third rows of matrix A.

Using NumPy, we can multiply two matrices with each other by using the numpy.dot() function.
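import numpy as np

# Two 3x3 matrices (the values are just illustrative)
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([[9, 8, 7],
              [6, 5, 4],
              [3, 2, 1]])

C = np.dot(A, B)  # matrix product; A @ B and np.matmul(A, B) give the same result
print(C)
# [[ 30  24  18]
#  [ 84  69  54]
#  [138 114  90]]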

There are a few things worth noting about matrix multiplication:

  • When multiplying two matrices, the number of columns in the first matrix must be the same as the number of rows in the second matrix. In other words, the inner dimensions of the two matrices must be the same.

For example, a matrix A with 2 columns can be multiplied by a matrix B with 2 rows. A matrix C with 2 columns, however, can’t be multiplied by a matrix D with 3 rows.

  • The dimensions of the resulting matrix will be equal to the outer dimensions of the two matrices to be multiplied, i.e. the resulting matrix will have the number of rows of the first matrix and the number of columns of the second matrix.
  • The product of two matrices is NOT commutative: in general, A multiplied by B is not equal to B multiplied by A (see the quick check after this list).
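A quick check with small, arbitrarily chosen matrices illustrates both the shape rule and the lack of commutativity:

import numpy as np

A = np.random.rand(3, 2)   # 3 rows, 2 columns
B = np.random.rand(2, 4)   # 2 rows, 4 columns
print(np.dot(A, B).shape)  # (3, 4) -> the outer dimensions of A and B
# np.dot(B, A) would raise an error: B has 4 columns but A has only 3 rows

C = np.array([[1, 2],
              [3, 4]])
D = np.array([[0, 1],
              [1, 0]])
print(np.dot(C, D))  # [[2 1]
                     #  [4 3]]
print(np.dot(D, C))  # [[3 4]
                     #  [1 2]]  -> CD is not equal to DC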

Transpose of a Matrix

Transposing a matrix means swapping its rows and columns with each other, i.e. changing its rows into columns and columns into rows. The transpose of a matrix is often denoted by a capital T in superscript. For example, Aᵀ will denote the transpose of matrix A.

In NumPy, a matrix (array) can be transposed either using the array’s own .T attribute or using the numpy.transpose() function.
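import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2x3 matrix (illustrative values)

print(A.T)              # transpose via the .T attribute
print(np.transpose(A))  # same result with numpy.transpose()
# Both print the 3x2 matrix:
# [[1 4]
#  [2 5]
#  [3 6]]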

Vectors

A vector is often described as a quantity that has both a magnitude and a direction. However, this definition is very much Physics-based. Let’s define it in our own terms.

A vector can be thought of as an m × 1 matrix, i.e. a column matrix. The number of rows (elements) in a vector gives its dimension. For example, a vector with 3 rows is called a 3-dimensional vector.

Unlike matrices, vectors are created as 1-d arrays instead of 2-d arrays when using NumPy.
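For example (illustrative values):

import numpy as np

v = np.array([1, 2, 3])  # a 3-dimensional vector as a 1-D array
print(v.shape)           # (3,)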

Dot Product of Vectors

One common way to multiply two vectors is the dot product, which produces a single numeric value (a scalar).
The dot product of two vectors is calculated by multiplying their corresponding elements and then summing the results.

This is how the dot product of two vectors can be calculated using NumPy:
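import numpy as np

# Two 3-dimensional vectors (the values are just illustrative)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.dot(a, b))  # 1*4 + 2*5 + 3*6 = 32
print(a @ b)         # the @ operator gives the same result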

The dimensions of the two vectors should be the same when calculating their dot product.

Unlike matrix multiplication, the dot product of two vectors is commutative, meaning that A • B = B • A.

Conclusion

Linear algebra makes machine learning computation very fast through vectorization, and NumPy is one of the libraries used to vectorize machine learning algorithms. This article introduced you to some basics of linear algebra for machine learning, such as matrices, vectors, matrix addition and multiplication, and the dot product of vectors. The purpose of this guide was to familiarize you with the basics of linear algebra so that you don’t feel lost when looking at the Math behind various machine learning algorithms.

Talha Quddoos (Artificialis): An avid machine learning engineer. Writes about neural networks, ML algorithms, and the Math underlying them.