Linear Algebra for Deep Learning

This article aims to give the reader an understanding of the linear algebra needed to start programming and developing machine/deep learning models, and to build some intuition for how they work. Each section corresponds to a single linear algebra operation. I hope this article is easy to read and understand for someone with just a basic high-school level understanding of mathematics.

Dimensions

Take a rectangular grid of numbers (a “matrix of numbers”) with 2 rows and 2 columns. Why would I call it “2 by 2”? Because it has 2 rows and 2 columns: a matrix’s dimensions are always the number of rows followed by the number of columns. So, for example, a matrix with 3 rows and 2 columns would be a “3 by 2”, or just 3x2.

Additionally, matrix dimensions are commonly written using the symbol for the set of real numbers, ℝ. The dimensions of the previous example could be written as ℝ³ˣ², meaning a matrix of real numbers with 3 rows and 2 columns.
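If you want to see this on a computer, NumPy (the Python library used later in this article) reports a matrix’s dimensions through its shape attribute. A quick sketch, with made-up numbers, of a 3x2 matrix:

```python
import numpy as np

# A 3x2 matrix: 3 rows, 2 columns (an element of R^(3x2)).
# The numbers are made up purely for illustration.
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])

print(A.shape)  # (3, 2) -> 3 rows, 2 columns
```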

Indexing

Vectors

A vector is simply a matrix with a single column, so indexing into it takes just one number. For example, if Y is the vector [1, 2, 3], then Y₁=1, Y₂=2, and Y₃=3.

Just a quick note: all the indexing I have written about is what is called “1-indexing”, because the first value in the matrix is referred to with a 1. If you are familiar with programming, then you will most likely be familiar with zero-indexing, where the first value in an array (or matrix) is the 0th element. In mathematics, 1-indexed vectors/matrices are the most common. Another note: by convention, matrices and vectors are often named with capital letters.
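Here is a quick sketch of that same vector in NumPy. Note that the code is zero-indexed even though the math notation above is 1-indexed:

```python
import numpy as np

# The vector Y = [1, 2, 3] from above.
Y = np.array([1, 2, 3])

# Math notation is 1-indexed: Y1 = 1, Y2 = 2, Y3 = 3.
# NumPy (like most programming languages) is 0-indexed:
print(Y[0])  # 1  (Y1 in math notation)
print(Y[1])  # 2  (Y2)
print(Y[2])  # 3  (Y3)
```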

Simple Operations

Adding a scalar (a single number) to a matrix means adding that number to every element, and this works the same way for subtraction, multiplication, and division using scalar values. With element-wise operations, you instead take two matrices of the SAME dimensions and perform one operation on each pair of corresponding elements. For example, adding two matrices element-wise means adding the entries that sit in the same position in each matrix.

Here is when you can’t perform an element-wise operation: when the two matrices do NOT have the same dimensions, some elements have no partner, so the operation is undefined. A 2x2 matrix, for instance, can’t be added element-wise to a 3x2 matrix.
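As a quick sketch of these operations in NumPy (the matrices below are made-up examples), scalar and element-wise operations are written with the ordinary arithmetic operators, and mismatched dimensions raise an error:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Scalar operations apply the number to every element.
print(A + 10)   # [[11 12] [13 14]]
print(A * 2)    # [[ 2  4] [ 6  8]]

# Element-wise operations pair up corresponding elements.
print(A + B)    # [[ 6  8] [10 12]]
print(A * B)    # [[ 5 12] [21 32]]

# Matrices with different dimensions can't be combined element-wise.
C = np.array([[1, 2],
              [3, 4],
              [5, 6]])   # 3x2
# A + C  # raises ValueError: operands could not be broadcast together
```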

Matrix-Vector Multiplication

To perform the dot product, suppose the first row of the matrix is [1, 3] and the vector is [1, 5]. We first take that first row and multiply each of its elements with the corresponding element in the vector: 1·1 = 1 and 3·5 = 15.

Now you add those values up: 1 + 15 = 16. This becomes the first value in the resulting vector.

You now perform these same steps with the rest of the rows in the matrix; each row’s dot product with the vector becomes the next value in the resulting vector.

And that final vector is the answer. It’s not difficult to understand, just tedious to execute, which is why NO ONE does this by hand; we use computers to do it for us. In Python, using the NumPy library, you can perform that entire dot product with a single call: numpy.dot(matrix, vector).

Additionally, because the operations for the dot product are so specific, you cannot perform them on two arbitrarily dimensioned operands; the dimensions have to line up. If the matrix has dimensions ℝᵐˣⁿ (m rows, n columns), then the vector must have dimensions ℝⁿˣ¹ (an n-dimensional vector). The answer will then be a vector with dimensions ℝᵐˣ¹.
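Here is the whole procedure as a NumPy sketch. Only the first row of the matrix, [1, 3], and the vector, [1, 5], come from the example above; the other rows are made-up numbers so the shapes work out:

```python
import numpy as np

# First row [1, 3] and the vector [1, 5] are from the example above;
# the other two rows are made up purely for illustration.
A = np.array([[1, 3],
              [2, 4],
              [0, 6]])   # shape (3, 2) -> m = 3 rows, n = 2 columns
x = np.array([1, 5])     # shape (2,)   -> an n-dimensional vector

result = np.dot(A, x)    # same as A @ x
print(result)            # [16 22 30] -> first entry is 1*1 + 3*5 = 16
print(result.shape)      # (3,)       -> an m-dimensional vector
```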

Matrix-Matrix Multiplication

Matrix-matrix multiplication builds directly on the dot product: the value in row i, column j of the resulting matrix is the dot product of row i of the first matrix and column j of the second. This will make more sense as a step-by-step procedure; take two 2x2 matrices as an example.

To work out the answer for the 1st row and 1st column of the resulting matrix, I would find the dot product of the 1st row of the first matrix and the 1st column of the second matrix.

To work out the answer for the 2nd row and 1st column of the resulting matrix, I would find the dot product of the 2nd row of the first matrix and the 1st column of the second.

We can do the same thing for the 1st row and the 2nd column, and for the 2nd row and 2nd column.

And finally, placing those four values in their corresponding positions gives the resulting 2x2 matrix.
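Here is the same row-by-column procedure as a NumPy sketch, using two made-up 2x2 matrices (the numbers from the original worked example are not reproduced here):

```python
import numpy as np

# Two made-up 2x2 matrices, purely for illustration.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Entry (i, j) of the result is the dot product of
# row i of A and column j of B.
C = np.dot(A, B)   # same as A @ B
print(C)
# [[1*5 + 2*7, 1*6 + 2*8],     [[19 22]
#  [3*5 + 4*7, 3*6 + 4*8]]  =   [43 50]]
```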

The Why

Well, to be honest, if you aren’t doing something related to mathematics or computer science (such as machine learning), I would struggle to give you a good reason why you need to know this. But for machine learning, it is EXTREMELY useful.

When modeling the layers of a neural network on a computer, each layer can be represented by a vector and the weights connecting it to the next layer by a matrix. Then, when it comes time to forward propagate, the next layer of the network is calculated as the dot product of the weight matrix and the previous layer’s vector (typically followed by adding a bias and applying an activation function). There are also many cloud computing services offering machines with hardware specially designed to perform matrix operations quickly, which greatly speeds up the training process for a network.
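As a minimal sketch of that idea in NumPy (the layer sizes, weight values, bias, and choice of a sigmoid activation below are illustrative assumptions, not taken from any particular network):

```python
import numpy as np

def sigmoid(z):
    # A common activation function; squashes values into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example: a layer of 3 neurons feeding a layer of 2 neurons.
previous_layer = np.array([0.5, 0.1, 0.9])   # activations, a 3-dimensional vector
weights = np.array([[0.2, 0.8, -0.5],        # 2x3 weight matrix
                    [1.0, -0.3, 0.4]])
bias = np.array([0.1, -0.2])                 # one bias per neuron in the next layer

# Forward propagation: a matrix-vector dot product, then the activation.
next_layer = sigmoid(np.dot(weights, previous_layer) + bias)
print(next_layer)   # two numbers -- the activations of the next layer
```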

This is all I got for this one! Please feel free to email me with any questions 👍📬
