Multiplying Matrices and Vectors.

7 min read · May 19, 2022

Linear algebra is defined as:

a branch of mathematics that is concerned with mathematical structures closed under the operations of addition and scalar multiplication and that includes the theory of systems of linear equations, matrices, determinants, vector spaces, and linear transformations

In short, it is a branch of mathematics that deals with vectors, matrices, and linear equations.

What's the point of learning linear algebra, you might ask? A good grasp of linear algebra lets you understand how many machine learning algorithms work under the hood, which in turn helps when you are building or optimizing models.

Scalars, Vectors, Matrices and Tensors

Before jumping into the math, we first need to get an understanding of the basic components we will be working with.

Scalars:

Single numbers, written in lowercase italics such as u.

Vectors:

In its simplest form, a vector is just an array of numbers. The variable name is usually written in lowercase with bold typeface, such as x. We identify each element in the array by its index, so x1 is the first element, x2 the second, and so on.

Furthermore, vectors can be thought of as identifying points in space, with each element giving the coordinate along a different axis. In machine learning, a vector can be thought of as a list of attributes for an object, for example, house = [price, square feet, yard length, room count, bathroom count].

Matrices:

A 2D array of numbers. We usually give matrices uppercase variable names with bold typeface, such as A. A matrix with m rows and n columns has shape m x n. We identify each value within the matrix by its row and column, written in that order and separated by a comma: A(row, column).

Tensor:

A generalization of vectors and matrices to any number of axes, or in other words, a multidimensional array.

In short:

This is of course a broad simplification of these components, and they go much deeper, but when it comes to machine learning this is all you really need to understand. In relation to code, these can all be thought of as nothing more than numbers and (nested) arrays of numbers.
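As a rough sketch, here is how the four components might look in plain Python using nested lists. The variable names and values are made up for illustration, and shape is a hypothetical helper, not a built-in; libraries such as NumPy provide dedicated array types instead.

```python
# Scalar: a single number.
u = 3.5

# Vector: a 1D array of numbers (hypothetical house attributes).
house = [250000.0, 1800.0, 40.0, 4.0, 2.0]

# Matrix: a 2D array, stored here as a list of rows.
A = [
    [1, 2, 3],
    [4, 5, 6],
]

# Tensor: an array with more than two axes (here 2 x 2 x 2).
T = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]


def shape(x):
    """Return the size along each axis of a regular, non-empty nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


print(shape(u))      # ()
print(shape(house))  # (5,)
print(shape(A))      # (2, 3)
print(shape(T))      # (2, 2, 2)
print(A[1][2])       # 6: row 2, column 3 (Python counts from 0)
```

The number of entries in the shape is the number of axes: zero for a scalar, one for a vector, two for a matrix, and three or more for a higher-order tensor.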

Multiplying Matrices and Vectors

Matrix Multiplication:

We write the matrix product by placing two or more matrices next to each other, like this:

C = AB

The matrix product of matrices A and B is a third matrix C. For this to work, A must have as many columns (k) as B has rows (k). Suppose A has shape i x k and B has shape k x j; then C has shape i x j. In other words, the output C has as many rows as A and as many columns as B. Each entry of C is built from one row of A and one column of B, multiplied element by element and summed. For the entry in row r and column c, the actual operation looks like this:

C(r, c) = A(r, 1) B(1, c) + A(r, 2) B(2, c) + ... + A(r, k) B(k, c)

*Don't forget that shapes and indices are ALWAYS written in the order (rows, columns).

Here is an example of the process done on a real set of matrices:

*Taking the first matrix as A and the second as B, notice that we multiply A's rows with B's columns. Notice also that the result has as many rows as A and as many columns as B.
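The same row-times-column process can be sketched in a few lines of plain Python. This is only an illustration: matmul is a hypothetical helper written for this article, and the matrices are made-up values. In practice you would likely reach for a library such as NumPy.

```python
def matmul(A, B):
    """Multiply matrix A (i x k) by matrix B (k x j) and return C (i x j)."""
    k = len(B)  # number of rows in B
    if any(len(row) != k for row in A):
        raise ValueError("A's column count must equal B's row count")
    cols = len(B[0])  # number of columns in B
    # Each entry of C is one row of A times one column of B, summed.
    return [
        [sum(row[t] * B[t][c] for t in range(k)) for c in range(cols)]
        for row in A
    ]


A = [[1, 2, 3],
     [4, 5, 6]]      # shape 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # shape 3 x 2

C = matmul(A, B)     # shape 2 x 2
print(C)             # [[58, 64], [139, 154]]
```

For instance, the top-left entry is 1*7 + 2*9 + 3*11 = 58, which is the first row of A times the first column of B.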

Some other properties of matrix multiplication to be aware of:

Distributive, meaning A(B + C) = AB + AC
Associative, meaning A(BC) = (AB)C
Not commutative, meaning AB ≠ BA in general
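These properties can be spot-checked on small matrices. The sketch below uses hypothetical helpers matmul and matadd and made-up 2 x 2 integer matrices. One example does not prove the first two rules, but a single counterexample is enough to show that the order of multiplication matters.

```python
def matmul(A, B):
    """Matrix product of A (i x k) and B (k x j)."""
    return [
        [sum(row[t] * B[t][c] for t in range(len(B))) for c in range(len(B[0]))]
        for row in A
    ]


def matadd(A, B):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

# Distributive: A(B + C) = AB + AC
print(matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C)))  # True

# Associative: A(BC) = (AB)C
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))  # True

# Not commutative: AB and BA differ for this pair
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```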

Vector Multiplication:

For vector multiplication we use the dot product, also called the scalar product. The dot product is special in that it can be defined in two ways, and each tells you something different about a pair of vectors. We'll cover both definitions in this section, as both are important. It is usually written as a · b, with a dot between the two vectors, and it returns a single scalar, which can be positive, negative, or zero.

The two definitions of the dot product are algebraic and geometric. We'll go over both below.

Algebraic Definition:

Under the algebraic definition, the dot product is an operation that takes two vectors with the same number of elements and returns the sum of the products of their corresponding elements. For example, let's say we have two vectors a and b:

a = [a1, a2, ..., an]
b = [b1, b2, ..., bn]

*Note that the vectors a and b don't have to be rows like in the example above; they can both be columns as well. You can also write the dot product as a matrix product of a row and a column, but make sure it's in the right order: a row times a column gives a scalar, while a column times a row gives a matrix instead.

To get the dot product of the two, you multiply each value in one vector by the corresponding value in the other, and then add the results up. Here is the equation:

a · b = a1 b1 + a2 b2 + ... + an bn

Here is an example:

*This example is written a little differently from the recommended lowercase, bold notation, but the point still gets across: you multiply a's entries by b's corresponding entries and add them up.
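In code, the algebraic definition is a one-liner. The sketch below uses a hypothetical dot helper and two made-up vectors.

```python
def dot(a, b):
    """Algebraic dot product: sum of the products of corresponding elements."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    return sum(x * y for x, y in zip(a, b))


a = [1, 3, -5]
b = [4, -2, -1]

# 1*4 + 3*(-2) + (-5)*(-1) = 4 - 6 + 5
print(dot(a, b))  # 3
```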

Geometric Definition:

Imagine you have a Euclidean space:

In this space, you can picture a vector as an arrow with both a magnitude and a direction. The magnitude of a vector is its length, or size, and is denoted ∥a∥; it is the square root of the sum of the vector's squared components. The direction, on the other hand, is which way the arrow points. Assuming we are still using the same a and b as above, the dot product can be defined like this:

a · b = ∥a∥ ∥b∥ cos θ

Notice that both definitions give the same value for a · b.

Here θ is the angle between a and b, and cos θ is its cosine. So the dot product of two vectors equals the product of their lengths times the cosine of the angle between them. What does this mean? It means you can use it to find the projection of one vector onto another:

*p is the scalar projection of a onto b, that is, the length of a along the direction of b: p = ∥a∥ cos θ = (a · b) / ∥b∥. It's also helpful to think of p as the shadow a casts on b.
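As a small sketch of the projection idea, the snippet below computes the scalar projection as (a · b) / ∥b∥. The helper names dot, norm, and scalar_projection are hypothetical, and the vectors are chosen so the answer is easy to read off.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding elements."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def scalar_projection(a, b):
    """Length of the shadow that a casts on b: (a . b) / ||b||."""
    return dot(a, b) / norm(b)


a = [3, 4]
b = [5, 0]  # points along the x-axis

print(scalar_projection(a, b))  # 3.0
```

Because b lies along the x-axis, the shadow of a on b is just a's x-coordinate, which is 3.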

The cosine can be calculated by rearranging the geometric definition:

cos θ = (a · b) / (∥a∥ ∥b∥)

If cos θ is equal to 0, this implies that a and b are orthogonal, meaning the angle between them is 90°:

a · b = 0

If cos θ is equal to 1, this implies that a and b are codirectional, meaning they point in the same direction:

a · b = ∥a∥ ∥b∥

The above equation implies that the dot product of a vector with itself is its magnitude squared:

a · a = ∥a∥²

This also gives the Euclidean length of the vector:

∥a∥ = √(a · a)
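The special cases above can be checked numerically. This is a minimal sketch with hypothetical helpers (dot, norm, cosine) and made-up vectors. Results involving square roots are floating-point values, so they may differ from the exact answer in the last digits.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding elements."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length: the square root of a vector dotted with itself."""
    return math.sqrt(dot(a, a))


def cosine(a, b):
    """Cosine of the angle between a and b."""
    return dot(a, b) / (norm(a) * norm(b))


# Orthogonal vectors: the dot product, and so the cosine, is 0.
print(dot([1, 0], [0, 2]))     # 0
print(cosine([1, 0], [0, 2]))  # 0.0

# Codirectional vectors: the cosine is (approximately) 1.
print(cosine([1, 2], [2, 4]))

# A vector dotted with itself is its magnitude squared.
print(dot([3, 4], [3, 4]))     # 25
print(norm([3, 4]))            # 5.0
```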

Some other properties of the dot product between two vectors to be aware of:

Both definitions give the same value
Distributive, meaning a · (b + c) = a · b + a · c
Not associative: a · b is a scalar, so (a · b) · c is not a dot product of two vectors and is not defined
Commutative, meaning a · b = b · a
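A quick way to get a feel for these rules is to try them on concrete vectors. The sketch below uses hypothetical helpers dot and add with made-up integer vectors. In this plain-Python version, dotting a scalar with a vector fails with a TypeError, which mirrors the fact that the expression is not defined.

```python
def dot(a, b):
    """Sum of the products of corresponding elements."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    return sum(x * y for x, y in zip(a, b))


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 9]

# Commutative: a . b = b . a
print(dot(a, b) == dot(b, a))                      # True

# Distributive: a . (b + c) = a . b + a . c
print(dot(a, add(b, c)) == dot(a, b) + dot(a, c))  # True

# Not associative: (a . b) is a scalar, so it cannot be dotted with c.
try:
    dot(dot(a, b), c)
except TypeError:
    print("(a . b) . c is not defined")
```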

Summary on Dot Product:

So all in all, the dot product ties together the magnitudes (lengths) of two vectors and the cosine of the angle between them. From it you can recover the length of a vector, the angle between two vectors, and the projection of one vector onto another.

Conclusion:

That is all for this article. I will be writing more on this topic in the future.

Sources:

https://www.deeplearningbook.org/contents/linear_algebra.html

https://www.cuemath.com/algebra/dot-product/

https://en.wikipedia.org/wiki/Dot_product