An Introduction To Linear Algebra

Published in Analytics Vidhya · 6 min read · Jan 6, 2021

What is Linear Algebra

Linear Algebra is a branch of mathematics that lets you concisely describe coordinates and interactions of planes in higher dimensions and perform operations on them.

Think of it as an extension of algebra (dealing with unknowns) into an arbitrary number of dimensions. Linear Algebra is about working on linear systems of equations (linear regression is an example: y = Ax). Rather than working with scalars, we start working with matrices and vectors (vectors are really just a special type of matrix).

Coordinates

x = horizontal
y = vertical

Vectors

In computer science, a vector is often represented as an array of values. A two-dimensional vector with an x of 2 and a y of 1 would look like this: [2,1].

Vector Addition

Let’s say that we have a vector of [2,1] and a second vector of [4,-3]. To add these two vectors, we add the values that correspond with one another (i.e. add x and x, add y and y). In this case, our addition results in [6,-2]. To visualise this on a graph, we can plot the first vector, [2,1], from the origin (usually [0,0]), then plot the second vector, [4,-3], as a continuation starting from the tip of [2,1]. We end up at the same point as if we had plotted the final value of [6,-2] directly, only now we can see how the graph progresses through each step.
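The addition above can be sketched in a few lines of Python. This is only an illustration, assuming vectors are stored as plain lists; the function name `add_vectors` is made up for this example.

```python
def add_vectors(a, b):
    """Add two vectors component by component (x with x, y with y)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return [x + y for x, y in zip(a, b)]


v = [2, 1]
w = [4, -3]
total = add_vectors(v, w)
print(total)  # [6, -2]
```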

Scalars

Multiplying every value of a vector by the same number is known as scaling. The number that we multiply by is called a scalar.

Scalar Multiplication

Here are some examples:

v = [2,1]

Our vector has an x coordinate of 2, and a y coordinate of 1.

2v = [4,2]

Here we multiply each value in [2,1] by 2, which gives us [4,2] when plotted on a graph.

-1.8v = [-3.6, -1.8]

Here we take [2,1], flip it so that it points in the opposite direction from the origin, and then scale it by 1.8. This gives us exactly [-3.6, -1.8]. To simplify how this operates: when multiplying by a negative value, we can switch the sign of each value in our vector (positive to negative, or negative to positive) and then treat the scalar as if it were positive.

(1/3)v ≈ [0.67, 0.33]

Here we take [2,1] and reduce it to one third of its values, so we would plot approximately [0.67, 0.33] on our graph.
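The three scalar multiplication examples can be reproduced with a short Python sketch. It assumes vectors are plain lists, and the helper name `scale` is chosen just for this illustration.

```python
def scale(k, v):
    """Multiply every component of vector v by the scalar k."""
    return [k * x for x in v]


v = [2, 1]
doubled = scale(2, v)     # twice as long, same direction
flipped = scale(-1.8, v)  # opposite direction, 1.8 times as long
third = scale(1 / 3, v)   # one third as long, same direction

print(doubled)
print(flipped)
print([round(x, 2) for x in third])
```

Note that the last result is rounded only for display; the stored values are the closest floating-point numbers to 2/3 and 1/3.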

The XY Coordinate System

With vectors, we can think of each value in the vector as a scalar that operates on the xy coordinate system. In the xy coordinate system, there are two very special vectors: the one that points to the right of the origin, which is ‘i’ (written [1,0]), and the one that points vertically up from the origin, which is ‘j’ (written [0,1]). Both have a length of 1. Together they are what we refer to as the ‘basis’ of the coordinate system.

So now we can look at our two vector values of [2,1] and consider each of those to be a scalar that stretches i and j along their axes. So now we have 2i and 1j. We can then take these two scaled vectors and add them together. This would look like (2)i + (1)j.

Any time we scale two vectors and add them together it is called a ‘linear combination’.

One thing to bear in mind is that we could use different basis vectors if we wanted to. If our basis vectors each had a length of 2, rather than the length of 1 that ‘i’ and ‘j’ have, our original vector of [2,1] would no longer plot at the same place on our graph. It would end up at [4,2] instead.
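As a rough Python sketch of the idea that a vector is a linear combination of basis vectors, the snippet below builds [2,1] from i and j and then repeats the calculation with a basis of length 2. The names `linear_combination`, `wide_i` and `wide_j` are hypothetical and exist only for this example.

```python
def linear_combination(a, u, b, w):
    """Return a*u + b*w for two-dimensional vectors u and w."""
    return [a * u[0] + b * w[0], a * u[1] + b * w[1]]


# Standard basis: i points right, j points up, each with length 1.
i_hat = [1, 0]
j_hat = [0, 1]
standard = linear_combination(2, i_hat, 1, j_hat)
print(standard)  # [2, 1]

# Hypothetical alternative basis whose vectors have length 2.
wide_i = [2, 0]
wide_j = [0, 2]
stretched = linear_combination(2, wide_i, 1, wide_j)
print(stretched)  # [4, 2]
```

The scalars 2 and 1 are the same in both cases; only the basis vectors they scale have changed.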

Linear Transformations and Their Relation to Matrices

‘Transformation’ is essentially another word for ‘function’: something that takes an input and returns an output. In this case, a transformation takes in a vector and returns another vector.

The word ‘transformation’ is used because it helps to signify movement. So it is like watching the input vector move from its position over to its new position (the output vector).

Visually speaking, a transformation is linear if it has two properties: 1. all lines must remain lines; 2. the origin must remain fixed in place. So if a line curves, it’s not a linear transformation.

Remember that the values of a vector can be used to scale i and j, for example: v = 2i + 1j. When we carry out a linear transformation, the gridlines remain parallel and evenly spaced, so the place where v lands is still 2 times wherever i lands plus 1 times wherever j lands.

In other words, when we transform our vector (which means i and j are also transformed), we still get the same linear combination, only now of the transformed i and j. This means that we can deduce where v must go based only on where i and j land.
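To make this concrete, here is a small Python sketch that works out where a vector lands using only the landing spots of i and j. The function name `transform` and the landing spots [1,-2] and [3,0] are invented for illustration; any other pair would work the same way.

```python
def transform(v, i_landed, j_landed):
    """Return where v lands, given where the basis vectors i and j land."""
    x, y = v
    # Same linear combination as before, applied to the transformed basis.
    return [x * i_landed[0] + y * j_landed[0],
            x * i_landed[1] + y * j_landed[1]]


v = [2, 1]

# If i and j do not move, v does not move either.
unchanged = transform(v, [1, 0], [0, 1])
print(unchanged)  # [2, 1]

# Hypothetical transformation: i lands on [1, -2] and j lands on [3, 0].
landed = transform(v, [1, -2], [3, 0])
print(landed)  # [5, -4]
```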

Visualising Linear Transformations

One way to visualise this is to picture a vector drawn on a grid. If we rotate the whole grid, the vector is carried along with it to a new position. The recipe for reaching the vector stays the same (so many steps of i plus so many steps of j), even though the actual values of i, j and v are now different.

Bear in mind that we don’t have to transform by rotating the axes. We could also stretch i and j, so that, for example, i is now twice as long as it was before, while j ends up wherever the transformation takes it.

So if we had i and j, then rotated our grid 90 degrees counterclockwise, i would move from [1,0] to [0,1] and j would move from [0,1] to [-1,0]. We can record where i and j land as the two columns of a 2x2 matrix: the first column is where i lands and the second column is where j lands. It would look like this (imagining that the square brackets are one tall pair spanning both rows rather than two small pairs stacked on top of each other):
[0 -1]
[1  0]

Every time you see a matrix, you can consider it to be a linear transformation in space.
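The 90 degree rotation matrix can be tried out with a short Python sketch. It assumes a 2x2 matrix is stored as a list of two rows, and the helper name `apply` is chosen only for this example.

```python
def apply(matrix, v):
    """Apply a 2x2 matrix (given as two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = v
    return [a * x + b * y, c * x + d * y]


# Columns are where i and j land after a 90 degree counterclockwise turn.
rotate_90 = [[0, -1],
             [1, 0]]

print(apply(rotate_90, [1, 0]))  # i lands on [0, 1]
print(apply(rotate_90, [0, 1]))  # j lands on [-1, 0]
print(apply(rotate_90, [2, 1]))  # v = 2i + 1j lands on [-1, 2]
```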

Note that a linear transformation does not have to preserve area. Rotations, which turn the grid, and shears, which slant it, both leave the area of each grid square unchanged, which is why diagrams of them appear to take up the same amount of space. Stretching i to twice its length, as described above, would double the area of each grid square and is still a linear transformation.

So a rotation might turn the grid by 90 degrees, while a shear might slant a rectangular grid square into a parallelogram. Applying one after the other, such as a rotation followed by a shear, produces a new transformation called a ‘composition’.

Matrix Multiplication

Matrix multiplication represents applying one transformation after another. The order of the transformations matters: applying a rotation and then a shear generally gives a different result from applying the shear and then the rotation.
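A small Python sketch shows the effect of order. It applies a 90 degree rotation and a shear to the vector [2,1] in both orders. The shear matrix used here is an assumed example (i stays at [1,0] and j lands on [1,1]), and `apply` is a helper written for this illustration.

```python
def apply(matrix, v):
    """Apply a 2x2 matrix (given as two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = v
    return [a * x + b * y, c * x + d * y]


rotate_90 = [[0, -1],
             [1, 0]]
shear = [[1, 1],
         [0, 1]]  # assumed example: i stays put, j lands on [1, 1]

v = [2, 1]
rotate_then_shear = apply(shear, apply(rotate_90, v))
shear_then_rotate = apply(rotate_90, apply(shear, v))

print(rotate_then_shear)  # [1, 2]
print(shear_then_rotate)  # [-1, 3]
```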

Function Notation

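Since we write functions to the left of their inputs, as in f(g(x)), a composition of two functions is read from right to left: the function nearest the input is applied first. The same holds for matrices, so in the product below the right-hand matrix is the transformation that is applied first.
Imagine that the brackets span the entire height rather than being stacked on top of one another:
[a b] [e f] = [ae+bg af+bh]
[c d] [g h]   [ce+dg cf+dh]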
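The 2x2 multiplication rule can be written out directly in Python. This is a sketch under the assumption that matrices are stored as lists of two rows; `matmul` and `apply` are names chosen for this example, and the shear matrix is an assumed illustration rather than anything from a library.

```python
def matmul(left, right):
    """Multiply two 2x2 matrices, each given as a list of two rows."""
    (a, b), (c, d) = left
    (e, f), (g, h) = right
    return [[a * e + b * g, a * f + b * h],
            [c * e + d * g, c * f + d * h]]


def apply(matrix, v):
    """Apply a 2x2 matrix (given as two rows) to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = v
    return [a * x + b * y, c * x + d * y]


rotate_90 = [[0, -1],
             [1, 0]]
shear = [[1, 1],
         [0, 1]]

# Read right to left: rotate first, then shear.
rotate_then_shear = matmul(shear, rotate_90)
print(rotate_then_shear)  # [[1, -1], [1, 0]]

# The single combined matrix matches applying the two steps one by one.
v = [2, 1]
print(apply(rotate_then_shear, v))        # [1, 2]
print(apply(shear, apply(rotate_90, v)))  # [1, 2]
```

Swapping the arguments, `matmul(rotate_90, shear)`, gives a different matrix, which is the shear-then-rotate composition.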