Dot-Product — Algebraic, Geometric and Linear Algebraic intuition and how to apply this to solve real world problems.

Leena Bora
13 min read · Jul 22, 2020

--

For now, unlearn everything you know about the dot-product. There is a high chance that we know only bits and pieces of it from different perspectives (Algebraic, Linear Algebraic and Geometric). Because of this, we fail to see the bigger picture, and the dot-product feels like a very abstract concept.

So please pretend you’re hearing the word dot-product for the first time. Wait, I will explain what I mean by unlearning “what you know about dot-product”.

Good! To answer the above questions, did you have to think about any specific domain? geometry (like angle, coordinates etc)? physics?

Probably not! Operations like addition and linear equations are so fundamental that they apply to every domain and every Math stream.
So we just think of them as operations on numbers without tying them to any domain and any math stream.

(GK Break : We (non mathematicians) just know very few Math streams like, Algebra, Geometry, Calculus, but there are around 60–70 Maths streams. 😲)

Cool. So when you see the operation below, think of it as nothing more than an operation on numbers (that is, don’t think about machine learning, geometry or linear algebra).
That is what I meant by unlearning.

Below are 2 lists of numbers, and we are performing an operation on them:

So basically, we are taking the element-wise product of the 2 lists and adding up the results.
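A minimal sketch of that operation in Python. The two lists here are made-up values for illustration (not the ones from the original figure), and `dot` is just a name chosen for this sketch.

```python
def dot(a, b):
    """Multiply two equal-length lists element by element and add up the products."""
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    return sum(x * y for x, y in zip(a, b))


# Hypothetical lists, chosen only for illustration.
first = [1, 2, 3]
second = [4, 5, 6]

print(dot(first, second))  # 1*4 + 2*5 + 3*6 = 32
```

Nothing here knows about geometry or linear algebra. It is only multiplication and addition on numbers.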

(Again if you’re tempted to relate this to linear algebra and geometry, please stop! As of now, it has nothing to do with that. It is just some operation on a list of numbers.)

The mathematicians who gave names to operations like Division and Linear Equations named this operation the “dot-product”.

This is what dot-product is.

Now you might ask: wait, whenever we read about the dot-product, it is explained in terms of projection of vectors, or cosine(angle) and magnitude of vectors… What is all that?
Wait! We will come to that gradually.

But before that: when I first saw the above dot-product operation, the very first question that came to my mind was, why is this operation even useful? I understand why Addition is useful and why Linear Equations are useful, but why is this particularly weird-looking dot-product operation useful?

Let me explain that first.

Suppose I give you two numbers like 5 & 20 and ask you to use any operation between them so that the result is 40. Can you equate LHS and RHS?

You will say it is not possible with the above 4 operations (addition, subtraction, multiplication and division). So now I relax the problem for you by introducing 2 new variables, W1 and W2, and you can use any values for them to get 40:

Now you’re very happy because you arrive at a result.

I reply, “that’s good. But don’t be too happy. As I have relaxed the problem for you, I have also added a new constraint.”

Basically, you can now choose any values for W1 and W2, provided they work for all 5 equations below:
Again, LHS = RHS

Now you become restless and ask, confused: why have you only given me the choice of multiplying by W1 and W2? You should have given me the flexibility of subtracting and dividing with W1 and W2 too. That would have made life a little easier.

I look at you with a smiling face 😊 and reply: I have already given you that flexibility by providing the multiplication operator (*).
If you multiply by -1 (W = -1), it is as good as subtracting.
If you multiply by 1/5 (W = 1/5), it is as good as dividing by 5.

Now, knowing the power of the multiplication operator, you feel like a detective. You take out your pen and paper and try out combinations of W1 and W2 until you find one that works for all rows.

And finally you come up with the answer:

Now, since the detective in you is still alive, you look at these operations carefully and ask: doesn’t that look like the dot-product operation we saw above?
(5,20).(W1,W2) = 5 * W1 + 20 * W2

I: “Yes.” Each and every equation in the above example is indeed a dot-product, and W1 and W2 are shared between all the dot-product equations.
(Taken as a whole, this is nothing but Linear Regression.)
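To see the shared-weights idea concretely, here is a small sketch. Only the first row (5, 20 giving 40) comes from the text. The other four rows and the weights W1 = 2, W2 = 1.5 are assumptions made up for this example, so they may differ from the values in the original figure.

```python
def dot(a, b):
    """Element-wise product of two lists, summed."""
    return sum(x * y for x, y in zip(a, b))


# (LHS, RHS) pairs. Only the first row is from the text; the rest are hypothetical.
rows = [
    ((5, 20), 40),
    ((1, 2), 5),
    ((3, 4), 12),
    ((10, 0), 20),
    ((0, 6), 9),
]

# Hypothetical shared weights (W1, W2), the same for every row.
weights = (2, 1.5)

for lhs, rhs in rows:
    print(lhs, "->", dot(lhs, weights), "expected", rhs)

# True only if the single pair of weights satisfies every equation.
all_match = all(dot(lhs, weights) == rhs for lhs, rhs in rows)
print(all_match)
```

Each row is its own dot-product, but the second list (W1, W2) never changes from row to row.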

You: That’s fine. But I still haven’t understood how this dot-product business is useful in the real world.

I ask: With the help of the dot-product (that is, the format given in the above figure, where W1 and W2 are shared between all the dot-product equations), can you solve the following real-world problem?

I give you each patient’s Height, Daily Calorie Intake and Daily Walking distance in km as the LHS and the patient’s weight as the RHS (same format as above, LHS and RHS).
Can you make LHS = RHS using the above idea?

Now you have to satisfy 3 equations (3 patients’ rows) with the help of a common W1, W2 and W3.

Now you take out your calculator and try a few combinations of W1, W2 and W3 until you find one that works for all 3 equations. You finally get the answer 👍

Now you become more curious and say: “Interesting. I could solve this. But how is this going to be useful to anyone?”

I reply: Imagine that instead of 3 patients, I give you the data of thousands of patients. If you come up with a W1, W2 and W3 that fit all of these patients well, there is a high chance that we will be able to predict the weight of a new patient from his Height, Calorie and Walking data with the help of these W1, W2 and W3.
Something similar to this (last row):

You: I see! Tell me one thing. I was able to calculate W1, W2 & W3 because I had to satisfy only 3 rows. But if I had to satisfy 1000s of patients, how could I do that?

I: Right! You won’t be able to find them by hand. You can find the optimal W1, W2 & W3 with the help of calculus and probability. But let’s not go there, as our aim is to understand the Dot-Product.
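Without going into the calculus, a rough sketch of where it ends up may still help. For a least-squares fit, setting the derivative of the squared error to zero gives the so-called normal equations, and their coefficients are themselves dot-products of the data columns. The sketch below uses only two weights to keep the algebra short, and the rows are hypothetical stand-ins for patient data (only the row 5, 20 giving 40 appears in the text). For real data you would normally reach for a numerical library instead.

```python
def dot(a, b):
    """Element-wise product of two lists, summed."""
    return sum(x * y for x, y in zip(a, b))


def fit_two_weights(rows, targets):
    """Least-squares fit of targets ~ w1*x1 + w2*x2 via the 2x2 normal equations."""
    x1 = [r[0] for r in rows]
    x2 = [r[1] for r in rows]
    # Every coefficient of the normal equations is a dot-product of columns.
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    b1, b2 = dot(x1, targets), dot(x2, targets)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("columns are linearly dependent; no unique solution")
    # Cramer's rule for the 2x2 system.
    w1 = (b1 * s22 - s12 * b2) / det
    w2 = (s11 * b2 - s12 * b1) / det
    return w1, w2


# Hypothetical data: each row is (x1, x2), each target is the RHS.
rows = [(5, 20), (1, 2), (3, 4), (10, 0), (0, 6)]
targets = [40, 5, 12, 20, 9]

w1, w2 = fit_two_weights(rows, targets)
print(w1, w2)

# Predict the RHS for a new, unseen (hypothetical) row with one more dot-product.
new_row = (4, 10)
print(dot(new_row, (w1, w2)))
```

The same idea extends to W1, W2 and W3 (or thousands of weights); only the size of the system of equations grows.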

Enlightened you (hopefully): Wow! Now I understand what the dot-product is. I can’t wait, can we please see its Geometric intuition?

I: Yes. What we have seen so far is the Algebraic Intuition.

Let’s formalize the dot-product’s definition as below:
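Written out for two lists a and b of the same length n (the symbols a, b and n are notation assumed for this sketch and may differ from the original figure), the definition reads:

```latex
a \cdot b \;=\; \sum_{i=1}^{n} a_i\, b_i \;=\; a_1 b_1 + a_2 b_2 + \dots + a_n b_n
```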

Geometric Intuition

Reminder: Unlearn everything that you know about Dot-Product from a geometric perspective. Let’s start fresh.

As of now, you just know that the dot-product operation is carried out between 2 lists of numbers.
[a,b,c].[x,y,z] = a*x + b*y + c*z

Example:

These 2 lists of numbers can represent anything.
e.g.
[a,b,c] = Tina’s English, Math, Science score
[x,y,z] = Mina’s English, Math, Science score

For now, just keep this operation in the back of your mind and let’s solve a new problem to understand geometric intuition.

We know that we use Cartesian space as an abstraction of our physical space.

Since the Cartesian plane is an abstract (2D) representation of the physical world, let’s consider the example below:

Tina and Mina are best friends and mostly have the same personality traits (maybe that’s the reason they are friends). But they differ when it comes to education and wealth.

Now, if we want to compare the two of them on these 2 parameters (education and wealth), how do we do that?

It would be good to have a single numerical measure for this comparison, i.e. how they are related (close or far) w.r.t. these 2 parameters.

We will get to that. First, let’s plot them on our abstract geometric plane to see their relative positions w.r.t. these 2 parameters.

Geometrically, let’s represent them as vectors.

As you can see, a vector is just a list of numbers, e.g. [50,75,60,70], representing the numerical properties of an entity (here, 4 properties).
(Geometrically, these numeric properties denote coordinates.)

For example, from now on Mina will be represented abstractly as v1 = [12,30], meaning her education is up to 12th grade and her wealth is 30k.

With this vector representation, can you think of some idea to measure how close/far they are w.r.t these 2 parameters? 🤔

One way to think about it is to measure the angle between these 2 vectors.
If the angle is small, they are closely related (w.r.t. these 2 combined properties), and if it is big, they are very different.

There is one more way: we can check the closeness of 2 vectors by finding how much of a v1 component is present in v2.

  • As we can see, in the first figure, v1 and v2 are very close, so v2 should have a very large v1 component (and vice versa).
  • In the second figure, they are relatively far apart, so v2 should have a relatively small v1 component.
  • In the third figure, they are completely orthogonal, so v2 should have zero v1 component.

To see how much v1 component is present in v2, let’s project v1 on v2.

As we can see, v1’s projection on v2 is denoted by distance d.

Also we can see, this distance (which denotes v1 component in v2):

  • d is max when the 2 vectors are completely aligned.
  • d keeps decreasing as v1 and v2 move apart.
  • d is 0 when v1 and v2 are orthogonal vectors.
  • d is negative when v1 and v2 move even further apart (the angle between them is more than 90 degrees).

So if we project one vector onto another and calculate the distance “d”, that will tell us how close the 2 vectors are.
Cool! Now we have an approach to measure the similarity between 2 vectors.

Wait, we have an approach, but how do we calculate d? We just have the coordinates of v1 and v2.

This is the time to sharpen our geometry toolbox.
(Don’t forget, our aim is to calculate distance d)
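One possible route to d, sketched under these assumptions: v1 = (x1, y1) makes an angle α with the x-axis, v2 = (x2, y2) makes an angle β, and θ = α − β is the angle between them. The symbols α, β and θ are introduced only for this sketch. It uses cos α = x1/||v1||, sin α = y1/||v1||, and likewise for β and v2.

```latex
\begin{aligned}
d &= \lVert v_1 \rVert \cos\theta, \qquad \theta = \alpha - \beta \\
\cos\theta &= \cos\alpha\cos\beta + \sin\alpha\sin\beta \\
           &= \frac{x_1}{\lVert v_1 \rVert}\cdot\frac{x_2}{\lVert v_2 \rVert}
            + \frac{y_1}{\lVert v_1 \rVert}\cdot\frac{y_2}{\lVert v_2 \rVert}
            = \frac{x_1 x_2 + y_1 y_2}{\lVert v_1 \rVert\,\lVert v_2 \rVert} \\
d &= \frac{x_1 x_2 + y_1 y_2}{\lVert v_2 \rVert}
\end{aligned}
```

Notice the numerator: x1*x2 + y1*y2 is exactly the multiply-and-add operation on 2 lists from the algebraic section.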

Cool! Now we have found d (a single numerical value), which denotes the similarity between the 2 vectors (which is nothing but the similarity between Tina & Mina).
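As a quick check of the idea, here is a small sketch that computes d straight from coordinates, assuming the standard scalar-projection formula d = (v1 . v2) / ||v2||. Mina’s vector [12,30] is from the text; Tina’s vector [16,20] is a made-up value for illustration, and the function names are chosen just for this sketch.

```python
import math


def dot(a, b):
    """Element-wise product of two lists, summed."""
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(v, v))


def projection_length(v1, v2):
    """Signed length d of v1's projection onto v2: (v1 . v2) / ||v2||."""
    length = norm(v2)
    if length == 0:
        raise ValueError("cannot project onto the zero vector")
    return dot(v1, v2) / length


mina = [12, 30]  # from the text: 12th grade, 30k wealth
tina = [16, 20]  # hypothetical values for illustration

print(projection_length(mina, tina))

# Aligned vectors: d equals the full length of [3, 4], which is 5.
print(projection_length([3, 4], [6, 8]))

# Orthogonal vectors: d is 0.
print(projection_length([1, 0], [0, 5]))
```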

Let’s go further and see if we get more insights from this.

A lot of things are happening… Let’s put everything together:

  1. Algebraic Intuition of Dot-Product.
    It is simply the following operation on 2 lists of numbers:
    [a,b,c].[x,y,z] = a*x + b*y + c*z
    (These 2 lists of numbers can be anything: your and your friend’s marks, or the numeric properties of some cosmic elements. Literally anything…)
  2. Now we are solving a completely different problem to see the Geometric Intuition.

So far, there is no mention of dot-product in this Geometric intuition.

Our aim is to calculate the distance d, because that will tell us v1’s projection on v2 (i.e. the similarity between v1 & v2).

See, we were trying to solve a geometric problem, and we happened to come across a dot-product operation, just as we come across other operations like addition and subtraction.

Geometrically, that particular dot-product represents the cosine similarity of 2 vectors, scaled by their lengths (2 vectors are nothing but 2 lists of coordinates representing 2 entities).

Now we know that we can represent any entity in the form of a vector.

Hence, now we can make a generic statement like:

Also we have seen,

  • if 2 unit-vectors (||v1||=||v2||=1) are aligned then their dot-product is 1. (For non-unit vectors, and for vectors that are not aligned but point in roughly the same direction, the dot-product will be some positive number.)
  • if 2 vectors point in opposite directions then their dot-product is negative
  • if 2 vectors are perpendicular then their dot-product is 0 (the 2 vectors are not at all similar)
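These three cases are easy to check numerically. The sketch below assumes the usual definition of cosine similarity, (a . b) / (||a|| * ||b||); the example vectors are hypothetical.

```python
import math


def dot(a, b):
    """Element-wise product of two lists, summed."""
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(v, v))


def cosine_similarity(a, b):
    """Dot-product divided by both lengths; always between -1 and 1."""
    return dot(a, b) / (norm(a) * norm(b))


print(cosine_similarity([1, 0], [1, 0]))   # aligned unit vectors: 1.0
print(cosine_similarity([1, 0], [0, 1]))   # perpendicular: 0.0
print(cosine_similarity([1, 0], [-1, 0]))  # opposite directions: -1.0

# Aligned but not unit length: the raw dot-product is positive, not 1.
print(dot([2, 0], [3, 0]))  # 6
```

For unit vectors the dot-product and the cosine similarity are the same number. For other vectors the dot-product is the cosine similarity multiplied by both lengths.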

This is the Geometric intuition of dot-product.

Linear Algebraic Intuition

We know that Linear Algebra is all about Linear Transformations.
To understand what it transforms and why it transforms, let’s first understand a few basics of Linear Algebra.

When we say “a vector (-2,3)”, what do we mean?

credit: 3BLUE1BROWN SERIES S1 • E1

You might say it is quite easy. When we say (-2,3), we actually mean: walk 2 steps to the left along the x-axis, and from there walk 3 steps upwards (parallel to the y-axis). This is indeed correct.

credit: 3BLUE1BROWN SERIES S1 • E1

But there is one more way (which is quite central to Linear Algebra) to describe these vectors.

The idea is that we define 2 vectors and use them to describe all other possible vectors in the 2D world.
Since these 2 vectors are responsible for describing the whole 2D world (all possible vectors in 2D), they are aptly called Basis Vectors.

credit: 3BLUE1BROWN SERIES S1 • E2

As you might have rightly guessed, coordinates of these basis vectors are:

i =[1,0] and j = [0,1]

Let’s now see how to describe a vector [-5,2] with the help of these 2 basis vectors.

credit: 3BLUE1BROWN SERIES S1 • E2

I hear your question. What’s the point of doing this? We will come to that in a while.
Before that, let’s represent the above equation in a compact form.
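As a small sketch of that description, [-5,2] can be built by scaling each basis vector and adding the results. The helper names `scale` and `add` are chosen just for this example.

```python
def scale(c, v):
    """Multiply every coordinate of v by the number c."""
    return [c * x for x in v]


def add(u, v):
    """Add two vectors coordinate by coordinate."""
    return [a + b for a, b in zip(u, v)]


i_hat = [1, 0]  # basis vector i
j_hat = [0, 1]  # basis vector j

# [-5, 2] means: -5 copies of i plus 2 copies of j.
v = add(scale(-5, i_hat), scale(2, j_hat))
print(v)  # [-5, 2]
```

The coordinates of a vector are simply the amounts of each basis vector needed to reach it.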

Now let’s see what impact changing the basis vectors has on the underlying graph grid.

credit: 3BLUE1BROWN SERIES S1 • E3

Now it is clear that changing the basis vectors changes the underlying grid. Now let’s see what effect this has on vectors.

Basically, how a vector looks (that is, its position and length) depends on the basis vectors.
The same vector (1,2) will have a different position and length for different basis vectors.

credit: 3BLUE1BROWN SERIES S1 • E3

See how changing the basis vectors rotates this vector by 90 degrees.

credit: 3BLUE1BROWN SERIES S1 • E3

So the same vector [1,2] rotates by 90 degrees and becomes [-2,1] when we change the basis.
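A small sketch of this rotation, assuming a 90-degree counterclockwise turn in which the basis vector i moves to [0,1] and j moves to [-1,0]. The function name `transform` is made up for this example.

```python
def transform(new_i, new_j, v):
    """Rebuild v from the new basis: v[0] copies of new_i plus v[1] copies of new_j."""
    return [v[0] * new_i[k] + v[1] * new_j[k] for k in range(len(new_i))]


# Standard basis: the vector is unchanged.
print(transform([1, 0], [0, 1], [1, 2]))   # [1, 2]

# Rotated basis (assumed: i -> [0, 1], j -> [-1, 0]): the vector turns by 90 degrees.
print(transform([0, 1], [-1, 0], [1, 2]))  # [-2, 1]
```

The coordinates [1,2] stay the same in both calls. Only the basis vectors change, and the resulting vector moves with them. A rotation keeps the length the same, which is a handy sanity check on the result.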

The idea is that by changing the basis vectors, we can apply different effects such as rotation, reflection, shear and stretching to the same vector.

Imagine that vectors are nothing but points (x,y) in a 2D space. If we draw an image in 2D space, let’s say a car drawn using multiple points, then we can change the position of all these points (which represent the car) by changing the basis vectors.
If the positions and lengths change, we get the effect of a stretched car or a rotated car.

Here, M1, M2 and M3 are the matrices which represent basis-vectors and are also called transformation matrices.

credit: https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRX4lUwHhMykC_qx3HKijt9oXsP71PoxxQa1Q&usqp=CAU

Now, I think you will appreciate the idea of Basis Vectors/Transformation Matrix.

Thus the dot-product between a transformation matrix and a vector (each row of the matrix dotted with the vector) denotes a Linear Transformation in Linear Algebra.
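To make the link to the dot-product explicit, here is a minimal sketch of a matrix-times-vector product written as one dot-product per row. The matrices and the four points are hypothetical examples, not the M1, M2 and M3 from the figure.

```python
def dot(a, b):
    """Element-wise product of two lists, summed."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, v):
    """Transform v: each output coordinate is the dot-product of one row with v."""
    return [dot(row, v) for row in matrix]


# Columns are the new basis vectors: i -> [0, 1], j -> [-1, 0] (90-degree rotation).
rotate_90 = [[0, -1],
             [1, 0]]

# Hypothetical stretch: doubles every x coordinate, leaves y alone.
stretch_x = [[2, 0],
             [0, 1]]

print(matvec(rotate_90, [1, 2]))  # [-2, 1]

# Applying one matrix to every point of a shape transforms the whole shape.
points = [[0, 0], [1, 0], [1, 1], [0, 1]]  # corners of a unit square
print([matvec(stretch_x, p) for p in points])  # [[0, 0], [2, 0], [2, 1], [0, 1]]
```

So a linear transformation is a stack of dot-products, one for each output coordinate.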

This is an intuition of a dot-product in Linear Algebra.

(Note-1: I have used the terms basis vectors and transformation matrix interchangeably for simplicity, but there is a difference between them: not every transformation matrix has columns that form a basis. We have not covered that, as it is outside the scope of this article.)

(Note-2: For simplicity, we have taken the example of 2D space (2D vectors), but this applies to 3D and N-D spaces too.)

Summary:

When you see the following operation on 2 lists of numbers anywhere (Geometry, Linear Algebra, Physics or anywhere else in the world), just shout it out loud: “That’s a dot-product!!!”
[a,b,c].[x,y,z] = a*x + b*y + c*z

When you look at this operation from a Geometric perspective, it represents the cosine similarity between 2 lists (vectors), scaled by their lengths.

When you look at this operation from a Linear Algebraic perspective, it represents a linear transformation.

Who knows, tomorrow while solving some new problem, you may arrive at this operation in some other stream and that will become a new dot-product perspective for your stream.

That’s it. I hope you find it helpful.
