Essential Math for Machine Learning: The Cosine Law and Similarity Measures

Dagang Wei
Feb 8, 2024 · 5 min read


This article is part of the series Essential Math for Machine Learning.

Introduction

Machine learning algorithms often rely on understanding how similar or dissimilar different data points are. Whether it’s recommending movies, clustering customer profiles, or detecting similar-looking images, the concept of similarity lies at the heart of many ML tasks. In this blog post, we’ll explore three fundamental similarity measures and examine when they’re most appropriate to use:

  • Dot Product
  • Cosine Similarity
  • Euclidean Distance

Understanding Similarity

Intuitively, when we say two objects are similar, we imply that they share certain characteristics or features. Mathematically, we can represent objects as vectors in a multi-dimensional space. Each dimension represents a feature. Similarity measures are tools that take these vector representations and calculate a numerical score quantifying how close or alike those vectors are.

Dot Product


The dot product of two vectors captures both their alignment in direction and their magnitudes. Here’s how it works:

Formula: If a = [a1, a2, …, an] and b = [b1, b2, …, bn] are two vectors, then their dot product is calculated as: a · b = a1·b1 + a2·b2 + … + an·bn

Geometric Meaning: The dot product equals the product of the magnitudes of the two vectors and the cosine of the angle between them: a · b = ||a|| ||b|| cos(θ).

When to Use: The dot product is useful when you’re interested in both the similarity of direction and the strength of the relationship between your vectors. A higher dot product can mean the vectors point in more similar directions, have larger magnitudes, or both.

Python Code

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, -1, 0])
dot_product = np.dot(a, b)
print(dot_product) # Output: 2
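The geometric meaning above can be checked numerically: a · b should equal ||a|| ||b|| cos(θ), with the angle computed independently of the dot product. A minimal sketch (the vectors are arbitrary illustrative choices):

```python
import numpy as np

a = np.array([3.0, 0.0])  # points along the x-axis
b = np.array([2.0, 2.0])  # 45 degrees above the x-axis

# Angle between the vectors, computed from their 2D directions
theta = np.arctan2(b[1], b[0]) - np.arctan2(a[1], a[0])

lhs = np.dot(a, b)                                          # 6.0
rhs = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(theta)
print(np.isclose(lhs, rhs))  # True
```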

Cosine Similarity


Cosine similarity focuses purely on the angle between two vectors, giving a score of how closely aligned the vectors are, regardless of their magnitudes.

Formula: If a and b are two vectors, the cosine similarity is:

cos(θ) = (a · b) / (||a|| ||b||)

where ||a|| and ||b|| are the magnitudes (lengths) of vectors a and b.

Interpretation: Cosine similarity values range from -1 to 1. A value of 1 indicates perfect similarity (vectors point in the same direction). 0 means the vectors are orthogonal (no similarity). -1 signifies opposite directions.

When To Use: Use cosine similarity when you care about the orientation of the vectors, while discounting differences in their magnitudes. This is common in text analysis, where document length discrepancies shouldn’t overly diminish similarity scores.

Python Code

import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 4, 6]) # Notice b has twice the magnitude of a
cosine_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine_sim) # Output: 1.0 (up to floating-point rounding)
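As a toy version of the text-analysis case above, consider word-count vectors for two hypothetical documents over the vocabulary [“cat”, “dog”, “fish”] (the counts are made up): the second document is twice as long but uses words in the same proportions, and cosine similarity ignores that length difference:

```python
import numpy as np

# Made-up word counts over the vocabulary ["cat", "dog", "fish"]
doc1 = np.array([2, 1, 0])
doc2 = np.array([4, 2, 0])  # same proportions, twice the length

cosine_sim = np.dot(doc1, doc2) / (np.linalg.norm(doc1) * np.linalg.norm(doc2))
print(cosine_sim)  # ≈ 1.0: same direction despite different lengths
```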

Euclidean Distance


Euclidean distance is the familiar “straight-line” distance between two points in multi-dimensional space.

Formula: The Euclidean distance between vectors a and b is: ||a - b|| = sqrt((a1 - b1)² + (a2 - b2)² + … + (an - bn)²)

Interpretation: Larger Euclidean distances indicate greater dissimilarity. Euclidean distance is sensitive to the overall magnitude of the vectors.

When to Use: Euclidean distance is suitable when the relative scale of the features matters and when finding the most similar instances (in terms of pure geometric distance) is desirable.

Python Code

import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 0, 1])
euclidean_dist = np.linalg.norm(a - b)
print(euclidean_dist) # Output: 4.8989...
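To see how the three measures can disagree, here is a sketch comparing a vector with a scaled copy of itself (illustrative values): the dot product and Euclidean distance are sensitive to the magnitude change, while cosine similarity is not:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 10 * a  # same direction, ten times the magnitude

dot = np.dot(a, b)                                   # grows with magnitude
cos = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # ignores magnitude
dist = np.linalg.norm(a - b)                         # grows with magnitude

print(dot)   # 140.0
print(cos)   # ≈ 1.0 (direction is unchanged)
print(dist)  # ≈ 33.67
```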

The Cosine Law Connection

The Law of Cosines provides a fundamental link between these three similarity measures. Recall the Law of Cosines, which relates the sides and an angle of any triangle:

c² = a² + b² - 2ab cos(θ)

where:

  • c is the length of the side opposite angle θ
  • a and b are the lengths of the other two sides
  • cos(θ) is the cosine of angle θ

Let’s see how this connects to our similarity measures:

Euclidean Distance and the Cosine Law: If we represent vectors a and b as two sides of a triangle, the vector a - b represents the third side. Substituting into the Law of Cosines we get:

||a - b||² = ||a||² + ||b||² - 2 ||a|| ||b|| cos(θ)

Notice the appearance of the magnitudes of a and b and the cosine of the angle θ between them.

Cosine Similarity: Isolating cos(θ) in the above equation gives:

cos(θ) = (||a||² + ||b||² - ||a - b||²) / (2 ||a|| ||b||)

Expanding ||a - b||² = ||a||² + ||b||² - 2 (a · b), the numerator simplifies to 2 (a · b), leaving exactly the formula for cosine similarity:

cos(θ) = (a · b) / (||a|| ||b||)

The Law of Cosines underpins the relationships between Euclidean distance, dot product, and cosine similarity. Essentially, these similarity measures offer different ways to express the relationships between triangle sides and their included angle.
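The identity can be sanity-checked numerically: for any two vectors, ||a - b||² should equal ||a||² + ||b||² - 2 (a · b), which is the Law of Cosines with a · b standing in for ||a|| ||b|| cos(θ). A quick sketch with random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(5)
b = rng.standard_normal(5)

lhs = np.linalg.norm(a - b) ** 2
rhs = np.linalg.norm(a) ** 2 + np.linalg.norm(b) ** 2 - 2 * np.dot(a, b)
print(np.isclose(lhs, rhs))  # True
```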

Geometric Proof of the Law of Cosines

Triangle Setup: Consider an arbitrary triangle ABC, where angle θ is at vertex C, side AC has length a, side BC has length b, and side AB has length c. Drop an altitude of height h from vertex B to side AC, meeting it at point D. This divides AC into segment CD of length x and segment DA of length (a - x).

Pythagorean Theorem (twice):

In right triangle BCD: b² = h² + x²

In right triangle ABD: c² = h² + (a - x)²

Isolate h² from the first equation: h² = b² - x²

Substitute this expression for h² into the second equation:

c² = (b² - x²) + (a - x)²

Expand the squared term:

c² = b² - x² + a² - 2ax + x²

Notice that -x² and x² cancel out.

c² = a² + b² - 2ax

Trigonometry with Right Triangle BCD:

cos(θ) = x / b (cosine definition in a right triangle)

x = b cos(θ)

Final Substitution: Substitute this value of x into the simplified equation:

c² = a² + b² - 2ab cos(θ)

Algebraic Proof of the Law of Cosines

Let c = a - b, so that c is the third side of the triangle formed by a and b. The dot product of c with itself is the squared length of that side:

c · c = (a - b) · (a - b)

Because the dot product is distributive:

(a - b) · (a - b) = a · (a - b) - b · (a - b) = a · a + b · b - 2 (a · b)

so

||c||² = ||a||² + ||b||² - 2 (a · b)

Finally, substituting a · b = ||a|| ||b|| cos(θ) recovers the Law of Cosines:

||c||² = ||a||² + ||b||² - 2 ||a|| ||b|| cos(θ)

Conclusion

Choosing the right similarity measure is essential for building effective machine learning models. The dot product, cosine similarity, and Euclidean distance each have strengths depending on whether you care about magnitude, direction, or both. Keeping the relationships between these measures in mind can help you choose the right one for a given ML application.
