Different types of distance metrics used in Machine Learning.

Anjali Kumari
5 min read · Nov 5, 2019


A number of Machine Learning algorithms, both supervised and unsupervised, use distance metrics to understand the patterns in the input data and make data-based decisions. A good distance metric can significantly improve the performance of classification, clustering, and information retrieval.

There are a number of distance metrics; some of the most common ones are discussed below.

First, when we want to find the distance between two points (of any dimension), we typically use Euclidean distance or Manhattan distance to judge how near the points are, and therefore how similar they are.

1. EUCLIDEAN DISTANCE :


The Euclidean distance between two points is the length of the straight line segment connecting them, i.e., the direct distance between two data points in a plane. This distance is given by the Pythagorean theorem.

Euclidean distance between two points A(x1, y1) and B(x2, y2) is (AB = d):

d = √((x1 − x2)² + (y1 − y2)²)

So, after knowing what Euclidean distance is and how it is calculated, the question arises: where do we use it? For example, if you travel by air from one place to another, the distance covered depends only on the two points (source and destination), i.e., the straight-line distance given by the Pythagorean theorem.
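To make this concrete, here is a minimal Python sketch of the Euclidean distance between two points. NumPy is an assumed dependency, and the example points are arbitrary; the article itself names no libraries or code.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two points of any dimension."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.sqrt(np.sum((a - b) ** 2))

# Example: A(1, 2) and B(4, 6) -> sqrt(3^2 + 4^2) = 5.0
print(euclidean_distance([1, 2], [4, 6]))  # 5.0
```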

2. MANHATTAN DISTANCE :

(Figure: Euclidean distance vs. Manhattan distance)

We use Manhattan distance when we need to measure the distance between two data points along a grid-like path. It is calculated as the sum of the absolute differences of their Cartesian coordinates, as shown below.

Manhattan Distance between two points A(x1, y1) and B(x2, y2) is(AB=d):

d = |x1 − x2| + |y1 − y2|

Similarly, the question arises: where do we use Manhattan distance? In the figure above, suppose the blocks represent buildings and the grid lines represent the roads connecting them. If we want to go from building A to building B, we do not use the Pythagorean theorem to find the distance; instead, we calculate the distance along the roads, since we travel from one building to another by road rather than by climbing or hopping over the buildings.
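As a rough sketch of the same idea (again assuming NumPy, with arbitrary example points), Manhattan distance can be computed like this:

```python
import numpy as np

def manhattan_distance(a, b):
    """Grid-path (L1) distance: sum of absolute coordinate differences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.sum(np.abs(a - b))

# Example: A(1, 2) and B(4, 6) -> |1 - 4| + |2 - 6| = 7.0
print(manhattan_distance([1, 2], [4, 6]))  # 7.0
```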

3. MINKOWSKI DISTANCE:

Minkowski distance is a distance between two points in a normed vector space (an N-dimensional real space). A normed vector space is a space in which every vector has a length (norm), so distances can be measured. Minkowski distance is a generalization of both the Euclidean distance and the Manhattan distance.

We basically represent the Minkowski distance as the Lp norm, where p is a real number with p ≥ 1. For two points X = (x1, x2, …, xn) and Y = (y1, y2, …, yn):

d(X, Y) = (|x1 − y1|^p + |x2 − y2|^p + … + |xn − yn|^p)^(1/p)

It reduces to the Manhattan distance when p = 1 (the L1 norm) and to the Euclidean distance when p = 2 (the L2 norm).

L1 Norm :

When p = 1, the Lp norm is the L1 norm, which corresponds to the Manhattan distance: the sum of the absolute values (magnitudes) of the vector's components. It is a very natural way of measuring the length of a vector.

For example, take the vector X = [3, 4]. The L1 norm is the distance traveled from the origin (0, 0) to the destination (3, 4) along a grid path, which resembles the way Manhattan distance is calculated:

||X||1 = |3| + |4| = 7

L2 Norm :

When p = 2, the Lp norm is the L2 norm, usually known as the Euclidean norm, which corresponds to the Euclidean distance. It is the shortest (straight-line) distance from one point to another.

Taking the same example, X = [3, 4], the L2 norm is the shortest straight-line distance from the origin to the point, which resembles the way Euclidean distance is calculated:

||X||2 = √(3² + 4²) = √25 = 5

One consideration with the L2 norm: because each component of the vector is squared, outliers carry more weight and can skew the result.
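A small sketch of the general Lp (Minkowski) distance, again with NumPy as an assumed dependency; note how p = 1 and p = 2 reproduce the L1 and L2 norms of X = [3, 4] computed above:

```python
import numpy as np

def minkowski_distance(a, b, p=2):
    """Lp distance: (sum of |a_i - b_i|^p) ** (1/p), for p >= 1."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

origin, x = [0, 0], [3, 4]
print(minkowski_distance(origin, x, p=1))  # 7.0 -> L1 norm (Manhattan)
print(minkowski_distance(origin, x, p=2))  # 5.0 -> L2 norm (Euclidean)
```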

4. HAMMING DISTANCE :

Hamming distance is a metric for comparing two binary data strings. For two binary strings of equal length, it is the number of bit positions in which the corresponding bits differ. It is used for error detection and error correction when data is transmitted over computer networks, and it is also used in coding theory for comparing data words of equal length.

The Hamming distance between two strings a and b is denoted d(a, b). To calculate it, we take their XOR, (a ⊕ b), and count the total number of 1s in the resulting string.

For example, suppose there are two strings, 11011001 and 10011101.

11011001 ⊕ 10011101 = 01000100. Since this contains two 1s, the Hamming distance is d(11011001, 10011101) = 2.
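A minimal Python sketch of this calculation (the function name hamming_distance is just illustrative):

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the bit positions where two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is defined for equal-length strings")
    # Equivalent to XOR-ing the two strings and counting the 1s in the result.
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))

print(hamming_distance("11011001", "10011101"))  # 2
```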

5. COSINE DISTANCE AND COSINE SIMILARITY :

As the distance between two points A and B decreases, the similarity between them increases, and vice versa.

Cosine similarity measures the similarity between two points by the angle between their vectors: the smaller the angle, the higher the cosine similarity (the cosine lies between -1 and 1). Cosine distance and cosine similarity are directly related to each other.

sim(A, B) denotes the cosine similarity between points A and B:

sim(A, B) = cos(θ) = (A · B) / (||A|| ||B||)

where θ is the angle between the vectors A and B.

cosine distance = 1 − cosine similarity

Cosine similarity and cosine distance are widely used in recommendation systems.
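A short sketch of both quantities, once more assuming NumPy (the example vectors are arbitrary):

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(theta) between vectors a and b: (a . b) / (||a|| * ||b||)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)

a, b = [1, 1], [2, 2]                     # same direction -> angle 0
print(round(cosine_similarity(a, b), 4))  # 1.0
print(round(cosine_distance(a, b), 4))    # 0.0
```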
