Why Manhattan distance formula doesn’t apply to Manhattan

JK Vijayaraghavan
5 min readFeb 16, 2020

--

Manhattan distance is a well-known distance metric inspired by the perfectly-perpendicular street layout of Manhattan. It also provides a complementary or a substitute distance metric along with the Euclidean distance. Calculating Manhattan distance in a Cartesian Coordinate system (where you assume your origin is (0, 0) ) is pretty straight-forward. The formula for calculating Manhattan distance goes something like this.

d = |x1 — x2| + |y1 — y2|

In plain English, this translates to- Manhattan distance is the total distance you traveled in all the east-west running streets as well through all the north-south running streets.

Photo by Yoav Aziz on Unsplash

This makes perfect sense in a place like Manhattan or any other place where the streets are perfectly perpendicular to each other. So, if we have the X & Y coordinates of our Start location(S) and Destination (D), then we can apply the above formula and derive the Manhattan distance. Not so fast. In practice, we have two issues:

  • The coordinates of S & D are in latitude and longitudes, which is NOT in a Cartesian coordinate system. This is what most GPS systems return.
  • Manhattan streets are inclined at about 29 degrees to the True north.

This means that the above Manhattan distance formula doesn’t simply apply to Manhattan.

The three worlds representing Manhattan streets

But there’s a better way (& world)

Imagine WORLD 2 as depicted in the above illustration; a world where the lines of longitude perfectly align with the north-south running streets and the lines of latitude aligns perfectly with the east-west running streets. In this world, let’s call the starting point and destination point as S’ and D’ (depicted incorrectly in the illustration as S & D; I’m too lazy to rescan the illustration). In this world, if we can find the Hinge point (H’) where the longitudinal line running through S and the latitudinal line running through D’ intersect, we can find the total distance as the sum of straight-line distances between S’ to H’ and between H’ and D’.

And it’s easy to do so.

Equation A:

longitude of H’ = longitude of S’

latitude of H’ = latitude of D’

Equation B:

Total distance = Straight line distance from S to H + Straight line distance from H to D

Let’s take a detour

Before delving deeper into the Manhattan distance calculation, let’s understand what the straight line distance formula looks like in WORLD3(in a world with latitudes and longitudes). The formula to calculate the straight line distance between any two points (with a corresponding pair of coordinates (lat1, lon1) & (lat2, lon2)) is known as Haversine distance; the implementation for the same in Python is as follows:

import numpy as npdef haversine(lat1, lon1, lat2, lon2):     
R = 3958.76 # Earth radius in miles
dLat, dLon, lat1, lat2 = np.radians(lat2 - lat1), \
np.radians(lon2 - lon1), \
np.radians(lat1), \
np.radians(lat2)

a = np.sin(dLat/2) ** 2 + \
np.cos(lat1) * np.cos(lat2) * np.sin(dLon/2) ** 2
c = 2 * np.arcsin(np.sqrt(a))
return R * c

Back to Manhattan

Since we know that the streets in Manhattan are inclined to the Geographic North by about 29 degrees (as some cosmic truth, and hence we shan’t question this fact), we might have to rotate S and D anti-clockwise by 29 degrees. Taking the dot product of a coordinate and the following rotation matrix will rotate the coordinates into WORLD2.

Similarly, we can calculate the D’ (lat, lon)

With the latitudes and longitudes of S’ and D’, we can find the coordinates of Hinge point H’ (in WORLD 2) as per Equation A. Now, if we rotate the H’ clockwise (+29 degrees), we can obtain the Hinge coordinates H in WORLD3.

Now, if we take the sum of the straight line distance between S to H and from H to D.

The python implementation for the same is as follows:

def manhattan_dist(lat1, lon1, lat2, lon2):

# Pickup coordinates
p = np.stack([lat1, lon1], axis = 1)

# Dropoff coordinates
d = np.stack([lat2, lon2], axis = 1)

theta1 = np.radians(-28.904)
theta2 = np.radians(28.904)

## Rotation matrix
R1 = np.array([[np.cos(theta1), np.sin(theta1)],
[-np.sin(theta1), np.cos(theta1)]]
)
R2 = np.array([[np.cos(theta2), np.sin(theta2)],
[-np.sin(theta2), np.cos(theta2)]]
)

# Rotate Pickup and Dropoff coordinates by -29 degress in World2
pT = R1 @ p.T
dT = R1 @ d.T

# Coordinates of Hinge point in the rotated world
vT = np.stack((pT[0,:], dT[1,:]))
# Coordinates of Hinge point in the real world
v = R2 @ vT
""" Finally,

Manhattan distance
=
Haversine dist between Pickup & Hingept
+
Haversine dist between Hinge pt & Dropoff location
"""

return (haversine(p.T[0], p.T[1], v[0], v[1]) +
haversine(v[0], v[1], d.T[0], d.T[1])
)

~~ Fin ~~

P.S: Since the above functions fully leverage numpy functions, we can send an entire series of coordinates (such as a pandas series or numpy array), as opposed to executing it sequentially. Executing the function as follows speeds up the operations 100x.

df[“manhattan_dist”] = manhattan_dist(
df[“pickup_latitude”], df[“pickup_longitude”],
df[“dropoff_latitude”], df[“dropoff_longitude”]
)

A GitHub gist of the code can be found here: https://bit.ly/2SxhiHi

Happy coding! 🙌

--

--

JK Vijayaraghavan

Data Scientist @ Walmart | Causal Inference, Experimentation & Product Analytics