Turn Right in 2 Miles: Simple Method to Find Distance Using the Google Maps API and Pandas

Svideloc
Published in Analytics Vidhya
5 min read · Dec 19, 2019

Background

Enormous amounts of location data are generated every day. Uber, Lyft, Google Maps, Yelp, Lime and many others all use and generate this data on their various platforms. There’s a wealth of information waiting to be accessed, but you may find yourself needing some guidance on how to work with and gain insights from this data.

In this post, we will use the Google Maps API to identify the distance between a pick-up and drop-off location, given latitude and longitude.

We will look at the following dataframe to calculate the distances between the pick-up and drop-off:

df.head()

Before accessing the API, you will need to first create an account with the Google Maps Platform.

Note that you will have to enter payment information, but you get $200 in monthly credit for free access. Additional pricing can be found here if you need to access the API more; look at the Distance Matrix cell of the pricing table.

I am using data from Kaggle, which can be found here.

Why Use Google Maps API?

Depending on what you are doing with your data, you may want to calculate the distance differently. Below, I will discuss two popular ways to calculate distance using a right triangle: Euclidean distance (the Pythagorean theorem) and Manhattan distance. The Google Maps API, by contrast, gives you the actual driving distance and can therefore be more precise if your models require that kind of precision.

Let’s look at an example for the Distance from New York to Houston using each method:

Euclidean Distance

Euclidean distance is calculated as the hypotenuse of a right triangle, just like in the Pythagorean theorem. This is simply a direct path from point A to point B. In the image below, this would be the black line. The Euclidean distance from New York to Houston is roughly 1,417 miles. Although not perfect, this may be a good estimate for flight distance.

Manhattan Distance

Manhattan distance is calculated as the sum of the sides of the right triangle. This is shown with the red lines in the image. The Manhattan distance is about 2,015 miles from New York to Houston. This method has its problems but could be a good estimate in grid-based cities.
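To make the two triangle-based measures concrete, here is a minimal sketch that approximates both from latitude/longitude pairs. It assumes a flat (equirectangular) projection, a mean Earth radius, and approximate city-center coordinates for New York and Houston. None of these come from the original analysis, so the results will only roughly match the figures quoted above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (an approximation)


def triangle_legs(lat1, lon1, lat2, lon2):
    """Return the (north-south, east-west) legs in miles on a flat projection."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    north_south = math.radians(abs(lat2 - lat1)) * EARTH_RADIUS_MILES
    # Longitude lines converge toward the poles, so scale by cos(latitude).
    east_west = math.radians(abs(lon2 - lon1)) * math.cos(mean_lat) * EARTH_RADIUS_MILES
    return north_south, east_west


def euclidean_miles(lat1, lon1, lat2, lon2):
    """Hypotenuse of the triangle: the direct path between the two points."""
    north_south, east_west = triangle_legs(lat1, lon1, lat2, lon2)
    return math.hypot(north_south, east_west)


def manhattan_miles(lat1, lon1, lat2, lon2):
    """Sum of the two legs of the triangle."""
    north_south, east_west = triangle_legs(lat1, lon1, lat2, lon2)
    return north_south + east_west


# Approximate city-center coordinates (assumed for illustration).
new_york = (40.7128, -74.0060)
houston = (29.7604, -95.3698)

print(round(euclidean_miles(*new_york, *houston)))
print(round(manhattan_miles(*new_york, *houston)))
```

The Manhattan figure depends on how the triangle is oriented; this sketch aligns the legs with north-south and east-west, which is only one possible choice.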

Google Maps API Distance

The Google Maps API gives us the actual driving distance, just like what you would get if you were to map a route from New York to Houston in your Google Maps phone app. In the image, the blue line is the Google Maps API distance and is roughly 1,630 miles. This can be useful if you have Uber data, or any other data for which you want to know the driving distance.

Depending on your purposes, each method has its advantages and disadvantages, but we will cover accessing the Google Maps API distance in the code below.

It is important to understand that this API follows the rules of the road. Therefore your latitudes and longitudes need to be as precise as possible. If you are off by just 10 feet, the API might think you are on the other side of the road and miscalculate the driving distance.

Python Code

1. Packages to Import

  • pandas: data analysis tool in Python. It can be used to manipulate data and create dataframes.
  • json: data interchange package. I use this to access my stored API key in a .json file; this is to keep my API key private.
  • requests: makes requests using the most common HTTP methods.

import json
import requests
import pandas as pd

2. Access the API with a Test Call

First, you will need to access your API key. I store my API keys in separate .json files on my computer and access them with the following code (if you have a preferred way of accessing your API key, use that instead):

def get_keys(path):
    with open(path) as f:
        return json.load(f)

API_key = get_keys("/Users/Documents/google_key.json")
# Take the first value stored in the file
google_key = list(API_key.values())[0]
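The layout of the .json file is not shown here, so as a hedged illustration, the sketch below assumes a file of the form {"api_key": "..."} and reads the key by name rather than by position. The field name "api_key" and the helper name load_api_key are hypothetical.

```python
import json
import os


def load_api_key(path, field="api_key"):
    """Read one named field from a JSON file such as {"api_key": "..."}.

    The field name is an assumption; use whatever name your own file has.
    """
    with open(os.path.expanduser(path)) as f:
        return json.load(f)[field]
```

Looking the key up by name means the code keeps working if more entries are later added to the same file.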

Bring in the API URL and make a test call (I typically start by doing a test in Postman, and then move to my Jupyter Notebook after). The output is below the block of code:

url = f"https://maps.googleapis.com/maps/api/distancematrix/json?units=imperial&origins=40.6655101,-73.89188969999998&destinations=40.6905615%2C-73.9976592&key={google_key}"
r = requests.get(url)
data = r.json()
data

Okay, it looks like the API call was successful. We can see the information in the output above. For this particular example, I am only interested in the field that shows ‘6.5 mi’, which is the distance of the trip. We can access that specific item with the following code:

data['rows'][0]['elements'][0]['distance']['text']

This will output the string: ‘6.5 mi’

We will deal with the string later.
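As an alternative to parsing the text, the sketch below reads the numeric value that sits next to it. It assumes, based on my reading of the Distance Matrix documentation, that the response carries a top-level "status", a per-element "status", and a "value" field holding the distance in meters. The helper name driving_miles and the sample dictionary are made up for illustration; the sample is hand-written to mimic the response shape and is not real API output.

```python
METERS_PER_MILE = 1609.344


def driving_miles(data):
    """Return the driving distance in miles, or None if the lookup failed."""
    # Top-level status covers the request as a whole (bad key, quota, ...).
    if data.get("status") != "OK":
        return None
    element = data["rows"][0]["elements"][0]
    # Element status covers this origin/destination pair (no route, ...).
    if element.get("status") != "OK":
        return None
    return element["distance"]["value"] / METERS_PER_MILE


# Hand-written sample that mimics the response shape (not real API output).
sample = {
    "status": "OK",
    "rows": [{"elements": [{
        "status": "OK",
        "distance": {"text": "6.5 mi", "value": 10460},
    }]}],
}

print(round(driving_miles(sample), 1))
```

Checking both status fields first makes a failed lookup show up as None instead of a KeyError.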

3. Accessing API for All Values in the DataFrame

Now that we know how to access the API, we will want to do this for each row in the dataframe. I first turn each lat/long column into its own list and then call the API once for each row.

lat_origin = df['pickup_latitude'].tolist()
long_origin = df['pickup_longitude'].tolist()
lat_destination = df['dropoff_latitude'].tolist()
long_destination = df['dropoff_longitude'].tolist()
distance = []
for i in range(len(long_destination)):
    url = f"https://maps.googleapis.com/maps/api/distancematrix/json?units=imperial&origins={lat_origin[i]},{long_origin[i]}&destinations={lat_destination[i]}%2C{long_destination[i]}&key={google_key}"
    r = requests.get(url)
    data = r.json()
    try:
        distance.append(data['rows'][0]['elements'][0]['distance']['text'])
    except (KeyError, IndexError):
        # No distance returned: store None so the list stays aligned with the dataframe rows
        distance.append(None)

distance

From the output above you can see that we now have a list of distances in the order that we accessed them from the API.
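The same loop can be sketched with the URL built from a dictionary of parameters, which handles the escaping of the comma (%2C) automatically. This is a hedged variation, not the code used for the results in this post: the names build_url, collect_distances, and fetch_json are hypothetical, and fetch_json is passed in as an argument so the loop can be tried without spending API credit.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"


def build_url(origin, destination, key):
    """Build a request URL from (lat, long) tuples; urlencode escapes the commas."""
    params = {
        "units": "imperial",
        "origins": f"{origin[0]},{origin[1]}",
        "destinations": f"{destination[0]},{destination[1]}",
        "key": key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def collect_distances(origins, destinations, key, fetch_json):
    """Return one distance string (or None) per origin/destination pair."""
    results = []
    for origin, destination in zip(origins, destinations):
        data = fetch_json(build_url(origin, destination, key))
        try:
            results.append(data["rows"][0]["elements"][0]["distance"]["text"])
        except (KeyError, IndexError):
            results.append(None)  # keep the list aligned with the input rows
    return results
```

With the requests package, fetch_json could be lambda url: requests.get(url).json(), and the origins could come from zip(df['pickup_latitude'], df['pickup_longitude']), with the destinations built the same way from the drop-off columns.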

4. Cleaning the Data & Adding to Dataframe

Finally, we will remove the ‘ mi’ from each item in the list and then place the new list into our original dataframe.

distance2 = []
for d in distance:
    # Strip the unit and any thousands separator; leave missing values as None
    distance2.append(float(d.replace(' mi', '').replace(',', '')) if d is not None else None)

df['Distance_in_Miles'] = distance2
df.head()

From above, you can see the original dataframe with the lat/long data, as well as the new column that we created, which shows the driving distance of each trip.
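The cleaning step can also be written as a small function. The sketch below assumes the distance text may contain thousands separators (as in '1,630 mi') or be reported in feet for very short trips; both are my assumptions about the API's formatting and are not shown in the output above. The name text_to_miles is hypothetical.

```python
FEET_PER_MILE = 5280


def text_to_miles(text):
    """Convert strings such as '6.5 mi', '1,630 mi', or '2640 ft' to miles."""
    if text is None:
        return None  # no distance was returned for this row
    number, unit = text.replace(",", "").split()
    value = float(number)
    if unit == "mi":
        return value
    if unit == "ft":
        return value / FEET_PER_MILE
    raise ValueError(f"Unexpected distance unit in {text!r}")


print([text_to_miles(t) for t in ["6.5 mi", "1,630 mi", "2640 ft", None]])
# [6.5, 1630.0, 0.5, None]
```

Raising an error on an unknown unit is a deliberate choice here: a silent wrong number in the new column would be harder to spot than a failed conversion.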

There are many ways to find distance between locations, but it is imperative that you understand which way will add the most value to a project or model that you are working on. The Google Maps API can be a useful tool in order to find the driving distance between two points.
