Recently I found myself analysing data whose only geographical information was latitude and longitude. After some research I came across a solution: the paid Google Maps API. But what about poor students like me? There is another, lesser-known solution, which I will explain in this article.
Let’s start by taking a look at the actual problem you are likely facing when reading this. Your data looks something like this:
In this particular case, the dataset contained violent incidents in NYC and my goal was to narrow down the geographical data to ZIP codes. Basically achieving the following:
How did we get there? Well, let’s look at the actual Python code.
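# Import packages
import numpy as np
import pandas as pd
from uszipcode import SearchEngine, Zipcode

# Use the lightweight "simple" ZIP code database
# (simple_zipcode is the argument name in older uszipcode releases)
search = SearchEngine(simple_zipcode=True)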
Now to the real deal: the search function. This function can be manually adapted to your needs (e.g., returning the city or state of the matched ZIP code instead of just the ZIP code itself).
# Define the ZIP code search function
def get_zipcode(lat, lon):
    result = search.by_coordinates(lat=lat, lng=lon, returns=1)
    return result[0].zipcode
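One caveat with the function above: if the search returns nothing for a point (for example, coordinates far from any ZIP code, or a missing value), indexing the first result raises an error. Here is a minimal sketch of a more defensive variant. The name get_zipcode_safe is my own, and passing the search object in as an argument is just a choice that makes the function easier to test; it assumes the search object behaves like the one created earlier.

```python
import math
from typing import Any, Optional


def get_zipcode_safe(search: Any, lat: float, lon: float) -> Optional[str]:
    """Return the nearest ZIP code, or None when no lookup is possible."""
    # Missing coordinates (None or NaN) cannot be looked up.
    if lat is None or lon is None or math.isnan(lat) or math.isnan(lon):
        return None
    result = search.by_coordinates(lat=lat, lng=lon, returns=1)
    # An empty list means no ZIP code was found near this point.
    return result[0].zipcode if result else None
```

Rows that cannot be matched then end up with an empty value in the new column instead of stopping the whole run.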
# Load the coordinate columns from the dataframe
lat = df['Latitude']
lon = df['Longitude']
# Define latitude/longitude for the function
df = pd.DataFrame({'lat': lat, 'lon': lon})
# Add a new column with the generated ZIP code
df['zipcode'] = df.apply(lambda x: get_zipcode(x.lat, x.lon), axis=1)
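Applying the lookup row by row can get slow on large datasets, and incident data often repeats the same coordinates many times. The following is a rough sketch of one way to avoid repeated lookups by caching results. The helper make_cached_lookup and its precision parameter are hypothetical names of mine, not part of uszipcode, and rounding to four decimals (roughly 11 m of latitude) is an assumption you may want to tune.

```python
from functools import lru_cache
from typing import Callable, Optional

Lookup = Callable[[float, float], Optional[str]]


def make_cached_lookup(lookup: Lookup, precision: int = 4) -> Lookup:
    """Wrap a (lat, lon) -> ZIP function so repeated points are looked up once."""

    @lru_cache(maxsize=None)
    def _cached(lat_r: float, lon_r: float) -> Optional[str]:
        # Only reached the first time a rounded coordinate pair is seen.
        return lookup(lat_r, lon_r)

    def cached_lookup(lat: float, lon: float) -> Optional[str]:
        # Rounding lets near-identical points share one cache entry.
        return _cached(round(lat, precision), round(lon, precision))

    return cached_lookup
```

With the function defined earlier, you could wrap it as cached = make_cached_lookup(get_zipcode) and call cached(x.lat, x.lon) inside the apply instead.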
That’s all there is to it: a (rather) quick and simple method to reverse geocode coordinates into ZIP codes in Python without leveraging any paid service or API.