Graphing Latitudes and Longitudes using Python

Ian Forrest
4 min read · Aug 8, 2019

A beginner’s introduction to mapping with the GeoPandas library

Apartment Rental Prices in Manhattan — made using GeoPandas

At first, converting latitudes and longitudes in a dataset to points on a map seems like a daunting task.

However, Python’s GeoPandas library exists for this exact purpose, amongst many others.

This article is a brief introduction to converting latitude and longitude features into point features, and then graphing those point features using GeoPandas!

Initial Data Import

Adding latitudes and longitudes to a map in Python involves two imports:
- import the data file containing latitude and longitude features
- import the map as a .shp file

import numpy as np
import pandas as pd
# Read New York City apartment rental listing data
df = pd.read_csv('../data/renthop-nyc.csv')
assert df.shape == (49352, 34)
# Remove the most extreme 1% prices,
# the most extreme .1% latitudes, &
# the most extreme .1% longitudes
df = df[(df['price'] >= np.percentile(df['price'], 0.5)) &
        (df['price'] <= np.percentile(df['price'], 99.5)) &
        (df['latitude'] >= np.percentile(df['latitude'], 0.05)) &
        (df['latitude'] < np.percentile(df['latitude'], 99.95)) &
        (df['longitude'] >= np.percentile(df['longitude'], 0.05)) &
        (df['longitude'] <= np.percentile(df['longitude'], 99.95))]

First, we will import apartment rental data for New York City for April, May, and June of 2016. The data comes from renthop.com, and the initial code comes courtesy of Ryan Herr:
- import numpy for numerical operations
- import pandas for dataframe manipulation
- read the data file using pandas' read_csv() function and assign it to the variable df
- the last step is specific to this dataset: it uses numpy's .percentile() function to drop price, latitude, and longitude outliers
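
The outlier filter in the last step can be tried on toy data. The sketch below is only an illustration, not the original author's code: trim_outliers is a hypothetical helper, and the toy frame holds made-up prices from 1 to 1000. It applies the same np.percentile() bounds to a single column.

```python
import numpy as np
import pandas as pd


def trim_outliers(frame, column, lower_pct, upper_pct):
    """Keep rows whose `column` value lies between two percentiles (inclusive)."""
    lower = np.percentile(frame[column], lower_pct)
    upper = np.percentile(frame[column], upper_pct)
    return frame[frame[column].between(lower, upper)]


# Made-up data: 1000 listings priced 1, 2, ..., 1000
toy = pd.DataFrame({"price": np.arange(1, 1001)})

# Drop the most extreme 1% of prices (0.5% from each end)
trimmed = trim_outliers(toy, "price", 0.5, 99.5)
print(len(toy), len(trimmed))
```

Note that the article's code builds one combined mask from percentiles of the full dataset. Calling a helper like this once per column in sequence would compute later percentiles on already-trimmed data, so the results could differ slightly.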

Calling df.head() will display the first few rows of our initial dataframe. Note the latitude, longitude and price columns of the rental listings; they will come into play later.

Next, we must import our map as a .shp file. Since we are graphing points representing NYC rental listings, it will probably be useful to use a NYC map! One can be found at https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm:

Downloading New York City map as .shp file

.shp files will often come in a zipped file containing three other file types:
- .dbf file
- .shx file
- .prj file
For the map import to work, all four files must be stored in the same directory. If pointing to a local directory, make sure all files are stored in the same folder. If using a notebook like Google Colab, make sure all files are uploaded:

.shp and related files in local directory
.shp and related files in Google Colab workbook
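
Before importing, it can help to confirm that all four files really sit next to each other. The sketch below is one possible check using only the standard library; missing_shapefile_parts and the file name boroughs.shp are hypothetical, and the demo deliberately leaves out the .prj file in a temporary folder.

```python
from pathlib import Path
import tempfile

# The four file types the article says must be stored together
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf", ".prj")


def missing_shapefile_parts(shp_path):
    """Return the suffixes of companion files not found beside shp_path."""
    base = Path(shp_path)
    return [
        suffix
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).exists()
    ]


# Demo with empty placeholder files; the .prj file is left out on purpose
with tempfile.TemporaryDirectory() as folder:
    for suffix in (".shp", ".shx", ".dbf"):
        (Path(folder) / f"boroughs{suffix}").touch()
    missing = missing_shapefile_parts(Path(folder) / "boroughs.shp")

print(missing)
```

An empty list means every companion file was found in the same folder as the .shp file.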

Once downloaded and organized, it’s time to import our .shp file! This is accomplished using the GeoPandas library, specifically its .read_file() function. We will assign the result to the variable street_map. We will also import shapely.geometry's Point and Polygon classes and matplotlib.pyplot here, which will be used later:

# import libraries
import geopandas as gpd
from shapely.geometry import Point, Polygon
import matplotlib.pyplot as plt
# import street map
street_map = gpd.read_file('/content/geo_export_1f88d1b8-51fd-42aa-84b0-22d7bad6bc6f.shp')

Creating GeoPandas DataFrame

# designate coordinate system (EPSG:4326 is WGS 84 latitude/longitude)
crs = 'EPSG:4326'
# zip x and y coordinates into single feature
geometry = [Point(xy) for xy in zip(df['longitude'], df['latitude'])]
# create GeoPandas dataframe
geo_df = gpd.GeoDataFrame(df,
                          crs=crs,
                          geometry=geometry)

Once the map and data files are stored, it’s time for the next steps:
- designate the coordinate reference system and assign it to the crs variable. For this example we will be using ‘EPSG:4326’, the standard WGS 84 latitude/longitude system. For more information visit http://geopandas.org/projections.html
- build the ‘geometry’ list, which will become the dataframe’s ‘geometry’ column. It contains the dataframe’s ‘longitude’ & ‘latitude’ columns zipped together (x first, then y) and converted with shapely.geometry's Point class
- create the GeoPandas dataframe! This is accomplished using GeoPandas’ GeoDataFrame() constructor, which takes the dataframe df, the coordinate reference system crs, and our new list of points geometry as inputs
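
As an alternative sketch of the same steps, GeoPandas also provides points_from_xy(), which builds the point geometries without an explicit loop. This is not the method used in this article, the three listings below are made-up values, and the crs string form assumes a reasonably recent GeoPandas version.

```python
import pandas as pd
import geopandas as gpd

# Made-up listings for illustration only
listings = pd.DataFrame({
    "price": [3000, 5465, 2850],
    "latitude": [40.7145, 40.7947, 40.7388],
    "longitude": [-73.9425, -73.9667, -74.0018],
})

# x is longitude and y is latitude, so longitude goes first
geo_listings = gpd.GeoDataFrame(
    listings,
    geometry=gpd.points_from_xy(listings["longitude"], listings["latitude"]),
    crs="EPSG:4326",
)
print(geo_listings[["price", "geometry"]])
```

Passing the coordinates in the wrong order is an easy mistake to make: the points would still be created, but they would land far away from New York on the map.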

geo_df.head() shows us our new GeoDataFrame with the ‘geometry’ column added:

Time to graph!

# create figure and axes, assign to subplot
fig, ax = plt.subplots(figsize=(15,15))
# add .shp mapfile to axes
street_map.plot(ax=ax, alpha=0.4, color='grey')
# add geodataframe to axes
# color the points according to the 'price' column
# add legend
# make datapoints transparent using alpha
# assign size of points using markersize
geo_df.plot(column='price', ax=ax, alpha=0.5, legend=True, markersize=10)
# add title to graph
plt.title('Rental Prices in NYC', fontsize=15, fontweight='bold')
# set latitude and longitude boundaries for map display
plt.xlim(-74.02, -73.925)
plt.ylim(40.7, 40.8)
# show map
plt.show()

Steps:
1.) create a figure and add axes to it using fig, ax = plt.subplots()
2.) add street_map to the axes. Remember, street_map contains our .shp file
3.) add geo_df to the axes.
column='price' tells GeoPandas to color the geometric points according to geo_df's ‘price’ column
4.) add a title, set the latitude and longitude limits, and show the graph!
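
To see what coloring points by a column amounts to, here is a rough plain-matplotlib sketch of step 3 on made-up coordinates and prices. It is an approximation for illustration, not what GeoPandas does internally, and it omits the borough map.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Made-up listings for illustration only
longitudes = [-73.9425, -73.9667, -74.0018, -73.9539]
latitudes = [40.7145, 40.7947, 40.7388, 40.7710]
prices = [3000, 5465, 2850, 3275]

fig, ax = plt.subplots(figsize=(6, 6))
# c=prices maps each price to a color; s and alpha mirror markersize and alpha
points = ax.scatter(longitudes, latitudes, c=prices, s=10, alpha=0.5)
# the colorbar plays the role of legend=True
fig.colorbar(points, ax=ax, label="price")
ax.set_title("Toy rental prices", fontsize=15, fontweight="bold")
# same display boundaries as in the article
ax.set_xlim(-74.02, -73.925)
ax.set_ylim(40.7, 40.8)
print(ax.get_xlim(), ax.get_ylim())
```

In an interactive session, calling plt.show() afterwards displays the figure.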

That’s all there is to it! Big thanks to Ryan Stewart and his article https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972 for the inspiration!
