How to Plot a Map in Python

Ben Geissel
Published in Analytics Vidhya
Dec 17, 2019

Using Geopandas and Geoplot

At my previous job, I had to build maps quite often. There was never a particularly easy way to do this, so I decided to put my Python skills to the test and create one. I ran into quite a few speed bumps along the way, but eventually produced the map I intended to make. I believe that with more practice, mapping in Python will become very easy. I use Geopandas and Geoplot here because they were the first mapping libraries I stumbled across; other Python libraries, such as Folium, can produce nicer maps.

Decide What To Map

First, you have to decide what you would like to map and at what geographic level the information is available. I am interested in applying data science to environmental issues and sustainability, so I decided to take a look at some National Oceanic and Atmospheric Administration (NOAA) county-level data for the United States. I specifically chose to look at maximum temperature by month for each county.

Second, you need to gather your data. From the NOAA climate division data website, I was able to pull the data I needed by clicking on the “nClimDiv” dataset link. After unzipping this data into a local folder, I was ready to move on.

Third, you need to gather a proper Shapefile to plot your data. A Shapefile is a geospatial file format that stores the outlines of geographic features (here, county boundaries) along with attributes for each one. I was able to retrieve a United States county-level Shapefile from the US Census TIGER/Line Shapefile database. Download the proper dataset and store it in the same local folder as the data you want to plot.

Map Prepwork

Shapefile

As mentioned above, I used the Python libraries Geopandas and Geoplot. I found that I also needed the Descartes library installed. To install these libraries, I had to run a few bash commands from my terminal.
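Assuming the packages carry their usual PyPI names, the commands would typically take the form `pip install geopandas geoplot descartes` (Geoplot is often installed through conda instead). As a rough check that the installation worked, the following sketch reports which of the three libraries Python cannot find. The helper name `missing_packages` is made up for illustration.

```python
from importlib.util import find_spec

# Import names assumed to match the library names used in this article.
REQUIRED = ("geopandas", "geoplot", "descartes")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that Python cannot locate for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Still need to install:", ", ".join(missing))
    else:
        print("All mapping libraries are importable.")
```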

Now you will be able to import these libraries as you would any other Python library (e.g. “import pandas as pd”). To load the Shapefile, you can use the Geopandas (gpd) read_file method.
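A minimal sketch of the loading step, assuming the unzipped Census files sit together in one local folder. A Shapefile is really several sidecar files (`.shp`, `.shx`, `.dbf`, and usually `.prj`) that must stay together, and `gpd.read_file` is pointed at the `.shp` file. The helper names here are hypothetical.

```python
from pathlib import Path


def find_shapefile(folder):
    """Return the path of the single .shp file inside `folder`."""
    matches = sorted(Path(folder).glob("*.shp"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one .shp file in {folder}, found {len(matches)}"
        )
    return matches[0]


def load_counties(folder):
    """Read the county Shapefile in `folder` into a GeoDataFrame."""
    # Imported here so the path helper above works without Geopandas installed.
    import geopandas as gpd

    return gpd.read_file(find_shapefile(folder))
```

Calling `load_counties("data")` would then return the counties as a GeoDataFrame, where `"data"` stands in for whatever folder holds the download.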

Data file

To load in the county-level data, I had a few more problems to solve. The file came from NOAA in a fixed-width format, in which every field occupies the same character positions on each line instead of being separated by a delimiter. It took a few steps to get the data into a workable format.
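As a sketch of what those steps might look like, the parser below slices each line by character position. The layout is an assumption that should be checked against the readme shipped with the dataset: a 2-character state code, a 3-character county code, a 2-character element code, a 4-digit year, then twelve 7-character monthly values. The field names are invented for illustration. `pandas.read_fwf` with a `widths` argument is another way to read such a file.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# (field name, start, end) as Python slice positions; assumed layout.
HEADER_FIELDS = [("state", 0, 2), ("county", 2, 5),
                 ("element", 5, 7), ("year", 7, 11)]
VALUE_START, VALUE_WIDTH = 11, 7


def parse_line(line):
    """Split one fixed-width line into a dict of codes and monthly values."""
    # Codes stay as text so that leading zeros such as "01" are preserved.
    record = {name: line[start:end] for name, start, end in HEADER_FIELDS}
    for i, month in enumerate(MONTHS):
        start = VALUE_START + i * VALUE_WIDTH
        record[month] = float(line[start:start + VALUE_WIDTH])
    return record


def parse_file(path):
    """Parse every non-blank line of a fixed-width file."""
    with open(path) as handle:
        return [parse_line(line) for line in handle if line.strip()]
```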

Additionally, there was quite a bit of data cleaning involved, so I’ll only give a short overview. I wanted to limit the Shapefile to the contiguous United States, so I needed to filter out the following state codes:

  • 02: Alaska
  • 15: Hawaii
  • 60: American Samoa
  • 66: Guam
  • 69: Mariana Islands
  • 72: Puerto Rico
  • 78: Virgin Islands
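The filter itself can be sketched as below, assuming the county Shapefile stores the two-digit state code as text in a column named `STATEFP` (the name used in Census TIGER/Line files; adjust it if your file differs). The function names are hypothetical.

```python
# State and territory codes listed above that fall outside the contiguous US.
EXCLUDED_STATEFP = {"02", "15", "60", "66", "69", "72", "78"}


def is_contiguous(statefp):
    """True when a two-digit state code is not in the excluded set."""
    return str(statefp).zfill(2) not in EXCLUDED_STATEFP


def keep_contiguous(counties, column="STATEFP"):
    """Return only the rows of a (Geo)DataFrame in the contiguous US."""
    mask = counties[column].map(is_contiguous)
    return counties[mask]
```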

Let’s take a first look at the Shapefile:
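A first look can be as simple as calling the GeoDataFrame's own `plot` method. This sketch assumes `counties` is the filtered GeoDataFrame and that Matplotlib is installed; the styling values are arbitrary choices.

```python
def plot_outlines(counties):
    """Draw county boundaries and return the Matplotlib axes."""
    ax = counties.plot(figsize=(12, 8), color="white",
                       edgecolor="black", linewidth=0.3)
    ax.set_axis_off()  # hide the longitude/latitude ticks
    return ax
```

In a script, follow the call with `matplotlib.pyplot.show()`; in a notebook the figure appears on its own.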

You can see all the counties in the contiguous United States.

Merging the Shapefile and Dataset

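The Shapefile and the Dataset need a column in common so that each data row can be matched to the shape it belongs to. I decided to match by FIPS codes, the five-digit county identifiers formed from a two-digit state code followed by a three-digit county code. To create the FIPS codes in the Shapefile: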
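One way to build that column, assuming the Shapefile carries the state and county codes as text in columns named `STATEFP` and `COUNTYFP` (the TIGER/Line names). The function names and the new `FIPS` column name are hypothetical.

```python
def make_fips(state_code, county_code):
    """Join a state and a county code into a zero-padded 5-character string."""
    return str(state_code).zfill(2) + str(county_code).zfill(3)


def add_fips(counties, state_col="STATEFP", county_col="COUNTYFP"):
    """Return a copy of the (Geo)DataFrame with a new text column `FIPS`."""
    counties = counties.copy()
    counties["FIPS"] = [
        make_fips(state, county)
        for state, county in zip(counties[state_col], counties[county_col])
    ]
    return counties
```

Keeping the result as text matters: stored as a number, a code such as `01001` would lose its leading zero and no longer match.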

To create the FIPS codes in the county data (Note: I filtered the data to only the year 2018 for simplicity):
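A sketch of the same step for the temperature records, assuming each record is a dictionary with text fields `state`, `county`, and `year` (hypothetical names). The state codes in the NOAA file may not be FIPS state codes, so the function accepts an optional translation table. The real table, if one is needed, should come from the dataset's documentation.

```python
def records_for_year(records, year, state_to_fips=None):
    """Keep records for `year` and add a 5-character `FIPS` key to each.

    `state_to_fips` optionally maps the file's state codes to FIPS state
    codes; records whose code is missing from the table are dropped.
    """
    selected = []
    for record in records:
        if str(record["year"]) != str(year):
            continue
        state = record["state"]
        if state_to_fips is not None:
            state = state_to_fips.get(state)
            if state is None:
                continue
        fips = str(state).zfill(2) + str(record["county"]).zfill(3)
        selected.append({**record, "FIPS": fips})  # original record untouched
    return selected
```

The resulting list of dictionaries can be passed to `pandas.DataFrame` to get a table ready for merging.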

Next, to merge the Shapefile and Dataset:
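A minimal sketch of the merge, assuming both tables already have a text column named `FIPS` (a hypothetical name). A left join keeps every county shape, including those with no temperature row, and the second helper lists which codes failed to match.

```python
def merge_on_fips(counties, temps):
    """Left-join temperature rows onto county shapes by the `FIPS` column."""
    # Keeping the GeoDataFrame on the left preserves the geometry column.
    return counties.merge(temps, on="FIPS", how="left")


def unmatched_fips(shape_fips, data_fips):
    """Return FIPS codes present in the shapes but absent from the data."""
    return sorted(set(shape_fips) - set(data_fips))
```

Calling `unmatched_fips(counties["FIPS"], temps["FIPS"])` before plotting gives a quick count of the counties that will show up blank.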

Finally, we get to map the data onto the Shapefile. I used the geoplot.choropleth method to shade each county by its maximum temperature on a color scale: the darker the red, the hotter the maximum temperature for that county. The map was created for August 2018.
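A sketch of such a call, assuming the merged GeoDataFrame holds the August values in a numeric column named `Aug` (a hypothetical name). The projection, line styling, and figure size are arbitrary choices. Rows without a value are dropped first so that only counties with data are drawn.

```python
def plot_max_temp(merged, month="Aug"):
    """Draw a red-scale county choropleth of the `month` column."""
    # Third-party imports kept local so the function can be defined
    # even on a machine where Geoplot is not installed yet.
    import geoplot
    import geoplot.crs as gcrs

    with_data = merged.dropna(subset=[month])
    return geoplot.choropleth(
        with_data,
        hue=month,                            # column that drives the color
        cmap="Reds",                          # darker red = hotter
        projection=gcrs.AlbersEqualArea(),
        edgecolor="white",
        linewidth=0.1,
        legend=False,
        figsize=(16, 10),
    )
```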

You can see we were able to plot the data on the county map of the US! I hope this demonstration helps!

Unfortunately, you can see that some counties have missing data. Additionally, I was able to generate a legend, but it showed up at about twice the size of the map itself, so I decided to remove it.
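One possible workaround for both issues, which is not the approach used for the map above, is to draw the choropleth with Geopandas's built-in `plot` method. This is a sketch under the assumption that the merged GeoDataFrame has a numeric column named `Aug` (a hypothetical name) and that the installed Geopandas version supports `legend_kwds` and `missing_kwds`. The first shrinks the colorbar; the second shades counties that have no value.

```python
def plot_with_small_legend(merged, column="Aug", shrink=0.5):
    """Choropleth with a reduced colorbar and grey fill for missing counties."""
    ax = merged.plot(
        column=column,
        cmap="Reds",
        legend=True,
        legend_kwds={"shrink": shrink, "label": "Maximum temperature"},
        missing_kwds={"color": "lightgrey"},  # counties with no data
        figsize=(12, 8),
    )
    ax.set_axis_off()
    return ax
```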

Originally published at http://github.com.
