Why you should be using GeoPandas to Visualize Data on Maps (aka Geo-Visualization)

Nick Minaie, PhD
Jun 19 · 6 min read

Data visualization is one of the most important steps in presenting data in a project, whether it is to your team, your managers, or your clients or stakeholders. Data visualization is also a critical element of Exploratory Data Analysis (EDA), and can save you a lot of time in a project.

According to SAS:

“Data visualization is the presentation of data in a pictorial or graphical format. It enables decision makers to see analytics presented visually, so they can grasp difficult concepts or identify new patterns. With interactive visualization, you can take the concept a step further by using technology to drill down into charts and graphs for more detail, interactively changing what data you see and how it’s processed.”

Visualizing NCAA College Football Fans on the US Map (https://www.nytimes.com/interactive/2014/10/03/upshot/ncaa-football-fan-map.html)

But why does data visualization matter?

In an article on the Search Engine Journal, Adam Heitzman says, “We are an inherently visual world, where images speak louder than words. Data visualization is especially important when it comes to big data and data analyzation projects.”

With the massive amount of data that is generated every day, data visualization has become more important than ever for telling the story and gaining insight into the data, as opposed to looking over pages and pages of tabular data, a task that can be daunting in many cases and impossible in some. In the same article, Adam Heitzman mentions, “The results from complex algorithms are much easier to understand in a visual format as opposed to lines and lines of text and numbers.”

GeoPandas for Data Visualization: Geo-Visualization

The GeoPandas documentation is a great resource for learning about this powerful library. The site mentions:

“GeoPandas is an open source project to make working with geospatial data in python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. Geopandas further depends on fiona for file access and descartes and matplotlib for plotting… GeoPandas enables you to easily do operations in python that would otherwise require a spatial database such as PostGIS.”

How does GeoPandas work in Jupyter Notebook?

First, you need to install GeoPandas in your Jupyter Notebook using :

The installation instructions warn that geopandas must be installed along with all required dependencies. To ensure better functionality, especially for plotting, other dependencies are suggested, as below:

Required dependencies:

Further, optional dependencies are:

  • rtree (optional; spatial index to improve performance and required for overlay operations; interface to libspatialindex)
  • psycopg2 (optional; for PostGIS connection)
  • geopy (optional; for geocoding)
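Before going further, it can help to see which of these packages are already importable in your environment. The sketch below is one way to check, using only the standard library. The package names come from the quote and lists above; splitting them into "core" and "optional" groups is an assumption made for illustration, and check_packages is a hypothetical helper.

```python
from importlib.util import find_spec

# Names taken from the documentation quote and the lists above.
# The grouping into "core" and "optional" is an assumption for illustration.
CORE = ["pandas", "shapely", "fiona", "matplotlib", "descartes"]
OPTIONAL = ["rtree", "psycopg2", "geopy"]


def check_packages(names):
    """Return {name: True/False} depending on whether each package is importable."""
    return {name: find_spec(name) is not None for name in names}


for group, names in (("core", CORE), ("optional", OPTIONAL)):
    for name, found in check_packages(names).items():
        status = "installed" if found else "missing"
        print(f"{group:10s}{name:12s}{status}")
```

Anything reported as missing can then be installed with whichever package manager you used for GeoPandas itself.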

For plotting, these additional packages may be used:

The example outlined in this article works with a shape file ( format ) that can be downloaded from . The code below refers to the relative path for the file assuming it is in the working directory. It should be noted that you will need all related files in the same directory for the file to load properly. These files are listed below.

Files associated with .shx shape file
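A quick way to confirm that the companion files sit next to the shape file is sketched below. It assumes the usual shapefile layout, where the .shx (index) and .dbf (attribute table) files must accompany the .shp file; the file name states.shp and the helper missing_sidecars are hypothetical.

```python
from pathlib import Path

# .shx (index) and .dbf (attributes) must accompany the .shp file.
# A .prj (projection) file is optional but usually shipped alongside.
REQUIRED_SIDECARS = (".shx", ".dbf")


def missing_sidecars(shp_path):
    """List required companion files that are absent next to shp_path."""
    shp = Path(shp_path)
    return [
        shp.with_suffix(ext).name
        for ext in REQUIRED_SIDECARS
        if not shp.with_suffix(ext).exists()
    ]


# "states.shp" is a hypothetical file name in the working directory
print(missing_sidecars("states.shp"))
```

An empty list means the required companions are in place. Note that this simple check is case-sensitive, so a file named with an upper-case extension would be reported as missing.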

This should import the data and store it in a GeoPandas DataFrame. Using the head() method on it, we can see the columns and the first five rows.

We should first check the data types in the DataFrame.
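Here is a minimal sketch of those two inspection steps. To keep it runnable without the shape file, it builds a tiny stand-in GeoDataFrame from two rectangles; with a real file, a call to gpd.read_file with the path to the .shp file would take the place of the constructor. The column names and values are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Stand-in for the loaded shape file: two rectangles instead of real borders.
# Column names and values are hypothetical.
map_df = gpd.GeoDataFrame(
    {
        "state_name": ["State A", "State B"],
        "draw_seq": [1, 2],
        "state_abbr": ["AA", "BB"],
    },
    geometry=[box(0, 0, 2, 1), box(2, 0, 3, 1)],
)

print(map_df.head())   # up to five rows (only two exist here)
print(map_df.dtypes)   # one dtype per column, including the geometry column
```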

The first column is the name of the state, and the second column is the sequence of drawing on the plot. For example, the plot will draw Hawaii first and then Washington, Montana, and so on.

The next column holds the State Federal Information Processing Standard (FIPS) code for each state. According to Wikipedia,

“FIPS Codes were numeric and two-letter alphabetic codes defined in U.S. Federal Information Processing Standard Publication (“FIPS PUB”) 5–2 to identify U.S. states and certain other associated areas.”

The next column includes the nine sub-regions in the US. You can get the list as below. It is followed by a column that includes the two-character abbreviations for the states.
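Listing the distinct sub-regions is a one-liner on the column. The sketch below uses a plain Pandas DataFrame with a handful of sample rows so it runs on its own; the column names sub_region and state_abbr are hypothetical stand-ins for the ones in the shape file.

```python
import pandas as pd

# A few sample rows; column names are hypothetical
df = pd.DataFrame(
    {
        "state_abbr": ["WA", "OR", "MT", "ME"],
        "sub_region": ["Pacific", "Pacific", "Mountain", "New England"],
    }
)

# unique() drops the repeats; sorted() just makes the output easier to read
regions = sorted(df["sub_region"].unique())
print(regions)
```

The same call works unchanged on a GeoPandas DataFrame, since it inherits its column handling from Pandas.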

But what sets GeoPandas apart from Pandas is the geometry column, which includes the details of the state (or any city, county, region, country, etc.) border. This column is specific to GeoPandas and would not work in Pandas. The illustration below shows how this works:

Plotting Washington State using ‘geometry’ Column in the Map GeoPandas DataFrame
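To get a feel for what the geometry column holds, here is a small sketch with a made-up four-corner outline instead of a real state border. Each entry is a shapely shape, and GeoPandas exposes spatial properties on the whole column at once.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A made-up outline; real state borders have many more vertices
outline = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])
shapes = gpd.GeoSeries([outline])

print(shapes.geom_type[0])  # kind of shape stored in the row
print(shapes.area[0])       # area in squared coordinate units
print(shapes.bounds)        # bounding box (minx, miny, maxx, maxy) per row
```

A plain Pandas column could store these objects, but it would not offer properties such as area or bounds, or know how to draw them.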

Next is defining our data and which states we want to plot (or not). We can do this using any of the columns. For example, one of the columns is used below to filter the data:
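As a rough sketch of that filtering step, the snippet below drops one row by its abbreviation. The stand-in data, the column name state_abbr, and the excluded value are all hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Stand-in map data: three rectangles with hypothetical abbreviations
map_df = gpd.GeoDataFrame(
    {"state_abbr": ["AA", "BB", "CC"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

# Keep every row whose abbreviation is NOT in the exclusion list
exclude = ["CC"]
subset = map_df[~map_df["state_abbr"].isin(exclude)]

print(subset["state_abbr"].tolist())
```

Removing the ~ flips the logic, so only the listed states are kept.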

Once we have the data, we can plot the map:
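A minimal plotting sketch, assuming matplotlib is installed, could look like this. The rectangles and the draw_seq column are hypothetical stand-ins, and the color map and styling are only one possible choice.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Stand-in map data with a hypothetical numeric column to color by
map_df = gpd.GeoDataFrame(
    {"draw_seq": [1, 2, 3]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(figsize=(8, 4))
# column= picks the values that drive the fill color of each shape
map_df.plot(column="draw_seq", cmap="Blues", linewidth=0.8, edgecolor="0.8", ax=ax)
ax.set_axis_off()  # hide the coordinate axes, which rarely matter on a map
ax.set_title("Shapes colored by draw_seq")
```

In a notebook the figure is displayed at the end of the cell; in a script, calling fig.savefig with a file name would write it to disk.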

And here is the map:

You can customize the map by adding, for example, a color bar:
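One way to get a color bar, sketched here under the assumption that the plotted column is numeric, is to pass legend=True to the plot call. This may differ from how the original figure was produced; the data and the draw_seq column are hypothetical.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Stand-in map data with a hypothetical numeric column
map_df = gpd.GeoDataFrame(
    {"draw_seq": [1, 2, 3]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(figsize=(8, 4))
# For a numeric column, legend=True draws a color bar next to the map
map_df.plot(
    column="draw_seq",
    cmap="Blues",
    legend=True,
    legend_kwds={"label": "draw_seq"},  # text shown along the color bar
    ax=ax,
)
ax.set_axis_off()
```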

More customization possible? Of course! You can use your creativity and add state names to the map, or plot other information.
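For instance, adding labels could be sketched as follows. It places each label at a point guaranteed to lie inside its shape; the data and the state_abbr column are hypothetical.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Stand-in map data with hypothetical abbreviations
map_df = gpd.GeoDataFrame(
    {"state_abbr": ["AA", "BB"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

fig, ax = plt.subplots(figsize=(8, 4))
map_df.plot(color="lightsteelblue", edgecolor="white", ax=ax)

# representative_point() returns one point inside each shape
points = map_df.geometry.representative_point()
for label, point in zip(map_df["state_abbr"], points):
    ax.annotate(label, xy=(point.x, point.y), ha="center", va="center")

ax.set_axis_off()
```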

How can this map be used to visualize some data, say ‘US Median Household Income’?

First you need to import your data into a DataFrame, then clean it and format the DataFrame so it has a column that matches one in the map DataFrame, plus the other columns you want to visualize. Once you have that, you can merge that DataFrame with your map GeoPandas DataFrame using the shared column.

Note: You should always add your Pandas DataFrame to your GeoPandas DataFrame, and not the other way around.
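The sketch below shows why the direction of the merge matters. Both merges join on a shared key, but only the one called on the GeoPandas DataFrame keeps the result as a GeoPandas DataFrame. The column names and the numbers are placeholders, not real income figures.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Stand-in map data and a plain table that share a hypothetical key column
map_df = gpd.GeoDataFrame(
    {"state_abbr": ["AA", "BB"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
data_df = pd.DataFrame({"state_abbr": ["AA", "BB"], "value": [10.0, 20.0]})

# Recommended direction: start from the GeoPandas DataFrame
merged = map_df.merge(data_df, on="state_abbr", how="left")

# The other direction: start from the Pandas DataFrame
reversed_merge = data_df.merge(map_df, on="state_abbr", how="left")

print(type(merged).__name__)
print(type(reversed_merge).__name__)
```

Using how="left" keeps every shape on the map even if the data table has no matching row, in which case the added columns hold missing values for that shape.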

Then you can simply replace the plotted column in the code above with the column you want to visualize. For example, if we added 2018 SAT participation data to our map_df:

Visualizing ‘2018 SAT Participation’ Data on the US Map

Plotting US County-Level Maps

Using the same technique, we can plot any other map, as long as we have the shape file. For example, the shape file for the US counties can be downloaded from the www.data.gov website.

Have Fun, Mapping!

There is really no limit to what you can do with GeoPandas coupled with other Python libraries. Have fun!
