Mina Jambajantsan
6 min read · Oct 28, 2019

Maps using Folium, Matplotlib and GeoPandas

Image by NASA

Three ways to create great maps with Python libraries

Many aspiring data scientists are probably familiar with the King County house sales dataset. It contains records of 21,613 houses sold in King County between May 2014 and May 2015, with 21 different variables for each house, such as price, GPS coordinates, zip code, number of bedrooms, area of the living space, area of the lot, number of views, and so on.

Data visualisation is essential for getting a better overview of our data. It makes the data more accessible, understandable and usable (https://en.wikipedia.org/wiki/Data_visualization). Many kinds of tools are available, from simple bar charts and scatter plots to more complex ones such as 3D graphs.

In this post I will show maps made using Folium, Matplotlib and GeoPandas.

First, let’s import all our necessary libraries:

After loading the dataset and cleaning the necessary columns, it’s important to check the credibility of the data. We might not be able to verify columns like price, number of bathrooms or year of renovation, but we can, for example, check whether the waterfront houses really are on the waterfront. First, we will create a waterfront DataFrame, then use the Folium library to mark those houses.

Folium supports different tiles (https://python-graph-gallery.com/288-map-background-with-folium/); here I have used OpenStreetMap:

It looks like our data is valid, although to be thorough we might want to check the waterfront column more carefully (for null values).

If you would like to upgrade your map so that it shows the price and the square footage of the living space, you can add popups to the markers:

The Folium library also offers other great map types, such as cluster maps and heat maps. I will use an interactive cluster map to look at the locations of the most expensive 25% of houses sold that year. We could take a closer look at each marker, but this cluster map makes more sense as a density map for locating areas with a larger number of expensive houses. If we were looking for an expensive area to move to, this map could give us a hint about where to look. First, we need to determine the top 25%, create a DataFrame for these houses, and plot a new map:

Which looks like this (given in series as you zoom in):

On the first image we can see the total number of expensive houses (top 25%).

The second image shows two large areas with 2,096 and 2,715 houses; the blue line outlines the area covered by the given number of houses. The last two images are zoomed in. A quick note on this kind of interactive cluster map: loading more than 5,000 to 6,000 points will most likely slow down your computer, and the map may take longer to process.

Let’s move on to Matplotlib. A lot of useful information, from documentation to tutorials, is available on the official website: https://matplotlib.org/.

Since we have the year in which each sold house was built, I would like to use this library to show how the city of Seattle expanded over the years.

According to our dataset, the first house was built in 1900 and the last one in 2015. Before World War II we can see a big decline in construction, then steady growth, and then again a slight decline in the 1980s.
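A frequency plot of this kind can be sketched as a histogram of the construction years. The column name `yr_built` is an assumption, and the handful of years below is made up, so the shape of this plot will not match the real data:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up construction years; with the real data this would be the
# year-built column of the dataset (assumed name: `yr_built`).
yr_built = pd.Series(
    [1900, 1908, 1925, 1926, 1941, 1955, 1962, 1968, 1977, 1979,
     1984, 1990, 1999, 2004, 2006, 2014, 2015],
    name="yr_built",
)

fig, ax = plt.subplots(figsize=(10, 4))

# One bar per decade from 1900 to 2020.
counts, edges, _ = ax.hist(
    yr_built, bins=list(range(1900, 2021, 10)), edgecolor="white"
)
ax.set_xlabel("Year built")
ax.set_ylabel("Number of houses")
ax.set_title("Houses sold, by year of construction")
```

When running this as a script rather than in a notebook, a call to `plt.show()` at the end displays the figure.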

Based on this frequency plot, I have separated 3 big periods:

1. Houses built from 1900 to 1940

2. Houses built from 1940 to 1980

3. Houses built from 1980 to 2015

Here I have converted the 3 periods into 3 bins:
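A possible way to do this binning is `pandas.cut`. Note that the periods listed above share their boundary years (1940 and 1980); in this sketch a boundary year falls into the earlier period, which is an assumption, as are the column names `yr_built` and `period`:

```python
import pandas as pd

# Made-up construction years, including the boundary years.
df = pd.DataFrame({"yr_built": [1900, 1925, 1940, 1955, 1980, 1999, 2015]})

bin_edges = [1900, 1940, 1980, 2015]
bin_labels = ["1900-1940", "1940-1980", "1980-2015"]

# Intervals are closed on the right, so 1940 and 1980 go to the earlier
# period; include_lowest=True keeps the year 1900 in the first bin.
df["period"] = pd.cut(
    df["yr_built"], bins=bin_edges, labels=bin_labels, include_lowest=True
)

print(df)
```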

Then I created a map using Matplotlib. This code is for the general map, where all 3 periods are on one map:

Please note that I have set alpha=0.2 for the last two bins to make them a little more transparent, so that we can see the early period as well. And here is the map with the subplots:
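Both figures can be sketched with plain scatter plots of longitude against latitude. The colours, marker sizes, sample rows and the `period` column are assumptions; only the alpha=0.2 setting for the two later periods comes from the text:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; `period` is assumed to hold the three bins.
df = pd.DataFrame(
    {
        "lat": [47.60, 47.62, 47.55, 47.70, 47.45, 47.75],
        "long": [-122.33, -122.31, -122.28, -122.20, -122.10, -122.00],
        "period": ["1900-1940", "1900-1940", "1940-1980",
                   "1940-1980", "1980-2015", "1980-2015"],
    }
)

# (label, colour, alpha): the two later periods are semi-transparent
# so that the earliest period stays visible underneath them.
styles = [
    ("1900-1940", "tab:red", 1.0),
    ("1940-1980", "tab:green", 0.2),
    ("1980-2015", "tab:blue", 0.2),
]

# General map: all three periods on one set of axes.
fig_all, ax_all = plt.subplots(figsize=(8, 8))
for label, colour, alpha in styles:
    subset = df[df["period"] == label]
    ax_all.scatter(subset["long"], subset["lat"], s=5,
                   color=colour, alpha=alpha, label=label)
ax_all.set_xlabel("Longitude")
ax_all.set_ylabel("Latitude")
ax_all.legend(title="Year built")

# Subplots: one panel per period, sharing both axes for easy comparison.
fig_sub, axes = plt.subplots(1, 3, figsize=(18, 6), sharex=True, sharey=True)
for ax, (label, colour, _) in zip(axes, styles):
    subset = df[df["period"] == label]
    ax.scatter(subset["long"], subset["lat"], s=5, color=colour)
    ax.set_title(f"Built {label}")
```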

Plotting all 4 maps (1 general map with all periods + 3 separate periods), we can see how the city spread over the years:

Using GeoPandas we can make the general map even more detailed. Here is a great tutorial which helped me a lot: https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972.

For this map I have reversed the plotting order of the houses by year because, as we can see from the previous maps, the area of historic houses is much smaller than that of the later generations of houses. To use a basemap we need to install Descartes and GeoPandas, import them, and use shapefiles, which you will need to download:

Don’t forget to set the Coordinate Reference System (CRS) to epsg:4326 so that the house coordinates line up with our base map. See more about CRS here: https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf.

And here we can see the added geometry column with individual points for each row:
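A minimal sketch of building such a GeoDataFrame is shown below. It uses `geopandas.points_from_xy`, which may differ from the approach in the original listing, and the sample rows and column names are assumptions:

```python
import geopandas as gpd
import pandas as pd

# Made-up rows; the column names `lat` and `long` are assumed.
df = pd.DataFrame(
    {
        "yr_built": [1925, 1968, 2004],
        "lat": [47.60, 47.55, 47.45],
        "long": [-122.33, -122.28, -122.10],
    }
)

# Points take x = longitude and y = latitude; EPSG:4326 is plain
# latitude/longitude on the WGS 84 datum.
geo_df = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["long"], df["lat"]),
    crs="EPSG:4326",
)

print(geo_df.head())
```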

Last, but not least, let’s plot the “Houses built by years” map with a basemap:
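The sketch below shows the general idea: draw the basemap first, then plot the periods from newest to oldest so that the smaller historic area ends up on top. A simple rectangle stands in for the downloaded shapefile, and the colours, sample rows and `period` column are assumptions:

```python
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
from shapely.geometry import box

# Stand-in for the basemap. With real data this GeoDataFrame would come
# from gpd.read_file(...) on the downloaded shapefile.
base = gpd.GeoDataFrame(
    geometry=[box(-122.6, 47.1, -121.0, 47.8)], crs="EPSG:4326"
)

# Made-up houses with their period labels.
houses = pd.DataFrame(
    {
        "lat": [47.60, 47.55, 47.45],
        "long": [-122.33, -122.28, -122.10],
        "period": ["1900-1940", "1940-1980", "1980-2015"],
    }
)
geo_df = gpd.GeoDataFrame(
    houses,
    geometry=gpd.points_from_xy(houses["long"], houses["lat"]),
    crs="EPSG:4326",
)

fig, ax = plt.subplots(figsize=(10, 10))
base.plot(ax=ax, color="lightgrey", edgecolor="white")

# Newest period first, oldest last, so the historic houses are drawn on top.
for label, colour in [
    ("1980-2015", "tab:blue"),
    ("1940-1980", "tab:green"),
    ("1900-1940", "tab:red"),
]:
    geo_df[geo_df["period"] == label].plot(
        ax=ax, markersize=5, color=colour, label=label
    )

ax.legend(title="Year built")
ax.set_title("Houses built by years")
```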

And here is our last map, where we can see the growth of Seattle:

I hope you found this post useful and that it helps you work with these great Python libraries.

And if you are still with me, as a bonus, may I bring to your attention QGIS, a professional map-building application and a Free and Open Source Geographic Information System: https://www.qgis.org/en/site/

Among the many useful things QGIS can help you create, I was able to make a contour map from an elevation map of the remains of this beautiful 10th to 11th century Khitan town with 17th century Buddhist monastery buildings:

And the contour map:

Sources:

Data visualization:
https://en.wikipedia.org/wiki/Data_visualization

Map backgrounds with Folium:
https://python-graph-gallery.com/288-map-background-with-folium/

Matplotlib:
https://matplotlib.org/

GeoPandas tutorial:
https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972

Coordinate Reference Systems:
https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf

QGIS:
https://www.qgis.org/en/site/

Khitan town orthophoto:
Drone photos were taken by me and later “stitched” into an orthomap for the elevation map