Interactive choropleth maps with GeoPandas and Folium

Lukas Kriesch
Apr 15, 2024


Geospatial data occupies a unique niche in data science, characterised by its complex structures and the insights it can reveal. This tutorial shows how to process and visualise geographic data in Python with the geopandas and folium libraries.

Specifically, we will recreate the interactive map featured in my latest research paper, which visually explores the distribution and scale of bioeconomy firms across Germany. You can find an interactive version of the map here.

Prerequisites

Before beginning, ensure you have the following:

  • Python 3.8 or newer installed.
  • GeoPandas, Folium, Pandas, and Requests installed, plus jenkspy, which Folium needs for the use_jenks=True option used in the choropleth layers. You can install these with pip:
pip install geopandas folium pandas requests jenkspy

Data Description

We will use two main datasets:

  1. Bioeconomy Firms Data: This dataset contains information about bioeconomy-related firms in Germany, including their location (NUTS3 regions), count, and classification into high-tech and non-high-tech. Find a detailed data description here.
  2. Geographical Data (Shapefile): Provided by Eurostat, this includes the geographical boundaries of NUTS3 regions in Germany.

Setup and Data Loading

Our journey begins with the essential imports. We use the dataset we released with the article.

import geopandas as gpd
import pandas as pd
from shapely import wkt
import folium
from folium.features import GeoJsonTooltip
from folium.plugins import FloatImage
import base64
import requests
from zipfile import ZipFile
from io import BytesIO

# Load the bioeconomy firms dataset released with the article
firms_url = "https://osf.io/download/xu4e7/"
df = pd.read_csv(firms_url)

# URL to the NUTS 2021 GIS data in Shapefile format provided by Eurostat
nuts_url = "https://gisco-services.ec.europa.eu/distribution/v2/nuts/download/ref-nuts-2021-01m.shp.zip"

# Download the zip archive and stop early if the request failed
response = requests.get(nuts_url)
response.raise_for_status()
zip_file = ZipFile(BytesIO(response.content))

# Extract the archive contents into a local folder
zip_file.extractall("nuts_shapefiles")

# Load the NUTS level 3 shapefile with GeoPandas. The archive above is the 1:1 million
# ("01M") release, so the file name uses 01M rather than 20M. Check the contents of
# nuts_shapefiles/ and adjust the path if they differ (for example, if the shapefiles
# are packed as nested .shp.zip archives).
kshape = gpd.read_file("nuts_shapefiles/NUTS_RG_01M_2021_4326_LEVL_3.shp")

# Filter for Germany's NUTS3 regions
kshape = kshape[(kshape["LEVL_CODE"] == 3) & (kshape["CNTR_CODE"] == "DE")]
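
Since the file name may need adjusting to the contents of the archive, it can help to list the matching members before loading anything. The following is a minimal sketch using only the standard library; find_members is a hypothetical helper, and the member names in the demo archive are made up for illustration rather than taken from the Eurostat download.

```python
from io import BytesIO
from typing import List
from zipfile import ZipFile


def find_members(archive: ZipFile, *keywords: str) -> List[str]:
    """Return the archive member names that contain every keyword."""
    return [
        name
        for name in archive.namelist()
        if all(keyword in name for keyword in keywords)
    ]


# Small in-memory archive with made-up member names, standing in for the download.
buffer = BytesIO()
with ZipFile(buffer, "w") as demo:
    demo.writestr("NUTS_RG_01M_2021_4326_LEVL_3.shp", b"")
    demo.writestr("NUTS_RG_01M_2021_3035_LEVL_3.shp", b"")
    demo.writestr("NUTS_RG_01M_2021_4326_LEVL_0.shp", b"")

# Keep only members for EPSG:4326 at NUTS level 3.
with ZipFile(buffer) as demo:
    matches = find_members(demo, "4326", "LEVL_3")

print(matches)
```

With the real download, the same call could be made on zip_file, for example find_members(zip_file, "4326", "LEVL_3").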

Next, we merge our main dataset with its corresponding geographical shapes.

# Merge the datasets and turn the result into a GeoDataFrame
germany_df = pd.merge(df, kshape, left_on="NUTS", right_on="NUTS_ID", how="left")
germany_df = gpd.GeoDataFrame(germany_df, geometry="geometry", crs=kshape.crs)
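
Because the merge uses how="left", any row in the firms data whose NUTS code has no counterpart in the shapefile is kept but ends up without a geometry. A quick check of the join keys can reveal such rows. This is a minimal sketch with toy codes; unmatched_keys is a hypothetical helper, and "DEXXX" is a made-up code used to show a mismatch.

```python
def unmatched_keys(left_keys, right_keys):
    """Return the sorted keys present on the left side but missing on the right."""
    return sorted(set(left_keys) - set(right_keys))


# Toy region codes for illustration. With the tutorial's data one would pass
# df["NUTS"] and kshape["NUTS_ID"] instead.
firm_codes = ["DE111", "DE112", "DEXXX"]
shape_codes = ["DE111", "DE112", "DE113"]

missing = unmatched_keys(firm_codes, shape_codes)
print(missing)
```

An empty list means every region in the firms data found a matching shape.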

Preparing Data for Visualization

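We average the centroids of the regions to get the starting point for initializing our map. GeoPandas may warn that centroids computed in a geographic coordinate system are imprecise; for choosing a map center, the result is accurate enough:

# Average the region centroids to get the map center (x = longitude, y = latitude)
x_map = germany_df.centroid.x.mean()
y_map = germany_df.centroid.y.mean()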
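
An alternative way to pick the starting point, sketched here as an assumption rather than the method used for the paper, is the midpoint of the bounding box. GeoPandas exposes the bounding box as germany_df.total_bounds in the order (minx, miny, maxx, maxy). The helper center_from_bounds is hypothetical, and the bounds below are rough illustrative values, not taken from the dataset.

```python
def center_from_bounds(bounds):
    """Return (latitude, longitude) of the bounding-box midpoint.

    `bounds` is (minx, miny, maxx, maxy), i.e. x = longitude and y = latitude,
    the layout returned by GeoDataFrame.total_bounds.
    """
    minx, miny, maxx, maxy = bounds
    return ((miny + maxy) / 2, (minx + maxx) / 2)


# Rough, illustrative bounding box for Germany in degrees.
approx_bounds = (6.0, 47.0, 15.0, 55.0)
lat, lon = center_from_bounds(approx_bounds)
print(lat, lon)
```

The result is returned as (latitude, longitude) because that is the order Folium expects for the map location.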

The percentages for bioeconomy shares are formatted to enhance readability:

# Compute percentages
germany_df['share_bioeconomy_percentage'] = germany_df['share_bioeconomy'].apply(lambda x: f'{x*100:.2f}%')
germany_df['share_hightech_bioeconomy_percentage'] = germany_df['share_hightech_bioeconomy'].apply(lambda x: f'{x*100:.2f}%')
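
If a share is missing, the f-string above renders it as "nan%". Should that matter for your data, a small formatter can substitute a placeholder. This is a minimal sketch; format_share and the "n/a" placeholder are assumptions for illustration, not part of the original workflow.

```python
import math


def format_share(value, decimals=2, missing="n/a"):
    """Format a fraction such as 0.1234 as '12.34%'; return `missing` for None or NaN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return missing
    return f"{value * 100:.{decimals}f}%"


print(format_share(0.1234))
print(format_share(float("nan")))
```

It could be applied the same way as the lambda, for example germany_df['share_bioeconomy'].apply(format_share).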

Map Visualization with Folium

We use Folium to create an interactive map. We start by setting up the map centered around the calculated centroids and define a tile layer for the background of the map.

# Initialize the map object
mymap = folium.Map(location=[y_map, x_map], zoom_start=6.3, tiles=None)
folium.TileLayer('CartoDB positron', control=False).add_to(mymap)

Next, we create a GeoJsonTooltip for interactive data display, providing more context when hovering over regions. Let’s define these for each layer.

# Define the tooltips
tooltip_share_bioeconomy = GeoJsonTooltip(
    fields=['NUTS_NAME', 'share_bioeconomy_percentage', 'bioeconomy_firms', 'total_firms'],
    aliases=['District/independent city: ', 'Share bioeconomy firms: ', 'Number of bioeconomy firms:', 'Total firms identified (count):'],
    localize=True,
    sticky=False,
    smooth_factor=0,
    labels=True,
    style="""
        background-color: #F0EFEF;
        border: 2px solid black;
        border-radius: 3px;
        box-shadow: 3px;
        font-size: 12px;
    """,
    max_width=750,
)

tooltip_hightech_share_among_bioeconomy = GeoJsonTooltip(
    fields=['NUTS_NAME', 'hightech_bioeconomy_firms', 'share_hightech_bioeconomy_percentage', 'bioeconomy_firms', 'total_firms'],
    aliases=['District/independent city: ', 'Number of high-tech bioeconomy firms:', 'Share of high-tech firms in bioeconomy firms:', 'Number of bioeconomy firms:', 'Total firms identified (count):'],
    localize=True,
    sticky=False,
    smooth_factor=0,
    labels=True,
    style="""
        background-color: #F0EFEF;
        border: 2px solid black;
        border-radius: 3px;
        box-shadow: 3px;
        font-size: 12px;
    """,
    max_width=750,
)
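
The two tooltips differ only in their fields and aliases, so the shared settings could be collected in one place. The sketch below mirrors the settings used above; tooltip_kwargs and TOOLTIP_STYLE are hypothetical names, and the length check is an extra safeguard added for illustration.

```python
# Style string copied from the tooltip definitions in this tutorial.
TOOLTIP_STYLE = """
    background-color: #F0EFEF;
    border: 2px solid black;
    border-radius: 3px;
    box-shadow: 3px;
    font-size: 12px;
"""


def tooltip_kwargs(fields, aliases):
    """Build the keyword arguments shared by both tooltips.

    Raises ValueError if fields and aliases differ in length.
    """
    if len(fields) != len(aliases):
        raise ValueError("fields and aliases must have the same length")
    return {
        "fields": list(fields),
        "aliases": list(aliases),
        "localize": True,
        "sticky": False,
        "smooth_factor": 0,
        "labels": True,
        "style": TOOLTIP_STYLE,
        "max_width": 750,
    }


kwargs = tooltip_kwargs(
    ["NUTS_NAME", "share_bioeconomy_percentage"],
    ["District/independent city: ", "Share bioeconomy firms: "],
)
# With Folium installed, the tooltip would then be created as:
# tooltip = GeoJsonTooltip(**kwargs)
```

Each tooltip then reduces to one call with its own fields and aliases, which keeps the styling consistent if it changes later.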

Next, we add choropleth layers to represent different data dimensions, namely the share of bioeconomy firms and the share of high-tech firms within the bioeconomy:

# Create the choropleth layers and add them to the map
choropleth_share_bioeconomy = folium.Choropleth(
    geo_data=germany_df, name='Share bioeconomy firms', data=germany_df,
    columns=['NUTS', 'share_bioeconomy'], key_on='feature.properties.NUTS',
    fill_color='Greens', bins=5, fill_opacity=0.6, line_opacity=0.2,
    smooth_factor=0, use_jenks=True, overlay=False
).add_to(mymap)

choropleth_hightech_share_among_bioeconomy = folium.Choropleth(
    geo_data=germany_df, name='Share of high-tech firms in bioeconomy firms', data=germany_df,
    columns=['NUTS', 'share_hightech_bioeconomy'], key_on='feature.properties.NUTS',
    fill_color='Blues', bins=5, fill_opacity=0.6, line_opacity=0.2,
    smooth_factor=0, use_jenks=True, overlay=False, show=False
).add_to(mymap)

Now we are ready to add the tooltips to the respective choropleth objects.

# Attach each tooltip to the GeoJson object of its choropleth layer
choropleth_share_bioeconomy.geojson.add_child(tooltip_share_bioeconomy)
choropleth_hightech_share_among_bioeconomy.geojson.add_child(tooltip_hightech_share_among_bioeconomy)
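
The layers above let Folium compute natural breaks through use_jenks=True. The bins parameter of folium.Choropleth also accepts an explicit sequence of bin edges, so a different classification could be supplied by hand. The sketch below computes quantile-based edges with the standard library as one possible alternative; it is not the classification used for the published map, quantile_bin_edges is a hypothetical helper, and the shares are made-up values.

```python
import statistics


def quantile_bin_edges(values, n_bins=5):
    """Return n_bins + 1 edges: the minimum, the inner quantile cut points, the maximum."""
    clean = sorted(v for v in values if v == v)  # v == v is False for NaN
    inner = statistics.quantiles(clean, n=n_bins, method="inclusive")
    return [clean[0], *inner, clean[-1]]


# Made-up shares for illustration only.
shares = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11]
edges = quantile_bin_edges(shares)
print(edges)
```

To try this, one would drop use_jenks=True and pass bins=edges. The edges must span the full range of the data, which is why the minimum and maximum are included; heavily skewed data can produce duplicate edges that would need to be removed first.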

Now, we add a LayerControl to switch between both layers.

# Add LayerControl
folium.LayerControl(collapsed=False).add_to(mymap)

Finally, the map is saved as an HTML file, making it easy to share or embed elsewhere:

# Save map (replace the placeholder with your own output path)
mymap.save("add_your_path_here.html")
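
Saving fails if the target folder does not exist, so it can be convenient to create it first. This is a minimal sketch using pathlib; prepare_output_path, the folder name "maps", and the file name "bioeconomy_map.html" are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def prepare_output_path(directory, filename="bioeconomy_map.html"):
    """Create `directory` if needed and return the full path to the HTML file."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


# A temporary directory stands in for a real output location.
with tempfile.TemporaryDirectory() as tmp:
    out_path = prepare_output_path(Path(tmp) / "maps")
    directory_exists = out_path.parent.is_dir()

# With the map from this tutorial, the file would then be written with:
# mymap.save(str(out_path))
```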

Conclusion

This example demonstrates the synergy between geospatial analysis and visualization tools in Python. By integrating geopandas and folium, we are able to create rich, interactive maps that can significantly enhance the presentation and understanding of complex datasets.

Lukas Kriesch

PostDoc in economic geography | NLP | Web mining | Spatial Data Science