How to bring geospatial data on a map with Python — Reloaded

Or the incredible search for the jack-of-all-trades visualization framework.

Andreas Hopfgartner
The Startup
4 min read · Dec 18, 2019

Photo by Марьян Блан | @marjanblan on Unsplash

In my last blog post on mapping sensor data, I used the Python package Folium to visualize sensor data, such as position or environmental readings, on a map. A lot has changed for me since then…

I work with big data most of the time, and a static visualization (where data is pre-aggregated and the results are fed into a visualization system) is not enough for me: I need to explore the data interactively. All the data, ideally without constantly recalculating aggregations and redoing the mapping, which can be slow and tedious. So I was constantly on the lookout for new tools, as in my experience Folium is only suitable for, let’s say, a few hundred points.

Additionally, I have various use cases. When analyzing movement data of transport vessels, passengers or people, I need to plot a lot of data, sometimes as a graph visualization. Imagine moving things interacting with their environment, which makes map visualization complex. When displaying the real-time state of a sensor or telemetry data, I need to be able to dynamically update a plot from streaming data. My list of requirements is endless.

And to remain modest :-), I need a tool for rapidly prototyping impressive dashboards for non-technical users (to bridge the data science gap). When I present dashboards to my customers, they always come up with an extra idea (usually the one that is not available as a method in the visualization framework I’m using). So I added “custom callback functions” and “customizable with JavaScript” to my wishlist.

For visualization I prefer a declarative way of defining things. The idea is to define what my visualization should look like independently of the data itself, which makes things more portable. I have to admit that I personally hate writing plotting functions with Matplotlib.

A few months ago I started comparing Bokeh and Plotly for my use cases. Plotly looked promising for a few of my applications. But scatter maps were only possible using Mapbox, which meant generating a Mapbox token. Customizing seemed to be a hard task.

Bokeh looked nice, too, had all the functionality I was looking for, but had a lot of code overhead.

In the end I came across HoloViews and GeoViews, which are part of the PyViz family. PyViz.org is a project that tries to consolidate the plethora of visualization tools in Python and offers a framework with low-, medium- and high-level APIs for everything from designing (interactive) visualizations to building beautiful interactive web applications to explore data.

First of all, this framework is more or less “tool agnostic”. It lets you define a plot declaratively and globally specify the extension (the plotting backend) that should be used (at the time of writing, HoloViews supports Bokeh, Plotly and Matplotlib as extensions). I’m thrilled by how little code you need to create a nice graphic.

The lowest-level API is Bokeh. The mid-level API is HoloViews (with GeoViews adding map functionality). The high-level API is Panel, which can be used to build attractive interactive dashboards (in the background a Bokeh server is instantiated, which creates the data binding between Python and the JavaScript frontend code).

For big data, Datashader is a great, easy-to-use extension that makes it incredibly fast to plot billions of points on a map. Basically, you just wrap the datashade() function around a HoloViews visualization object.

I assume you know the pandas .plot() method. hvPlot extends pandas with HoloViews and GeoViews functionality through the .hvplot() method, with the same look and feel.

When I started with Folium, one of my use cases was to display real-time sensor data on a map. I had roughly 100k data points to plot on a given day, popping up and disappearing in a Kafka stream. Of course this was not possible with Folium. With the newly discovered framework I resumed this task and was almost finished after an hour of work.

What really amazed me is that I could take the code as it is and wrap it inside a Flask app. My process from ideation to having my idea on screen sped up incredibly. In all of my geospatial analytics projects, visual exploration is the first and the last step of building machine learning models.

To summarize, it certainly makes sense to use tools like Grafana (telemetry data), or Tableau and Power BI (business intelligence). But these are different approaches to different questions. For everything from the ideation process to beautiful dashboards that stay completely in the Python ecosystem, PyViz is my way to go. For more abstract pipelines I still keep in mind the option of using a BI tool (and the additional work of doing ETL into an application database).

Working as Cloud Solution Architect for Data & AI and also in the realm of Internet of Things for Microsoft in Germany.