DATA STORIES | 3D MAP VISUALIZATION | KNIME ANALYTICS PLATFORM

KNIME & GPX: how to build a 3D map with mapbox.com

Using no-code tools to process massive geolocation data

Diego Romero

Published in Low Code for Data Science

8 min read · Oct 18, 2021

As first published on Restless Engineer

If you have read my previous article, you know that I love visualizing and working with geolocated data. What could be the reason? Maybe my natural predisposition to travel and explore. Or maybe data of this type is simply easier to obtain? I have no idea.

Anyway, lately I have been exploring how to create a 3D map to visualize information from routes in GPX format. More and more geopositioned information is available, and being able to visualize it quickly and easily, embedded in a web page, is a great way to analyze it. But how do you approach the whole process without programming?

The initial premise is simple: geolocated information that can be processed with no-code tools, and an easy way to visualize it that can be integrated into a WordPress-based website.

Challenge

As you may know, I have a special interest in the Camino de Santiago. Combining engineering and the Camino gives me the chance to help my fellow pilgrims by showing them data that would otherwise be difficult to obtain, or simply by letting them look at it in a different way.

Objective: flying over the peaks of the Pyrenees along the first stages of the Camino de Santiago through Aragon.

In this case, the challenge, based on the previous premise, was as follows:

Using public information from the CNIG about all available routes of the Camino de Santiago in the Iberian Peninsula, visualize them embedded in an article of my web page.
Add value to the GPS positions. I was almost certain that the files would have some elevation inconsistencies, and elevation is one of the most interesting features for pilgrims who have to carry a backpack.
Use no-code tools and share the result so that anyone can replicate and improve it. I am not an expert programmer and this is a personal project. I am more interested in the technical viability than in the perfection of the result.

Starting data

As I said before, the starting data is all the GPX files available on the CNIG portal (Spanish National Center of Geographic Information). This information has been provided by the Federación Española de Asociaciones de Amigos del Camino de Santiago (FEAACS). Data is available in both GPX and KML formats. I chose GPX because I found it easier to extract the data to an intermediate JSON file.

After downloading each of the 1,015 individual files corresponding to all the stages of all the variants of all the Caminos, I unified them into a single database of GPS points using the open source, no-code tool for data science par excellence: KNIME Analytics Platform.

Here is the detail of the GPX parser workflow that I implemented with KNIME.

GPX Parser component to encapsulate all the logic.

First, I ingested all files using the List Files/Folders node. Then, I extracted the path information from all GPX files, read them as text-based files and converted them to XML. Next, I converted them to JSON (I am more comfortable processing JSON than XPaths) and prepared the strings to extract the information about the routes and stages with the RegEx Extractor, String Manipulation and String to Number nodes.
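For readers who want to see the same extraction expressed as code, here is a minimal Python sketch. It is not the KNIME workflow itself: it assumes the files follow the GPX 1.1 schema and store their positions as track points (`trkpt`), and the function name `parse_gpx_points` and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# GPX 1.1 namespace. Assumption: files written against GPX 1.0 would need
# a different namespace URI.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def parse_gpx_points(gpx_text):
    """Return one dict per track point with latitude, longitude and elevation."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        ele = trkpt.find("gpx:ele", GPX_NS)
        has_ele = ele is not None and ele.text and ele.text.strip()
        points.append({
            "latitude": float(trkpt.get("lat")),
            "longitude": float(trkpt.get("lon")),
            # Elevation is optional in GPX, so keep None when it is absent.
            "elevation": float(ele.text) if has_ele else None,
        })
    return points


# Made-up sample with one complete point and one point without elevation.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="42.7" lon="-0.52"><ele>1640.0</ele></trkpt>
    <trkpt lat="42.71" lon="-0.53"></trkpt>
  </trkseg></trk>
</gpx>"""

for point in parse_gpx_points(SAMPLE_GPX):
    print(point)
```

Keeping the missing elevation as `None` instead of defaulting it to zero makes those points easy to find later.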

I am sure that more advanced KNIME users will find a better way of doing this. Leave a comment and give me the opportunity to learn :-).

During this ETL process, I noticed that some files were missing, and that some of the files I did have were missing data. This is when a tool like KNIME helps you tremendously to blend data from different sources. In any case, I already had 398,000 GPS points to begin with.

Missing elevation data

During the data blending process, I realized that there were stages and Caminos that did not have elevation information. To be honest, extracting just the GPS information and displaying it didn’t seem sufficient, but I also didn’t want to incur the expense of an API like Google’s to get the elevation.

After some research, I found an API service (open-elevation.com) that I could use to automate those queries and get the data. By being careful with the calls, I was able to download the information for over 105,000 points that previously had wrong or missing elevation information in less than 4 minutes. The steps were as follows:

Identify points that have no elevation data or have an error.
Filter this data to generate the JSON that I will use as a request body in a POST call to the open-elevation.com service.
Generate a loop to request a maximum of 1,000 points per call. Be kind, my friend.
Generate the JSON with the latitude and longitude data in the required format. The nodes in this case are Columns to JSON and JSON Row Combiner.
Now we have a JSON table with the arguments for the API POST, so all that is left is a POST Request node with a delay of 2,000 ms.
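The steps above can also be sketched in Python. This is an illustration of the batching idea, not the KNIME workflow: it assumes each point is a dict with `latitude`, `longitude` and `elevation` keys, that a missing elevation is stored as `None`, and that the service accepts a `locations` list on its public `/api/v1/lookup` endpoint. The helper names are hypothetical.

```python
import json
import time
import urllib.request

LOOKUP_URL = "https://api.open-elevation.com/api/v1/lookup"
BATCH_SIZE = 1000      # maximum points per call, as in the steps above
DELAY_SECONDS = 2.0    # pause between calls, matching the 2,000 ms delay


def needs_elevation(point):
    # Assumption: missing elevation is stored as None. How "wrong" values are
    # detected is not described in the article, so that part is left out.
    return point.get("elevation") is None


def chunked(points, size=BATCH_SIZE):
    """Split a list of points into consecutive chunks of at most `size` items."""
    return [points[i:i + size] for i in range(0, len(points), size)]


def request_body(chunk):
    """Build the JSON request body for one chunk of points."""
    locations = [{"latitude": p["latitude"], "longitude": p["longitude"]} for p in chunk]
    return json.dumps({"locations": locations})


def fetch_elevations(body):
    """POST one request body and return the elevations in response order."""
    request = urllib.request.Request(
        LOOKUP_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return [result["elevation"] for result in payload["results"]]


def fill_missing_elevations(points):
    """Update, in place, every point that has no elevation."""
    missing = [p for p in points if needs_elevation(p)]
    for index, chunk in enumerate(chunked(missing)):
        if index > 0:
            time.sleep(DELAY_SECONDS)  # be kind to the free service
        for point, elevation in zip(chunk, fetch_elevations(request_body(chunk))):
            point["elevation"] = elevation
```

Nothing is sent until `fill_missing_elevations(points)` is called, so `chunked` and `request_body` can be checked offline first.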

*Workflow for GPS position elevation requests.*

Displaying elevation information visually

Before moving on to the next challenge, i.e. integrating mapbox.com service, let’s check that the data obtained is, at least, consistent.

To do this, first I unified the 398,000 points and used KNIME’s quick display capabilities. I tested both the Wikimedia Map and Bing Map (OSM Map View node).

*Elevation data visualized on Bing Map using the OSM Map View node.*

If we look at areas such as Cruz de Ferro (León) or the areas of Picos de Europa, we can clearly see how the color changes, indicating greater terrain elevation.

Checked… so far so good.
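Besides the visual check, a quick numeric check can help. The following Python sketch is only an illustration: the `camino_id` key and the plausibility bounds are assumptions, not values taken from the workflow.

```python
from collections import defaultdict

# Loose plausibility bounds in metres for the Iberian Peninsula (assumption).
MIN_ELEVATION_M = -10.0
MAX_ELEVATION_M = 3500.0


def elevation_summary(points):
    """Return {camino_id: (min, max, suspicious_count)} for points with elevation."""
    grouped = defaultdict(list)
    for p in points:
        if p["elevation"] is not None:
            grouped[p["camino_id"]].append(p["elevation"])
    summary = {}
    for camino_id, values in grouped.items():
        # Count values that fall outside the assumed plausible range.
        suspicious = sum(1 for v in values if not MIN_ELEVATION_M <= v <= MAX_ELEVATION_M)
        summary[camino_id] = (min(values), max(values), suspicious)
    return summary
```

A route whose suspicious count is above zero, or whose minimum and maximum are identical over many points, would be worth a second look on the map.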

Mapbox (3D map): connecting to the service

After processing and cleaning the data, I used Mapbox’s map service to display all the information with its 3D engine. AGILE mode: ON. Let’s start by importing all the points into the Mapbox editor (Mapbox Studio) and checking whether we can integrate the result in a web article.

Create a new style with the Terrain 3D display to show the information of the points that we processed.
Create a new tileset by importing the generated CSV file with the following information (Table Manipulator node): camino ID, stage ID, description of the Camino, description of the stage, latitude, longitude and elevation.
We associate the tileset with the style and then adjust the display according to the height. Here is a screenshot.
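For reference, the CSV described above could be produced in Python roughly as follows. The column names are hypothetical labels for the seven fields listed in the article; the workflow generates the file with KNIME nodes, not with this code.

```python
import csv
import io

# Hypothetical column names for the seven fields listed above.
COLUMNS = ["camino_id", "stage_id", "camino", "stage", "latitude", "longitude", "elevation"]


def points_to_csv(points):
    """Serialize the points to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        extrasaction="ignore",   # silently drop keys that are not in COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(points)
    return buffer.getvalue()
```

Writing the returned text to a `.csv` file gives something that can be uploaded as a tileset source.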

*Adjusting the display of data in Mapbox Studio.*

Now we just have to publish it and embed it in our website. After some searching, it turned out to be simpler than it seemed. Here is the code to integrate it in your WordPress page or post.

CSS, Mapbox integration and zoom/rotation controls.

This is the only point where I added some code to improve the user interface: it shows a pop-up with the information of a point when you click on it. All you have to do is insert your own [accessToken], [user] and [style].

Yes, you are right, I had to include some JavaScript code in a no-code project. My fault. It won’t happen again.
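The embed code itself is not reproduced in this text, so the following is only a sketch of what such a snippet could look like, not the code used on the site. It is written as a small Python helper that fills in the [accessToken], [user] and [style] placeholders of an HTML template based on Mapbox GL JS. The library version, layer id, property names (`camino`, `stage`, `elevation`) and camera settings are all hypothetical and would have to match your own style and tileset.

```python
# Placeholders follow the article: [accessToken], [user] and [style].
# [version] and [layer] are extra, hypothetical placeholders; the camera
# position and property names are assumptions as well.
EMBED_TEMPLATE = """\
<link href="https://api.mapbox.com/mapbox-gl-js/[version]/mapbox-gl.css" rel="stylesheet">
<script src="https://api.mapbox.com/mapbox-gl-js/[version]/mapbox-gl.js"></script>
<div id="map" style="width: 100%; height: 500px;"></div>
<script>
  mapboxgl.accessToken = '[accessToken]';
  const map = new mapboxgl.Map({
    container: 'map',
    style: 'mapbox://styles/[user]/[style]',
    center: [-0.55, 42.75],
    zoom: 11,
    pitch: 60
  });
  // Zoom and rotation controls.
  map.addControl(new mapboxgl.NavigationControl());
  // Pop-up with the properties of the clicked point.
  map.on('click', '[layer]', (e) => {
    const props = e.features[0].properties;
    new mapboxgl.Popup()
      .setLngLat(e.lngLat)
      .setText(props.camino + ' / ' + props.stage + ': ' + props.elevation + ' m')
      .addTo(map);
  });
</script>
"""


def render_embed(access_token, user, style, layer, version):
    """Return the HTML snippet with every placeholder replaced."""
    values = {
        "[accessToken]": access_token,
        "[user]": user,
        "[style]": style,
        "[layer]": layer,
        "[version]": version,
    }
    html = EMBED_TEMPLATE
    for placeholder, value in values.items():
        html = html.replace(placeholder, value)
    return html
```

The returned string can be pasted into a custom HTML block in WordPress. Using `setText` instead of `setHTML` keeps the pop-up from interpreting the point data as markup.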

Result

You can see the result (and play with it) on my Camino de Santiago website: All the Caminos in 3D.

Just an example of how you can explore the data interactively.

Here is also the complete workflow so that you can replicate it. I uploaded it to the KNIME Hub, where you can download it for free.

KNIME workflow for processing the GPX files of all the Pilgrim Roads of Santiago.

Summary

Database of GPX files with the stages, from the National Center of Geographic Information (CNIG). We can never be grateful enough to the Spanish Federation of Associations of Friends of the Camino de Santiago (FEAACS) for the work of dissemination and compilation that they do. It was absolutely fundamental to building All the Caminos in 3D.
KNIME Analytics Platform. As you can see, I have used it mainly for ingesting, transforming and updating data before uploading it to data visualization services. On this occasion, the focus was on updating the elevation data to generate the profiles of the selected stages, variants and Caminos.
Open Elevation API. As they describe it themselves: “Open-Elevation is a free, open source alternative to the Google Elevation API and other similar offerings.” Same #nocode philosophy: I accessed their API service with the POST Request node and used the JSON Row Combiner node to generate the request for those track points that had no associated elevation data. Really easy.
Mapbox. Possibly one of the best map services; even its free version offers a lot of functionality. Their Mapbox Studio suite is a small wonder for generating maps. It also enables 3D visualization with an impressive degree of detail. To load the data I used a simple CSV file with the information of the points.

When new GPX files become available, I already have a workflow to process them automatically and extend the map coverage.

Future Improvements

Here are some improvements that I would like to work on with a little more time:

Add modules that load and update the data automatically using the API provided by Mapbox.
Automatically connect to the download area of the Spanish Federation of Associations of Friends of the Camino de Santiago (FEAACS) to increase the number of supported routes with the European ones.

Leave me a comment if you would like me to include information that may be of interest to you, or to explain a point that may not be clear.