Draw a map of the districts of Budapest using the Overpass API of OpenStreetMap and Python

Mor Kapronczay
HCLTech-Starschema Blog
6 min readMay 28, 2019

--

A map of Budapest. Source: https://hebstreits.com/product/budapest-hungary-downtown-vector-map/

Ever wondered how to draw a map of less common geographical areas? Perhaps even colour them based on some data? This is the first in a series of two tutorials that show you how to build this from scratch! First, you need to construct the border of your polygons — Part 1 is about this task. After that you need to create a map, and color those polygons according to some value of your interest. That will be shown in Part 2.

There are many tutorials on the internet for drawing maps in Python, even more sophisticated maps like heatmaps (where heat is basically the density of points in an area) or choropleth maps (where polygons on the map are colored according to some arbitrary value). It is important to note that while heatmaps are continous, choropleths assign a single value to a closed polygon unit.

However, these tutorials are mainly done on states of the United States (see some great ones with plot.ly or an also otherwise superb one with folium). For the US, these packages have some convenience methods, but they obviously do not for any area. Alternatively, they are based on some accidentally available JSON file like this one with Altair. Obviously, these are not general solutions to the problem of creating a map of some areas of your choice. I aim to give a general solution in these articles.

In this first part, you will

  1. get to know the basic structure of administrative areas in OpenStreetMap (OSM),
  2. how to use the Overpass API to automatically access OSM objects, and
  3. how to create a polygon representation of the coordinates of districts of Budapest.

The code is also available on GitHub:

The relation of the 1st district of Budapest. https://www.openstreetmap.org/relation/221984#map=15/47.4969/19.0375

OSM structures

In OpenStreetMap these administrative areas (in our case the districts of Budapest) are called relations. A relation is an object that consists of (outer) ways, which constitute the borders of the area of interest. These ways are constructed by nodes. These nodes are basically points on the map having a latitude and longitude coordinate, and potentially some other tags. For drawing the map, we need the coordinates of those nodes.

Access OSM objects automatically

The Overpass API is a REST API designed to use OSM directly, interacting with it using the HTTP protocol. The API has a specific query language and can be drived by http requests sent using the programming language or way of your choice.

The query below looks for a Hungarian relation, with the name “I. kerület” (1st district in English), and returns a JSON document that incorporates all the ways bordering our relation of interest.

Overpass query for I. kerület in Budapest.

You can search on OpenStreetMap to find the best way to locate your relations of interest. Taginfo can help you find the specific tags to achieve your results. Also, look at the documentation for more on Overpass queries.

Now that we have all the ways, we are able to access all the nodes constructing those ways, hence our borders for the districts of Budapest.

Create a polygon of nodes

We aim to have something like this at the end. Source: https://100ujgyulekezet.blog.hu/2013/05/29/budapest_xxiv_20_01_resz_a_xxiv_kerulet_helye

Creating a polygon basically means ordering your points in the correct order. You will download ways and nodes in a random order utilizing the API. If you ask any map service to draw a polygon of your nodes, it will be a mess. Drawing a geopolygon needs an ordered list of what constitutes it’s borders.

Finding the correct order is not a trivial task, unfortunately. If the polygon were convex the task would be much easier. In this case, one would simply be able to fit a convex hull on the points. You can think of that as the shape of a rubberband stretched around the points. Scipy’s ConvexHull does exactly that, and returns (some) points in the correct order. To show that this does not work in our case look at the picture below.

Convex hulls are not fit for the task…

A born-and-raised Budapest native would not be fooled and recognize the picture, but this definitely does not fit our needs. The failure of the convex hull approach is because shapes of districts are not necessarily convex. Therefore, another approach would be simply starting at some point and continue with the closest point according to the Euclidean distance metric. However, this will almost certainly lead to the omission of some nodes when they should be the next in line, as a consequence of concavity, thereby deforming the polygon.

How concavity messes with the “always add the closest” approach

This is shown in the figure next to this paragraph. The algorithm goes on with point C after point A, as it is closer to A, while it should go on with B obviously (for us). As a consequence, not only the area is a little deformed, but point B is added at the end to our polygon borders.

OK, no problem, let’s try out what is outlined here. The method aims to connect the dots without crossing. It succeedes with that, but does not necesserily find the exact shape of the districts. Actually it occasionally does.

But we do not give up, do we?

My solution

We have more or less linear ways, and we can make use of that — which we never did before! Concerning ways, there is virtually no chance of omitting nodes (because of their linearity), and the endpoint of ways should be part of precisely one other way as well. Let’s use that! The algorithm will be as follows:

  1. Select an arbitrary way.
  2. Find the first node by ordering coordinates based on primarily lat, and subordinately lon coordinates.
  3. Successively add the closest node on the way until there is no node left in the way.
  4. a) The last point should be part of precisely 1 other way. Continue with this way.
    b) This may not always be the case. Then find the closest node and continue with the way this particular closest node is a part of.
  5. Continue until there are no ways left.

The plot below shows what we ended up with. It is not perfect, nonetheless we created all this from scratch. In Part 2, we will explore how to create a choropleth map from our neatly ordered points using folium!

The resulting borders of the districts of Budapest. https://github.com/morkapronczay/osm_bp_districts/blob/master/base_layer_bp.png

Takeaways

The blogpost covered the basic structure of administrative areas on OpenStreetMap, automated access of them through the Overpass API, and most importantly, how to create a polygon from a bunch of points to draw it on a map. The most important takeaway here is that if there is a will, there is a way. Of course one can download shapefiles (if he’s able to find), or use this great functionality to create a map of an area of choice. But IMHO automated access always beats one where you have to click with your mouse even once. In Part 2 a beautiful and functional choropleth map is made using folium.

Part 2 is available here!

--

--

Mor Kapronczay
HCLTech-Starschema Blog

Machine Learning Team Lead, Chatbot Developer@K&H (KBC Bank Hungary)