# Building height estimation

Let us take you on a tour of our long journey through urban monitoring…

- What are we trying to do? Imagine you want to estimate the population of an unknown area…

If we look at a map, we can easily calculate the area of the buildings, but cartographers have to spend a lot of time measuring building heights from shadows and “street view” imagery. And if we want to know how many people live in the area, we need to know everything: the buildings' area, purpose and number of storeys.

What we propose is an automated, **neural-network-powered tool to estimate all of these parameters**. Building segmentation and classification are a matter for a separate talk, so here we will try to explain how we measure the height.

There are plenty of options that have long been available for satellite data: radar images can directly show object height, and stereo pairs (pairs of images taken from different angles under known conditions) show enough to reconstruct the building as a 3D model, just like in a 3D movie.

However, we always challenge ourselves to do the same task with less (or cheaper) data. So the question is: can we find the height of a building from a single satellite image?

Of course we can, as mentioned above regarding the cartographers' work, and there are a number of papers describing such methods. The general algorithm is simple: given the location of the building footprint, we outline the shadow that the building casts on the ground and do simple math:

`H = S * tan(h)`

where **S** is the shadow length and **h** is the elevation angle of the sun above the horizon, which is known from the time and location of the image capture.
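As a quick sanity check of the formula, here is a minimal Python sketch. The function name `height_from_shadow` and the sample numbers are illustrative assumptions, not values from our dataset, and the formula only holds for an image taken from directly above over flat ground.

```python
import math

def height_from_shadow(shadow_length_m: float, sun_elevation_deg: float) -> float:
    """Building height H = S * tan(h), assuming flat ground and a nadir image."""
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))

# Hypothetical example: a 20 m shadow with the sun 45 degrees above the horizon.
print(round(height_from_shadow(20.0, 45.0), 2))  # 20.0
```

Note how sensitive the result is at low sun: the same 20 m shadow at 30 degrees of elevation corresponds to a building of only about 11.5 m.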

Is the problem solved then? Not yet. Apart from some areas not being mapped at all, it turns out that not all satellite images are taken from directly above (at nadir); many are taken at an angle (off-nadir). This is often necessary in order to capture the object of interest, but it makes it harder to detect objects and determine their positions. In round 4 of the SpaceNet competition, participants faced exactly this problem and struggled to achieve high accuracy in footprint segmentation [3].

- In this case, the geometry of the scene can be updated:

Luckily, in most cases the angles (sun elevation and azimuth, and also satellite elevation and azimuth) can be obtained from the imagery vendors. And the best part is that if we obtain the building roof, wall and shadow, we can reconstruct all the 3D parameters, including the height and the footprint.

That is important, because the footprint is not visible in the image, and its direct segmentation is a difficult task. We have formulated the problem so that our neural networks for segmentation do not need to guess the invisible parts, but only to outline the visible rooftop, shadow and wall.
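To make the off-nadir geometry concrete, here is a minimal sketch of how the four angles turn a candidate height into offsets on the ground. It is not our production code: the function names are hypothetical, and it assumes flat terrain, a north-up image, azimuths measured clockwise from north, and a satellite azimuth that points from the building towards the satellite. Vendor metadata conventions differ, so check them before reusing this.

```python
import math

def ground_offset(height_m, azimuth_deg, elevation_deg):
    """(east, north) offset in metres at which a ray through a point at
    `height_m` reaches the ground, travelling away from a source seen at
    the given azimuth and elevation (the sun or the satellite)."""
    length = height_m / math.tan(math.radians(elevation_deg))
    az = math.radians(azimuth_deg)
    return (-length * math.sin(az), -length * math.cos(az))

def scene_offsets(height_m, sun_az, sun_el, sat_az, sat_el):
    """Offsets measured from the roof as it appears in the image."""
    shadow = ground_offset(height_m, sun_az, sun_el)  # footprint -> roof shadow
    roof = ground_offset(height_m, sat_az, sat_el)    # footprint -> apparent roof
    footprint_from_roof = (-roof[0], -roof[1])
    shadow_from_roof = (shadow[0] - roof[0], shadow[1] - roof[1])
    return footprint_from_roof, shadow_from_roof

# Hypothetical scene: 30 m building, sun in the south at 45 degrees,
# satellite in the east at 60 degrees of elevation.
footprint_vec, shadow_vec = scene_offsets(30.0, 180.0, 45.0, 90.0, 60.0)
print(footprint_vec, shadow_vec)
```

In this example the roof appears shifted about 17.3 m to the west of the footprint, so the footprint is recovered by moving the roof outline the same distance back to the east, and the shadow of the roof lands 30 m north of the footprint.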

Image segmentation for roof, shadow and footprint, as well as the segmentation of buildings, is a subject for another article. In fact, the general approach is widely described in both science and media (see [1, 2]), and we will also share our experience.

OK, now we have to measure the shadow and wall dimensions. But how? A direct measurement will give many outliers, because some shadows merge with each other and the wall is often partially obscured. Let’s simulate the building in 3D! That also naturally solves all the geometry problems. The modelling is not our invention either, but we have added the off-nadir part to the setup.

We start from the position of the roof and then try to guess: if the building had a given height, how well would the shadow and the wall fit it? By “fit” we mean that we calculate the IoU (intersection over union) between the segmentation results and the modelling results, and take the height that gives the best score (averaged over the wall and the shadow).
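The search described above can be sketched in a few lines of plain Python. This is a simplified illustration rather than our actual implementation: masks are sets of `(row, col)` pixels, the building is treated as a prism, and `wall_step` and `shadow_step` are hypothetical inputs giving the pixel shift per metre of height (in practice they would follow from the sun and satellite angles and the pixel size).

```python
import math

def shift(mask, d_row, d_col):
    """Translate a set of (row, col) pixels by a whole-pixel offset."""
    return {(r + d_row, c + d_col) for r, c in mask}

def sweep(mask, step_per_m, height_m):
    """Union of `mask` translated along `step_per_m` for heights in [0, height_m]."""
    n = max(1, math.ceil(height_m * max(abs(step_per_m[0]), abs(step_per_m[1]))))
    swept = set()
    for i in range(n + 1):
        t = height_m * i / n
        swept |= shift(mask, round(step_per_m[0] * t), round(step_per_m[1] * t))
    return swept

def simulate(roof, height_m, wall_step, shadow_step):
    """Model the visible wall and shadow masks for a candidate height."""
    body = sweep(roof, wall_step, height_m)  # roof plus visible wall
    footprint = shift(roof, round(wall_step[0] * height_m),
                      round(wall_step[1] * height_m))
    wall = body - roof
    shadow = sweep(footprint, shadow_step, height_m) - body  # hidden part removed
    return wall, shadow

def iou(a, b):
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def estimate_height(roof, wall_seg, shadow_seg, wall_step, shadow_step, candidates):
    """Return the candidate height with the best mean wall/shadow IoU."""
    def score(h):
        wall, shadow = simulate(roof, h, wall_step, shadow_step)
        return (iou(wall, wall_seg) + iou(shadow, shadow_seg)) / 2
    best = max(candidates, key=score)
    return best, score(best)

# Synthetic check: generate "segmentation" masks for a 12 m building, then recover it.
roof = {(r, c) for r in range(10, 20) for c in range(10, 30)}
wall_step, shadow_step = (1.0, 0.0), (0.0, 1.0)
wall_seg, shadow_seg = simulate(roof, 12, wall_step, shadow_step)
print(estimate_height(roof, wall_seg, shadow_seg, wall_step, shadow_step, range(0, 31)))
# (12, 1.0)
```

Because the score compares whole masks instead of a single measured length, a shadow that is partly merged with a neighbour's lowers the IoU a little but does not produce a wildly wrong height.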

Let’s try our method. We obtained a test dataset with the aid of human cartographers, who mapped buildings over the given imagery (WorldView-2) and checked the building heights using street view.

In the end, the mean absolute error of our method is a bit greater than 3 meters, which means an error of about ±1 floor in typical apartment buildings. The results seem fair, but it is very hard to compare them to others, as the datasets are different. That’s why we will share ours as soon as the scientific paper is published! So use it, build on it, and compare the results. The more we know, the easier life should be.
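For anyone comparing their own results, the metric is easy to reproduce. The sketch below uses made-up heights (not values from our dataset) and assumes a storey height of about 3 m, which is only a rough rule of thumb for apartment buildings.

```python
def mean_absolute_error(predicted_m, reference_m):
    """Mean absolute difference between predicted and reference heights, in metres."""
    if len(predicted_m) != len(reference_m) or not predicted_m:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - r) for p, r in zip(predicted_m, reference_m)) / len(predicted_m)

FLOOR_HEIGHT_M = 3.0  # assumed typical storey height, not a measured value

# Hypothetical predictions and cartographer-checked heights for four buildings.
predicted = [12.0, 27.5, 9.0, 45.0]
reference = [15.0, 25.0, 9.0, 51.0]

mae = mean_absolute_error(predicted, reference)
print(f"MAE: {mae:.3f} m, roughly {mae / FLOOR_HEIGHT_M:.1f} floors")
```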

*The dataset for validation of building heights can be found among our open datasets: **https://github.com/aeronetlab/open-datasets**. Follow us on GitHub.*

**References**