The urban morphology on our planet — Global perspectives from space
Dr. Tobias Beuchert is Science Manager in the Department for EO Data Science at the Remote Sensing Technology Institute (IMF), in the Earth Observation Center of the German Aerospace Center (DLR)
Urbanization can be seen as the second-largest global megatrend, right after climate change. According to the UN, approximately three-fourths of the world's population will live in urban areas by 2050, and one-third will already live in large cities of more than 500,000 inhabitants by 2030. This is no surprise: urban space provides jobs, education, dense social networks, diversity, and numerous cultural offerings. What sound like advantages of the city often correspond to the disadvantages that citizens tend to face in rural areas nowadays. The extreme urbanization rates, however, create challenges for the societies growing in cities. Meeting the needs of an increasingly diverse population is quickly complicated by closely interconnected problems in the socio-economic domain or in the Earth system itself. For example, our work shows that about 60% of the global urban population lives in settlements of compact form, including lightweight and large low-rise structures, a subset of which can be informal settlements. These settlements make up only about 40% of the entire mapped urban space. In contrast, only around 10% of the urban population lives in sparser settlements, which make up approximately 30% of the global urban space. Furthermore, the densest settlements are prone to turning into heat islands as global warming continues unabated. Are urban spaces indeed spaces of increased wellbeing when considering these insights?
In order to tackle these social inequalities in urban spaces and to engage in the UN call for “Sustainable Cities and Communities”, researchers, urban planners, and policymakers depend on a solid data basis. In fact, there is still a lack of sufficient and reliable geoinformation on the distribution, pattern, evolution, and dynamics of the built environment covering the entire planet. While global databases do exist, they are still too heterogeneous, do not resolve the intra-urban morphology and its evolution globally, and often lack the quality needed for research. Tackling these shortcomings sounds straightforward at first. Why not exploit the extensive and free data archive of ESA’s Copernicus fleet of Sentinel satellites to regularly monitor the entire planet’s surface and identify different types of urban settlements? Here, the true challenge is the huge data volume of tens of petabytes and the complexity of the models needed to interpret and classify these data. On top of that, the computational time needed to classify the small-scale built environment at a global scale is immense. And ultimately, if we are seeking a general description of urban spaces across the globe, the model needs to be independent of cultural differences.
Our research group, which forms Europe’s largest competence center for Artificial Intelligence (AI) in Earth Observation (EO), is able to tackle these challenges. With AI, we can model and classify these complex global data in a reasonable amount of time and even on an intra-urban basis (see Fig. 1). AI further allows us to exploit and fuse the complementary information contained in the different data sources, for example the fleet of Sentinel satellites. One and the same area is thus observed at different wavelengths, i.e., in visible light (Sentinel-2) or using radio waves (Sentinel-1), or at different times. On top of that, a single image taken in visible light can be spectrally decomposed into numerous individual “sub-images” that show the surface at more than ten different wavelengths from the visible to the infrared. The surface can appear very different at each of these wavelengths, so the additional information per single observation can be huge. In short: without AI, one would have a hard time handling the diversity of these data and exploiting their synergies at a global scale.
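To make the idea of fusing complementary observations of the same area more concrete, here is a minimal sketch of the simplest possible form of data fusion: stacking a radar patch and a multispectral patch along the channel axis after normalizing each channel. This is an illustration only, not the architecture used in our work. The patch size (32 x 32 pixels), the channel counts (2 radar channels, 10 spectral bands), and the function names `standardize` and `fuse_patches` are assumptions chosen for the example.

```python
import numpy as np

def standardize(patch):
    """Scale each channel of an (H, W, C) patch to zero mean and unit variance."""
    mean = patch.mean(axis=(0, 1), keepdims=True)
    std = patch.std(axis=(0, 1), keepdims=True)
    return (patch - mean) / (std + 1e-8)  # small constant avoids division by zero

def fuse_patches(s1_patch, s2_patch):
    """Stack co-registered radar and optical patches along the channel axis."""
    if s1_patch.shape[:2] != s2_patch.shape[:2]:
        raise ValueError("patches must cover the same pixel grid")
    return np.concatenate([standardize(s1_patch), standardize(s2_patch)], axis=-1)

# Synthetic stand-ins for real imagery (all shapes are hypothetical).
rng = np.random.default_rng(0)
s1 = rng.normal(size=(32, 32, 2))     # assumed: 2 radar channels
s2 = rng.uniform(size=(32, 32, 10))   # assumed: 10 spectral bands
fused = fuse_patches(s1, s2)
print(fused.shape)  # (32, 32, 12)
```

Normalizing per channel matters because radar backscatter and optical reflectance live on very different numerical scales; without it, one sensor would dominate the other in any model that consumes the stacked patch.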
But how do we classify EO data using AI in practice? And how does AI work in the first place? An AI algorithm is fed with input data, in our case a huge number of images, and returns a result, in our case patterns or structures within these images. Of course, the algorithm needs to know what to look for. For that reason, human researchers provide so-called labeled training data. In our case, these are images of urban spaces with annotated distinct regions such as compact high-rise buildings or vegetated areas. The more detailed and numerous such pre-labeled and segmented satellite images are, the better the AI performs the same labeling and segmentation task for all large cities worldwide. It is therefore a key and truly time-consuming task for researchers to provide the algorithm with the best possible selection of data and information to learn from.
Our team has invested a large effort to do so. Fifteen domain experts manually assigned labels to more than 400,000 Sentinel-1 and Sentinel-2 image patches that are distributed over 42 different urban areas in many different cultural zones and geographic regions of the globe. We conducted a rigorous quantitative evaluation of 10 cities in this dataset by having a group of remote sensing experts cast 10 independent votes on each labeled polygon, in order to identify possible errors and assess the human labeling accuracy. Statistics show that our human labels achieve 85% confidence. This number can serve as a reference accuracy for the machine learning models trained on this dataset. These geographically and culturally well-balanced global reference data allow us to design a sophisticated deep learning algorithm and apply rigorous evaluation procedures. We developed a ResNet-based multi-model fusion architecture that takes one Sentinel-1 image and four seasonal Sentinel-2 images of each city as inputs to predict local climate zone (LCZ) classes.
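The vote-based check of the human labels can be illustrated with a small sketch. The article does not spell out how the 85% figure is computed, so the definition used here, the fraction of polygons whose original label matches the majority of the independent votes, is an assumption, as are the function names and the toy data.

```python
import numpy as np

def majority_vote(votes, n_classes):
    """Most frequent class per polygon for a (n_polygons, n_votes) integer array.

    Ties are resolved in favor of the lowest class index.
    """
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

def label_agreement(labels, votes, n_classes):
    """Fraction of polygons whose original label equals the majority vote."""
    return float((majority_vote(votes, n_classes) == labels).mean())

# Toy example: 4 polygons, 10 votes each, 3 hypothetical classes.
labels = np.array([0, 1, 2, 1])
votes = np.array([
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 2],  # majority 0, agrees with label 0
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # majority 1, agrees with label 1
    [2, 2, 2, 2, 2, 2, 0, 0, 0, 0],  # majority 2, agrees with label 2
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],  # majority 0, disagrees with label 1
])
print(label_agreement(labels, votes, n_classes=3))  # 0.75
```

A number of this kind is useful as a yardstick: a model trained on the labels cannot be expected to agree with them much more often than independent human experts do.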
Our algorithms were able to resolve and classify the intra-urban substructures of all cities worldwide that, according to the UN, have more than 300,000 inhabitants, covering a total of 1.65 million km². To validate the model performance, we consider three training-testing splits: “random-split”, “block-split”, and “cultural-10”. Random-split is a spatially random sampling of evaluation points, which is commonly used in remote sensing. Since the distributions of training and test data are similar, random-split defines the upper bound of the achievable classification accuracy. In contrast, block-split is a deterministic data split: the data from each city are separated into non-overlapping east-west blocks, and the accuracy is evaluated on unseen blocks. Block-split gives a representative measure of accuracy for unseen cities whose data distribution is similar to that of the training cities. Last but not least, cultural-10 defines the lower bound of the achievable accuracy by evaluating on completely held-out data in a cross-validation scheme. The evaluation shows that our classification accuracy ranges from 51% to 83%.
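The three evaluation strategies can be sketched on synthetic data as follows. This is a simplified illustration under stated assumptions, not our actual evaluation code: block-split is reduced to two blocks per city (west for training, east for testing), the held-out scheme is written as a leave-one-zone-out loop over a hypothetical `zone` label per sample, and all names, sample counts, and the 20% test fraction are made up for the example.

```python
import numpy as np

def random_split(n, test_frac, rng):
    """Boolean test mask from spatially random sampling over all samples."""
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=int(round(test_frac * n)), replace=False)] = True
    return mask

def block_split(city, lon, test_frac):
    """Boolean test mask: in each city, the easternmost share is the unseen block."""
    mask = np.zeros(len(city), dtype=bool)
    for c in np.unique(city):
        idx = np.flatnonzero(city == c)
        cut = np.quantile(lon[idx], 1.0 - test_frac)  # longitude of the block boundary
        mask[idx] = lon[idx] > cut
    return mask

def zone_folds(zone):
    """Yield (held_out_zone, test_mask), holding out one whole zone per fold."""
    for z in np.unique(zone):
        yield z, zone == z

# Synthetic samples: 3 hypothetical cities in 2 hypothetical cultural zones.
rng = np.random.default_rng(0)
n = 600
city = np.repeat(np.array(["A", "B", "C"]), 200)
zone = np.repeat(np.array(["z1", "z1", "z2"]), 200)
lon = rng.uniform(0.0, 1.0, size=n)

rand_test = random_split(n, 0.2, rng)
block_test = block_split(city, lon, 0.2)
folds = list(zone_folds(zone))
print(rand_test.sum(), block_test.sum(), [int(m.sum()) for _, m in folds])
```

The difference between the strategies lies in how far the test samples are from the training samples: random test points sit right next to training points, block test points come from a different part of the same city, and held-out zones contain cities the model has never seen.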
The result is the first uniform and quality-controlled global local climate zone (LCZ) classification with 17 different classes, as shown in Fig. 2. This unique dataset, named So2Sat GUL (Global Urban LCZs), is openly accessible.
And this is not the final word. After training, our AI model was also able to reproduce typical geographic-cultural differences between city morphologies. We can show that, on average and to first order, the intra-urban morphology of Central European cities, African cities, or cities of the Islamic world appears self-similar in the model output. Yet the situation is far more complex. At second order, cities worldwide are quite diverse. Models that depend too strictly on distinct geographic-cultural differences would not capture the complexity of urban spaces and their global interconnections.
Due to the open access policy of our So2Sat GUL (Global Urban LCZs) data product, we anticipate that a multidisciplinary community, ranging from fundamental researchers to urban planners, policymakers, and institutions like the UN, will make use of these data. These stakeholders can clearly benefit from the first-ever information on the intra-urban morphologies of the majority of cities across the globe. Global urban databases like the one presented here are indispensable for paving the way towards more resilient and sustainable urban spaces with increased wellbeing among their citizens. Informal settlements, for example, can be quickly geolocalized. The structure and distribution of buildings (low-rise and sparsely distributed or high-rise and closely packed) and the associated population densities represent critical information. Especially during natural catastrophes like floods, sufficient geoinformation is needed in order to protect the most disadvantaged urban spaces. The database even allows one to localize potential heat islands and assess parameters like habitability. Given that the upward trend of global warming and its secondary effects is still unbroken, reliable information on the status quo of the built environment is highly valuable for urban planners and policymakers today in order to mitigate and counteract mass migration in the (near) future.
By Dr. Tobias Beuchert
X. X. Zhu et al., “The urban morphology on our planet–Global perspectives from space,” Remote Sens. Environ., vol. 269, p. 112794, 2022.
S. Chapman, J. E. Watson, A. Salazar, M. Thatcher, and C. A. McAlpine, “The impact of urbanization and climate change on urban temperatures: a systematic review,” Landsc. Ecol., vol. 32, no. 10, pp. 1921–1935, 2017.