Integrating Satellite Image Analysis into Urban Planning GIS

Nov 6, 2021

Overview

Building cities that ‘work’ — inclusive, healthy, resilient, and sustainable — demands evidence-informed decision-making where government policy is formulated and evaluated against ever-changing needs and requirements of its citizens. Local authorities are obliged by UK central government to provide accurate annual statistics related to availability of housing including lettings, waiting lists, vacant properties, condition, and new supply. Combined with commercial development pipeline, this information reveals current and projected demand for essential services such as education, healthcare, public transport, greenspaces across different regions — as well as key economic indicators such as local job availability and social mobility.

Due to logistical complexity of managing and collating information across multiple departments and offices across large cities such as London, a disconnect often exists between information held on planning and development systems and the actual real-time status of construction projects. Local Planning Authorities often experience delays in updating their backend systems due to workload and limited labour resource. Larger-scale projects are often covered by multiple planning orders — it is therefore often difficult to track whether approved permissions for a specific scheme have lapsed or have been superseded by new applications.

Satellite remote sensing offers a unique source of situational awareness to assist sustainable urban development: from high altitude, it is possible to identify and monitor change synoptically across entire cityscapes at regular intervals. The current generation of Earth Observation satellites offers unprecedented insights into urban fabric, including structural health monitoring of key infrastructure (bridges, roads, railways) and quantitative assessment of environmental health risks such as air pollution and heat island effects.

This article evaluates a prospective solution for seamlessly integrating satellite image analysis into existing open-GIS desktop environments. A PostgreSQL dump of London Development Database (LDD) was selected as a representative test environment.

Screenshot of London Development Database Web Portal

Time series of high resolution optical satellite images (~10TB) acquired over London area was subsequently registered as tiled, out-of-database PostGIS-Raster objects and partitioned into TimescaleDB hypertables. Testing focussed on executing SQL queries to extract and visualise collections of time-stamped PostGIS-Raster objects collocated with planning permission locations within QGIS 3.14 desktop environment. Use case analysis demonstrated feasibility of leveraging satellite image analysis to corroborate status of individual planning permission records against two key milestones: commencement of works and the completion of works.

Commencement of Works

Commencement of works is a key milestone in the construction project life cycle. It triggers liability for Community Infrastructure Levy (CIL) payments, ‘crystallises’ the permission, and informs the decision-making and planning of key stakeholders: local government, infrastructure, utilities, planning offices, etc. By law, any approved planning permission lapses after a certain period, generally three years from the date of approval.

With reference to Town and Country Planning Act 1990, Section 56[1], development is considered to have commenced ‘…on the earliest date that a material operation in connection with the development is started’, that is:

Any work of construction involving the erection of a building.
Any work of demolition of a building.
Digging of a trench to contain foundations, or part of the foundations, of a building.
Laying of any underground main or pipe to the foundations, or part of the foundations, of a building or to any such trench as described above.
Any operation involving laying out or constructing a road or part of a road.
Any change in the use of any land which constitutes material development.

Due to irregularity of site visits by planning officers to clarify scheme status, permission commencement date recorded on LPA backend systems may not accurately reflect physical conditions on-site. It is proposed that visual evidence from satellite imagery — land clearance, laying of foundations, arrival of heavy machinery, etc. — may provide reliable insight into scheme commencement date for legal and planning purposes.

Completion of Works

Completion of works is a crucial milestone in the construction project life cycle. Building contracts typically stipulate a completion date when the client takes possession of the site, and developers are liable for damages if deadlines are exceeded. The completion date signals the release and forthcoming occupancy of new residential units, and informs planning and policy making across many government departments, including local taxation. Local planning offices have a statutory requirement to submit up-to-date and accurate housing returns to central government, informed by completion dates held on planning systems. It was envisaged that visual evidence extracted from high resolution satellite imagery (road markings, cars on driveways, removal of heavy machinery, landscaping) may be utilised to infer and / or confirm likely completion dates and levels of building occupancy.

Technical Requirements

A key objective of this study was the development of an open GIS framework to facilitate creation, analysis and visualisation of satellite image ‘case files’: temporal stacks of georeferenced and time-stamped satellite images logging visual evidence of construction activity at permission-approved sites across London. By interactively inspecting imagery on a frame-by-frame basis, planning officers may utilise case files to evaluate on-site construction activity on specific dates and query / confirm scheme status. To maximise its technical and commercial viability and prospective benefits, a prototype framework was designed in accordance with the following non-functional requirements:

Scalability: Geospatial data is the original ‘Big Data’. Ever-increasing quantities of near real-time satellite imagery are routinely acquired and stored in petabyte-scale data warehouses. Regulating the impact and sustainability of urban planning policy across Greater London is a highly complex and expansive issue. It is therefore essential that the technologies underpinning the prototype software framework have the capacity to scale, both in terms of data storage and workflow performance.

Extensibility: Workflow processes generally adapt and evolve as new technologies and software platforms are adopted in the workplace. Adopting a loose and decoupled approach to system design is of paramount importance to cater for requirements and needs of individual end users. It is therefore prudent to prototype system capabilities on top of tried and tested standalone functional components implementing generalised and well-established application programming interfaces (APIs).

Sustainability: In the UK — and across the world — public institutions are now actively encouraging adoption of open-source technologies and open data policies to minimise operational cost and prevent vendor lock-in. Open-source software projects provide high quality engineering baseline for feature customisation and system design — as well as facilitating a collaborative, community-based approach to problem solving and future development. Acquiring control of underlying source code provides full visibility of system functionality — allowing stakeholders to rapidly address limitations and plan future development.

Interoperability: Geospatial data is extremely diverse. For example, spatial features vary from simple points to complex three-dimensional polygons registered to a location based on name, geographic coordinates, administrative district, etc. Standardisation of content (data formats) and interfaces (APIs) is essential for intra-system integration and compatibility. Developing tools and information services founded on common standards reduces fragmentation and inefficiencies when sharing geospatial information between stakeholders. Organisations such as the Open Geospatial Consortium (OGC) and the Open Source Geospatial Foundation (OSGeo) promote standardisation of data formats and APIs within the open-source geospatial community.

Usability: To minimise operational costs of system deployment and upskill training, it is optimal — where possible — for new decision support services to seamlessly integrate with existing workflow systems and processes. It is therefore prudent to build new services around technologies and platforms already known and leveraged by stakeholders.

Security: All information held on existing planning systems is highly sensitive. It is essential that software services accessing planning records are reliable and highly resilient to cyber threats such as hacking and malware.

Technical Design

An end-to-end Desktop GIS-based support service, encompassing acquisition, management, analysis, and visualisation of satellite image time series data, was rapidly prototyped to demonstrate a prospective delivery mechanism for integrating satellite image analytics into existing urban planning operations. All software technologies underpinning the prototype GIS-based support service fall under the umbrella of the Open Source Geospatial Foundation (OSGeo), a non-profit organisation promoting open and collaborative working practices, encompassing interoperability / community standards through to networking opportunities and outreach initiatives.

Desktop GIS applications typically adopt a three-tier client-server architecture where Presentation, Application Logic and Data Management functions are implemented as standalone components. The client/server model is suitable in many-to-one scenarios, where the information and the services of interest are centralized and accessed through a single access point. In general, multiple clients are interested in such services and the server must be appropriately designed to efficiently serve requests coming from different clients.

High level Overview of Three-tier Client / Server Architecture for Desktop GIS Applications

The Presentation tier represents the front-end layer and comprises user interface — often graphical — to facilitate processing and analysis of application-specific datasets. The Data Management tier incorporates persistence mechanisms (database servers, file shares, cloud-storage buckets, etc) and API layer for managing access to stored information. Finally, the Application Logic — aka middleware — tier coordinates the core application and drives its capabilities — it also manages exchange of information between Presentation and Data Management tiers.

Data Management Tier

PostgreSQL is an open-source relational database management system (DBMS). It is often used as a free alternative to proprietary database products. Due to the cost and complexity of licenses to install and maintain commercial databases, many public agencies have decided to migrate operational services to PostgreSQL.

PostgreSQL is vertically and horizontally scalable: it supports parallelised query plans by default, and extensions such as CitusDB allow PostgreSQL to function as a clustered database with data tables and query execution distributed across multiple nodes. Due to its open-source nature, PostgreSQL is highly extensible, and it is possible to implement custom data types and functions server-side to improve query logic and support new capabilities.

By applying functionality of PostGIS, standard PostgreSQL databases may be adapted to realise the capabilities of a spatial database management system. The PostGIS extension adds spatial objects, indexing and a feature-rich suite of spatial processing and analytic operators to core PostgreSQL. To maximise its interoperability and functional firepower, PostGIS leverages specialised capabilities of several open-source geospatial libraries including GDAL, PROJ, GEOS.

By encoding geospatial datasets as database objects, PostGIS provides a storage and analytics environment to seamlessly operate on GIS vector and raster data types — for example, clipping imagery that intersects vector boundaries. For raster-centric use cases, PostGIS provides an extensive suite of management functions to ingest, create, modify, and export image datasets — leveraging core capabilities of the GDAL library. PostGIS also supports an extensive armoury of raster processing functions implementing spatial aggregation, statistical analysis, advanced map algebra — on top of rasterization and vectorisation capabilities.

PostGIS supports both in-database storage and out-of-database storage of raster datasets (see CrunchyData blog post for further details). Whilst internally stored raster objects offer some advantages in terms of querying performance, geospatial image datasets are typically many gigabytes in size. Loading large collections of imagery into PostgreSQL data tables involves significant overhead, as well as increasing storage costs through data duplication. Very large data tables also introduce additional maintenance issues that affect performance, for example the time taken to rebuild indexes and complete backups.

Out-of-database raster objects are encoded as data records comprising only basic metadata (geographic coverage, data type, number of bands) and path location to the original dataset. PostGIS utilises this information — combined with functionality of GDAL — to seamlessly import subsets of image data from external file sources during querying and processing. Out-of-database storage offers the most practical solution for indexing a large collection of high-resolution satellite images — due to the very large file sizes of original datasets.

To reduce complexity of managing and querying time series data, PostgreSQL / PostGIS databases were further extended with functionality of TimescaleDB. TimescaleDB automatically partitions large time series across multiple data tables for optimal performance — partitioned data tables are accessible as a unified collection via a single virtual view — aka a hypertable.

Optimised transfer of information from the Data Management tier to the Application Logic and Presentation tiers was also a key design consideration, especially given the multi-gigabyte file sizes of high-resolution satellite imagery. Since version 3.1, GDAL has included a driver to create Cloud Optimised GeoTIFF (COG) images: a GeoTIFF-derived format with an internally tiled structure where the byte offset and relative pixel location of individual blocks are recorded within the metadata of the TIFF file header. Parsing the COG header provides sufficient information for the GDAL driver to extract sub-windows of intra-file content pertaining to nominated areas of interest, allowing optimised streaming and progressive rendering of large raster datasets.
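As a rough illustration of the COG conversion step described above, the sketch below assembles a gdal_translate command line that selects the GDAL COG driver. The file names and the choice of DEFLATE compression are assumptions for illustration only; the 256 pixel block size simply mirrors the tiling size used later in this article, and the study may well have used different creation options.

```python
import shlex


def cog_translate_command(src_path, dst_path, blocksize=256, compress="DEFLATE"):
    """Build a gdal_translate command that writes a Cloud Optimised GeoTIFF.

    Requires GDAL >= 3.1 on the machine that eventually runs the command.
    """
    return [
        "gdal_translate",
        "-of", "COG",                     # select the COG output driver
        "-co", f"BLOCKSIZE={blocksize}",  # internal tile size in pixels
        "-co", f"COMPRESS={compress}",    # compression applied to each tile
        src_path,
        dst_path,
    ]


# Hypothetical file names, for illustration only
cmd = cog_translate_command("scene.tif", "scene_cog.tif")
print(shlex.join(cmd))
```

The list returned by the function can be passed directly to subprocess.run on a machine where GDAL is installed.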

The GDAL Virtual File System API provides the second piece of the jigsaw. Leveraging the multiprotocol networking capabilities of the cURL library, GDAL provides seamless access to imagery hosted on private buckets of commercial cloud storage, including Amazon Web Services, Google Cloud Platform and Microsoft Azure. When reading remotely hosted COG images, the GDAL driver utilises HTTP / FTP range requests to parse file headers on-the-fly and retrieve subsets of server-side content.

HTTP / FTP range requests where selected portions of server-side content are requested and forwarded onto clients
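To make the virtual file system idea concrete, the following minimal sketch maps a storage URL onto the corresponding GDAL virtual path prefix (/vsigs/ for Google Cloud Storage, /vsis3/ for Amazon S3, /vsicurl/ for plain HTTP or FTP). The helper name and the example bucket are hypothetical; only the prefixes themselves are GDAL conventions.

```python
from urllib.parse import urlparse

# GDAL virtual file system prefixes for the URL schemes handled here
VSI_PREFIXES = {"gs": "/vsigs/", "s3": "/vsis3/"}


def to_vsi_path(url):
    """Translate a storage URL into a path that GDAL can open directly."""
    parsed = urlparse(url)
    if parsed.scheme in ("http", "https", "ftp"):
        # /vsicurl/ expects the full URL appended after the prefix
        return "/vsicurl/" + url
    prefix = VSI_PREFIXES.get(parsed.scheme)
    if prefix is None:
        raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")
    # For bucket URLs the netloc is the bucket name and the path is the object key
    return prefix + parsed.netloc + parsed.path


# Hypothetical bucket and object names
print(to_vsi_path("gs://example-bucket/pleiades/scene_cog.tif"))
```

A path produced this way can be handed to any GDAL-based tool, provided that credentials for the bucket are configured in the environment.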

The COG format includes optional support for overviews — reduced resolution down-sampled versions of original image — whose content is similarly tiled and encoded as byte offsets in header metadata. Overviews allow COG-aware software to retrieve subsets of image content at specific zoom levels — rather than downloading full resolution imagery and resampling client-side.
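The arithmetic behind overviews can be sketched as follows. The function assumes a common convention in which each overview halves the previous resolution until the whole image fits within a single tile; this is an illustrative rule and not necessarily the exact one applied by the GDAL COG driver.

```python
def overview_factors(width, height, blocksize=256):
    """Return decimation factors (2, 4, 8, ...) for a pyramid of overviews.

    Overviews are added until the coarsest one fits inside a single
    blocksize x blocksize tile.
    """
    factors = []
    factor = 1
    w, h = width, height
    while max(w, h) > blocksize:
        factor *= 2
        # Ceiling division gives the overview dimensions at this factor
        w = -(-width // factor)
        h = -(-height // factor)
        factors.append(factor)
    return factors


# A hypothetical 4096 x 4096 pixel image needs four overview levels
print(overview_factors(4096, 4096))
```

Each additional level adds roughly a quarter of the storage of the level before it, so the full pyramid costs about one third more than the full resolution image alone.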

PostGIS inherits streaming capabilities of the GDAL Cloud Optimised GeoTIFF driver — it is therefore now possible to create a novel spatiotemporal data management and analytics workflow where satellite imagery hosted on cloud-storage buckets is seamlessly loaded and accessed as a collection of out-of-database raster objects.

Visualisation of PostgreSQL-driven spatiotemporal database where cloud-hosted satellite imagery is loaded and accessed as out-of-database raster objects

Application Logic Tier

The Data Management tier supports a powerful SQL-based query engine enabling clients to filter, process and retrieve spatiotemporal objects stored server-side in PostgreSQL data tables. The Application Logic tier was therefore created as a suite of software tools and APIs equipped with functionality to translate high level user requirements — for example, return all satellite imagery acquired in 2019 collocated with a specific point geometry — into custom SQL-based commands. Time-indexed PostGIS spatial objects returned by the server after successfully executing a client request are forwarded to Presentation tier for visualisation purposes. Application Logic tier may also incorporate additional post-processing functionality — for example, GDAL implements drivers to export PostGIS spatial objects to vector and raster file formats.
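The translation from a high level request to SQL might look something like the sketch below, which builds a parameterised query for all raster tiles acquired in a given year over a point location. The table name and the rid, acquired and rast column names are assumptions for illustration; the PostGIS functions used (ST_Intersects, ST_ConvexHull, ST_Transform, ST_SetSRID, ST_MakePoint, ST_SRID) are standard.

```python
from datetime import date


def imagery_query(table, lon, lat, year):
    """Return (sql, params) selecting raster tiles over a point for one year.

    Placeholders use the %s style understood by psycopg2. The table name
    cannot be passed as a parameter, so it is validated before interpolation.
    """
    if not table.isidentifier():
        raise ValueError(f"invalid table name: {table!r}")
    sql = (
        f"SELECT rid, acquired, rast FROM {table} "
        "WHERE acquired >= %s AND acquired < %s "
        # Compare tile footprints rather than pixels, so no image data is read
        "AND ST_Intersects(ST_ConvexHull(rast), "
        "ST_Transform(ST_SetSRID(ST_MakePoint(%s, %s), 4326), ST_SRID(rast))) "
        "ORDER BY acquired"
    )
    params = (date(year, 1, 1), date(year + 1, 1, 1), lon, lat)
    return sql, params


# Hypothetical table name and a point in central London
sql, params = imagery_query("pleiades", -0.1276, 51.5072, 2019)
print(sql)
```

With a live database connection, the pair can be passed to a psycopg2 cursor as cursor.execute(sql, params).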

Tasks forwarded to PostgreSQL server from the Application Logic tier may be relayed interactively by the end user — or programmatically within the scope of a manually driven or automated software process leveraging PostgreSQL client API — for example, Python library psycopg2.

Presentation Tier

QGIS is a free and open-source cross-platform desktop geographic information system (GIS) application that supports concurrent viewing, editing, and analysis of geospatial data. QGIS provides a graphical environment for performing standard GIS workflows — including aggregation of multiple geospatial data layers and publication of georeferenced maps. QGIS inherits core capabilities of GDAL and GEOS libraries — functionality to import / export geospatial file formats and advanced operations to process and analyse vector geometries / raster images. QGIS also provides a client-side interface to PostgreSQL database server — PostGIS spatial objects returned by SQL command are seamlessly imported into QGIS analytics and visualisation environment as vector and raster data layers.

QGIS also supports a Python API where QGIS specific resources and functionality may be imported into standalone scripts and custom applications — for example, automated printing of map views comprising superimposition of raster and vector layers.
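One way to hand a PostGIS raster table to a GDAL-based client is through a GDAL PostGISRaster connection string. The sketch below only assembles such a string; the database, table and column names are hypothetical, and whether the study used this route or another QGIS data provider is not stated in the article.

```python
def postgis_raster_uri(dbname, table, host="localhost", port=5432,
                       schema="public", column="rast", mode=2):
    """Build a GDAL PostGISRaster connection string.

    mode=2 asks GDAL to treat all rows of the table as one mosaicked raster.
    """
    parts = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "schema": schema,
        "table": table,
        "column": column,
        "mode": mode,
    }
    return "PG:" + " ".join(f"{key}='{value}'" for key, value in parts.items())


# Hypothetical database and table names
uri = postgis_raster_uri("gla", "pleiades")
print(uri)
```

Inside the QGIS Python console, a string of this kind could then be passed to QgsRasterLayer(uri, "layer name", "gdal") to create a raster layer, assuming the database credentials are available to GDAL.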

Technologies underpinning architectural tiers of Desktop-GIS prototype framework

Data Resources

Pléiades and SPOT Constellation

To reduce data procurement costs for prototype development and feasibility analysis, the study negotiated access to archives of satellite imagery procured by the UK Space Agency as part of the Space for Smarter Government Programme (SSGP). Assembled to trial Earth Observation driven services within the UK Public Sector, the SSGP archive comprises high resolution optical imagery (Pléiades and SPOT) covering the UK from 2017 onwards.

All datasets were provided as Standard Ortho Bundle products, where the original imagery had been orthorectified during post-processing to reduce the effects of image perspective and terrain, on top of corrections for radiometric and sensor distortion. Utilising functionality of the Orfeo Toolbox (OTB), a Python workflow was implemented to automatically generate pansharpened versions of the original Pléiades and SPOT multispectral images using the co-registered panchromatic band.
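A pansharpening step of this kind can be sketched with the OTB BundleToPerfectSensor application, which fuses a panchromatic image with a multispectral one. The wrapper below only assembles the command line; the file names are hypothetical, and the study's actual workflow may have called OTB differently (for example through its Python bindings).

```python
def pansharpen_command(pan_path, xs_path, out_path, pixel_type="uint16"):
    """Build an otbcli_BundleToPerfectSensor command line.

    -inp is the panchromatic input, -inxs the multispectral input and
    -out the pansharpened output followed by its pixel type.
    """
    return [
        "otbcli_BundleToPerfectSensor",
        "-inp", pan_path,
        "-inxs", xs_path,
        "-out", out_path, pixel_type,
    ]


# Hypothetical file names for one Pléiades acquisition
cmd = pansharpen_command("phr_pan.tif", "phr_ms.tif", "phr_pansharpened.tif")
print(" ".join(cmd))
```

Looping over every panchromatic and multispectral pair in the archive and passing each command to subprocess.run would give an automated batch workflow.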

Original Pléiades Multispectral Imagery at 2.5m pansharpened to 50cm resolution

Maxar SecureWatch

This study also obtained limited access to very high-resolution optical imagery captured by the Maxar constellation of Earth Observation satellites, namely WorldView 1–4 and GeoEye-1. The Maxar constellation collects more than 1 billion km² of Earth imagery per year at sub-metre spatial resolution. Maxar operates on-demand, subscription-based services for access to its image library: SecureWatch provides API, streaming and download access to the DigitalGlobe image archive, as well as the latest imagery acquired by the WorldView constellation with a latency of 48 hours.

A software tool was developed to rapidly query the entire DigitalGlobe catalogue and identify archived datasets satisfying customisable constraints based on area of interest, product type, cloud cover and acquisition date. All datasets hosted in the Maxar image library are accessible as OGC Web Map Tile Service (WMTS) layers, enabling small, highly targeted subsets of very high resolution imagery to be downloaded as a series of 256x256 PNG tiles at a zoom level specified by the end user. Once the download is complete, individual tiles are collated and warped into a single georeferenced Cloud Optimised GeoTIFF image utilising the GDAL API.
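The tile bookkeeping involved in such a download can be illustrated with the widely used Web Mercator tiling scheme (256x256 tiles, origin at the top-left corner). This is an assumption: the tile matrix set actually exposed by the WMTS endpoint used in the study may differ, and the function names below are hypothetical.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Convert WGS84 longitude / latitude to Web Mercator tile indices (x, y)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tiles_for_bbox(west, south, east, north, zoom):
    """List every (x, y) tile index covering a geographic bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return [(x, y)
            for y in range(y_min, y_max + 1)
            for x in range(x_min, x_max + 1)]


# Tiles covering a small area around central London at zoom level 18
tiles = tiles_for_bbox(-0.130, 51.505, -0.125, 51.509, 18)
print(len(tiles), "tiles to request")
```

Each index pair corresponds to one tile request; the downloaded tiles can then be mosaicked and georeferenced from the known extent of the tile grid.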

Using the Google Cloud Client Python API, Cloud Optimised GeoTIFF files were programmatically uploaded to a Google Cloud storage bucket and organised into a sub-directory structure ordered by platform NORAD catalogue number and acquisition datetime.
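A minimal sketch of the upload step is given below. The exact directory layout used in the study is not specified beyond being ordered by NORAD catalogue number and acquisition datetime, so the key format here is an assumption, as are the bucket and file names. The upload helper relies on the google-cloud-storage package and is only imported when called.

```python
from datetime import datetime, timezone


def cog_object_key(norad_id, acquired, filename):
    """Build an object key of the form <norad id>/<UTC timestamp>/<filename>."""
    if acquired.tzinfo is None:
        raise ValueError("acquired must be a timezone-aware datetime")
    stamp = acquired.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{norad_id}/{stamp}/{filename}"


def upload_cog(bucket_name, local_path, key):
    """Upload one local file to a Google Cloud Storage bucket."""
    from google.cloud import storage  # requires the google-cloud-storage package

    client = storage.Client()
    client.bucket(bucket_name).blob(key).upload_from_filename(local_path)


# Illustrative catalogue number, timestamp and file name
key = cog_object_key(40115, datetime(2019, 9, 8, 11, 15, tzinfo=timezone.utc),
                     "scene_cog.tif")
print(key)
```

Keeping the acquisition time in the key means the timestamp can later be recovered from the path alone when the image is registered in the database.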

Data Ingestion

Using the raster2pgsql command line tool, over 3TB of London-centric Cloud Optimised GeoTIFF imagery (SPOT, Pleiades and Maxar) was successfully loaded into PostgreSQL data tables as collections of out-of-database PostGIS Raster objects.

The raster2pgsql loader generates a sequence of SQL commands to import the content of GDAL supported images into a nominated data table. To optimise spatial indexing and data table management, raster2pgsql optionally supports tiling where equally sized blocks of image pixel / metadata information are inserted as separate rows in a PostGIS data table. To ensure alignment between PostGIS raster objects and internal COG structure / metadata, raster2pgsql loader was configured to register cloud-hosted images as out-of-database objects with an equivalent 256x256 pixel tiling size.
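The loader configuration described above can be approximated with the following command builder. The flags shown are standard raster2pgsql options (-R registers the raster out-of-database, -t sets the tile size, -a appends to an existing table, -F adds a filename column); the table name and image path are hypothetical, and the study's exact invocation is not given in the article.

```python
def raster2pgsql_command(raster_path, table, tile_size=256, append=True):
    """Build a raster2pgsql command that registers an out-of-database raster.

    The tool writes SQL to standard output, which is normally piped to psql.
    """
    return [
        "raster2pgsql",
        "-a" if append else "-c",         # append to, or create, the table
        "-R",                             # store metadata and path only
        "-F",                             # add a column holding the file name
        "-t", f"{tile_size}x{tile_size}", # one table row per tile
        raster_path,
        table,
    ]


# Hypothetical cloud-hosted image and target table
cmd = raster2pgsql_command("/vsigs/example-bucket/40115/scene_cog.tif",
                           "public.worldview")
print(" ".join(cmd))
```

Matching the -t value to the internal block size of the COG means that each database row maps onto exactly one tile of the remote file.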

Loading of cloud-hosted COG images into the PostgreSQL database was executed programmatically within the scope of an automated Python workflow. Out-of-database metadata records generated by raster2pgsql loader — augmented with full pathname of underlying COG file and corresponding acquisition timestamp — were inserted into product-specific TimescaleDB hypertables and partitioned into 7-day chunks based on the values of the acquisition timestamp.
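The hypertable set-up can be sketched as two SQL statements generated from Python. The column layout is an assumption made for illustration; create_hypertable and its chunk_time_interval argument are part of the TimescaleDB API.

```python
def hypertable_ddl(table, time_column="acquired", chunk_days=7):
    """Return SQL statements creating a raster table and its hypertable."""
    for name in (table, time_column):
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")
    create_table = (
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "rid serial, rast raster, filename text, "
        f"{time_column} timestamptz NOT NULL)"
    )
    create_hyper = (
        f"SELECT create_hypertable('{table}', '{time_column}', "
        f"chunk_time_interval => INTERVAL '{chunk_days} days', "
        "if_not_exists => TRUE)"
    )
    return [create_table, create_hyper]


# Hypothetical product-specific table
for statement in hypertable_ddl("pleiades"):
    print(statement + ";")
```

With 7-day chunks, a query restricted to a single month only needs to touch four or five child tables rather than the whole archive.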

To optimise loading times, multiple images were registered simultaneously as out-of-database raster objects by executing the raster2pgsql-driven workflow across multiple threads / client connections. Upon successful registration, Google Cloud-hosted image repositories were made seamlessly accessible via the PostgreSQL server, allowing filtering, processing, and analysis of underlying pixel information with the highly optimised PostGIS Raster SQL API.
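The multi-threaded registration pattern might be sketched as follows, using a thread pool from the Python standard library. The register_one callable stands in for whatever function runs the loader for a single image and is purely hypothetical here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def register_all(paths, register_one, max_workers=4):
    """Run register_one(path) for every path across a pool of threads.

    Returns a dict mapping each path to its result, or to the exception
    raised while processing it, so one failure does not stop the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(register_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                results[path] = exc
    return results


def fake_register(path):
    """Stand-in for a real loader call; pretends each image yields 10 tiles."""
    if not path.endswith(".tif"):
        raise ValueError(f"not a GeoTIFF: {path}")
    return 10


outcome = register_all(["a.tif", "b.tif", "notes.txt"], fake_register)
print(outcome)
```

Threads suit this task because each worker spends most of its time waiting on network and database input / output rather than on computation.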

Use Case Analysis

Feasibility of visually tracking construction activity from space (in particular, corroborating commencement and completion dates) was evaluated for a randomly selected subset of permission records extracted from the SQL dump of the London Development Database. For each permission record, SQL views were created of out-of-database raster tiles intersecting a 300m bounding box centred on its underlying POINT geometry and visualised in QGIS as a date-ordered collection of raster data layers.
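A per-permission view of this kind could be generated along the following lines. The sketch assumes the 300m figure refers to the side length of the box, that permission coordinates are held as British National Grid eastings and northings (EPSG:27700), and that the raster table exposes rid, acquired and rast columns; all names are hypothetical.

```python
def site_bbox(easting, northing, size_m=300.0):
    """Return (xmin, ymin, xmax, ymax) of a square box centred on a point."""
    half = size_m / 2.0
    return (easting - half, northing - half, easting + half, northing + half)


def site_view_sql(view, table, easting, northing, size_m=300.0, srid=27700):
    """Build a CREATE VIEW statement selecting tiles that intersect the box."""
    for name in (view, table):
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")
    xmin, ymin, xmax, ymax = site_bbox(easting, northing, size_m)
    envelope = f"ST_MakeEnvelope({xmin}, {ymin}, {xmax}, {ymax}, {srid})"
    return (
        f"CREATE OR REPLACE VIEW {view} AS "
        f"SELECT rid, acquired, rast FROM {table} "
        "WHERE ST_Intersects(ST_ConvexHull(rast), "
        f"ST_Transform({envelope}, ST_SRID(rast))) "
        "ORDER BY acquired"
    )


# Hypothetical permission location and table
print(site_view_sql("permission_1234", "pleiades", 530000.0, 180000.0))
```

Because the view is ordered by acquisition time, loading it into a desktop GIS yields the tiles in the sequence needed for frame-by-frame inspection.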

Screenshots of selected use cases are presented below — please click on image to see an animation of construction project life cycle visualised as time-ordered PostGIS-Raster-based data layers.

WorldView-3 imagery acquired on 8th September 2019 over Box Ridge Road, Purley. Construction of rear extension is clearly well progressed — building materials and waste are visible to the front of the property

Pléiades imagery acquired on 25th February 2019 over Peel Centre development. Construction activity is clearly ongoing — several cranes are installed on-site although residential housing appears occupied

WorldView-4 imagery acquired on 21st April 2018 over Normanton Road, Croydon. Construction of new extension to side of property has clearly commenced — building waste evident to the front and rear

GeoEye-1 imagery acquired on 2nd August 2018 over Mardyke Estate development — roofs of buildings are under construction and heavy machinery / cranes are clearly visible on-site

Preliminary analysis demonstrated prospective future role for satellite image analysis within urban planning and development applications. From unique perspective of Earth-orbit, it is feasible to routinely monitor and quantify changes in urban fabric across entire cities at regular intervals. Combining geospatial information from disparate sources at different spatiotemporal scales — for example, satellite imagery, demographics, housing returns, flood model output, etc. — is vital for informing policy decisions and monitoring sustainability of urban development.

Individual use cases demonstrate the feasibility of monitoring construction project timelines through visual inspection of high-resolution satellite imagery. Identification of key milestones and their approximate dates (for example, commencement of works, laying of groundwork and brickwork, installation of heavy machinery / cranes, roof installation, waste removal and site landscaping) provides useful confirmatory and supplementary information for urban planning office systems to assist visibility and enforcement. Due to the pan-city nature of satellite imagery, it is possible to rapidly review the status of any planning site across the Greater London region, together with its context and impact within the local neighbourhood.

Use case analysis also identified several limitations in resolving high spatial detail in satellite imagery acquired over high-density urban environments. Shadows cast by tall buildings often obscure information content, preventing identification of visual markers indicative of construction activity; the effect is exacerbated during winter months due to the low solar elevation angle. Availability of satellite imagery is also sporadic when monitoring construction project timelines, since optical imagery is often impacted by cloud contamination at temperate latitudes.
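The seasonal effect of shadows can be quantified with simple trigonometry: on flat ground, a building of height h casts a shadow of length h / tan(e), where e is the solar elevation angle. The short sketch below applies this relationship; the 30m building height is an arbitrary example, and the 15 degree elevation approximates the midwinter noon sun at London's latitude (90 minus 51.5 minus 23.4 degrees).

```python
import math


def shadow_length(height_m, solar_elevation_deg):
    """Length of the shadow cast on flat ground by an object of given height."""
    if not 0.0 < solar_elevation_deg <= 90.0:
        raise ValueError("solar elevation must be in the range (0, 90] degrees")
    return height_m / math.tan(math.radians(solar_elevation_deg))


# Example 30 m building under a low winter sun and a high summer sun
winter = shadow_length(30.0, 15.0)
summer = shadow_length(30.0, 60.0)
print(f"winter: {winter:.1f} m, summer: {summer:.1f} m")
```

Under these assumptions the winter shadow is several times longer than the summer one, which is consistent with neighbouring sites being harder to interpret in winter acquisitions.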

Utilising the latest open source technologies, the study successfully demonstrated that high-resolution satellite imagery may be seamlessly imported into existing open-GIS desktop environments for analysis and visualisation via one-to-many client-server model. PostgreSQL database server — combined with PostGIS and TimescaleDB extensions — provides a scalable and highly extensible platform for managing optimised access to geospatial data warehouses and a functionally rich, spatiotemporal analytics API.

Leveraging non-functional benefits of open source approach — transparency, extensibility, security, collaborative-based development, code reusability — whilst minimising vendor lock-in and procurement costs is central to UK government digital strategy. Procurement overheads are a key concern for local government — the material cost of integrating the GIS-based workflow outlined in this study into urban planning operations is effectively zero.

Appendix

Source code, notebooks and data files created within the scope of this project are available to clone from the following GitHub repository: https://github.com/chris010970/gla. Code hosted in this repository implements functionality to programmatically load GDAL-supported datasets into PostGIS data tables and extract subsets of imagery satisfying temporal and spatial constraints using SQL API.