Mapping the world’s coastlines with satellite imagery — Part II
Part I of this blog series is here.
At the UKHO we are interested in the oceans, the seabed and the coastline — not to mention everything in and on them! In our previous blog, we (the UKHO Data Science team) outlined our quest to automate mapping of the world’s coastlines.
From that work, a satellite-derived coastline for the British Isles was created, available for free on the ADMIRALTY Marine Data Portal. Since then we have come across use cases for this data on an almost-daily basis, and the feedback we have received can be summarised into three main themes:
- Can I get satellite-derived coastline data for my local beach/town/country?
- Does the data show mean sea level?
- Can you capture coastline at high tide and low tide?
The answers at the time were maybe, no and maybe, which definitely wasn’t good enough! The feedback themes helped us redefine our approach to continued research and development on this project. In this article, I will describe the progress we have made as we iterate, and hopefully show how we’ve answered (or will answer) these questions.
Initially we created a yearly median-pixel mosaic of Sentinel-2 satellite images and used it as a base from which to extract one representative coastline. This worked well and produced a useful dataset, but deriving coastline in this way has some pros and cons.
- Pro. Averaged mosaics take care of a large amount of the variation intrinsic in satellite imagery — clouds, shadows, saturated pixels and so on. This makes classification a lot easier.
- Pro. You can create a mosaic as large as you want, removing the need to deal with the edges of images, and how to ‘bridge’ data over those edges.
- Con. The chance of assigning a vertical datum (such as mean sea level) to the data is lost in the averaging process used to create the mosaic. The median pixel is chosen in all areas, so the final image is a composite of thousands of images, all taken at different states of tide.
- Con. There is no way of knowing what the tidal state was when any of the images were taken, either by looking at an image or its metadata.
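To make the trade-off concrete, here is a minimal sketch of a median-pixel mosaic using NumPy. It is not our production pipeline: the array shapes, the toy pixel values and the brightness-based cloud mask are all assumptions made for illustration.

```python
import numpy as np


def median_mosaic(stack, valid_mask):
    """Per-pixel median over time, ignoring masked (e.g. cloudy) observations.

    stack:      float array of shape (time, rows, cols) holding one band
    valid_mask: bool array of the same shape, True where a pixel is usable
    """
    masked = np.where(valid_mask, stack, np.nan)  # hide unusable pixels
    return np.nanmedian(masked, axis=0)           # median along the time axis


# Three tiny 2x2 "images" of one band; the 0.90 values stand in for cloud.
stack = np.array([
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.12, 0.90], [0.28, 0.42]],
    [[0.11, 0.22], [0.90, 0.41]],
])
valid = stack < 0.8  # hypothetical cloud mask: very bright pixels are cloud
mosaic = median_mosaic(stack, valid)
print(mosaic)
```

Note that the output has no time axis. Each pixel may come from a different acquisition date, which is exactly why the tidal state cannot be recovered from the mosaic.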
To identify which areas of the mosaic are water and which are land (i.e. to perform image segmentation), we calculate the Normalised Difference Water Index (NDWI) and define a local threshold using Otsu’s method.
Again, this has some pros and cons.
- Pro. Unsupervised classification is quick, easy to understand and simple to implement.
- Pro. It works very well most of the time.
- Con. Built-up areas and (exposed) intertidal zones have NDWI values that are very close to water, resulting in misclassifications.
- Con. Even with Otsu’s method, it’s hard to pick a threshold in an automated way, particularly if there is an imbalanced ratio of land and water pixels.
- Con. The classification method is not robust enough to deploy anywhere in the world without significant manual intervention.
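For readers who want to see the NDWI-plus-Otsu idea in code, the following is a rough sketch rather than our actual classifier. It assumes the McFeeters form of NDWI, (green - NIR) / (green + NIR), and uses a hand-rolled Otsu threshold over a synthetic scene; the reflectance values, noise level and scene layout are invented for the example.

```python
import numpy as np


def ndwi(green, nir):
    """McFeeters NDWI: (green - nir) / (green + nir); water tends to score high."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = green + nir
    # Guard against division by zero where both bands are empty.
    return np.divide(green - nir, total, out=np.zeros_like(total), where=total != 0)


def otsu_threshold(values, bins=256):
    """Return the histogram split that maximises between-class variance."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centres = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                    # pixel count in the lower class
    w1 = w0[-1] - w0                        # pixel count in the upper class
    s0 = np.cumsum(hist * centres)
    m0 = s0 / np.maximum(w0, 1)             # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    # Use the upper edge of the best bin so the split falls between classes.
    return edges[int(np.argmax(between)) + 1]


# Synthetic 40x40 scene: land on the left, water on the right (made-up values).
rng = np.random.default_rng(42)
truth = np.zeros((40, 40), dtype=bool)
truth[:, 25:] = True
green = np.where(truth, 0.08, 0.10) + rng.normal(0, 0.005, truth.shape)
nir = np.where(truth, 0.02, 0.30) + rng.normal(0, 0.005, truth.shape)

index = ndwi(green, nir)
threshold = otsu_threshold(index)
water = index >= threshold
```

On a clean two-class scene like this the threshold is easy to find. The cons listed above show up when the histogram is not cleanly bimodal, for example when built-up areas score close to water or when one class dominates the scene.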
Assigning tidal information to satellite images has been successfully demonstrated by Robbi Bishop-Taylor, Stephen Sagar and colleagues at Geoscience Australia in their work modelling the intertidal zones of Australia. Satellite images can be attributed with a predicted tide height using the time, date and location of the images to query a global, gridded tidal model. Then, by deriving coastline over a long time series of imagery, the generated coastline vectors can be labelled with a predicted tidal height. The result is a picture of the intertidal zone and an estimate of the coastline’s position relative to mean sea level.
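As an illustration of what "querying a gridded tidal model" means, here is a simplified sketch. It stands in for a real model with a toy grid holding two harmonic constituents per cell and sums them as A·cos(ωt − φ). The grid cells, amplitudes, phases, reference epoch and the `predict_tide` helper are all hypothetical, and real predictions also apply astronomical corrections that are left out here.

```python
import math
from datetime import datetime, timezone

# Angular speeds in degrees per hour for two well-known constituents.
SPEEDS_DEG_PER_HOUR = {"M2": 28.9841042, "S2": 30.0}

# Hypothetical 1-degree grid: (lat, lon) cell -> {constituent: (amplitude_m, phase_deg)}.
# Real gridded models hold many more constituents at much finer resolution.
TOY_GRID = {
    (50, -4): {"M2": (1.60, 150.0), "S2": (0.55, 200.0)},
    (51, 1): {"M2": (1.90, 10.0), "S2": (0.60, 60.0)},
}
EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)  # arbitrary reference time


def predict_tide(lat, lon, when):
    """Tide height in metres about mean sea level, from the toy harmonic grid."""
    cell = TOY_GRID[(math.floor(lat), math.floor(lon))]
    hours = (when - EPOCH).total_seconds() / 3600.0
    return sum(
        amp * math.cos(math.radians(SPEEDS_DEG_PER_HOUR[name] * hours - phase))
        for name, (amp, phase) in cell.items()
    )


# Attribute each (made-up) image with a predicted tide height.
images = [
    {"id": "scene_a", "lat": 50.3, "lon": -3.5,
     "time": datetime(2019, 6, 1, 11, 15, tzinfo=timezone.utc)},
    {"id": "scene_b", "lat": 51.4, "lon": 1.2,
     "time": datetime(2019, 6, 6, 11, 5, tzinfo=timezone.utc)},
]
for image in images:
    image["tide_m"] = predict_tide(image["lat"], image["lon"], image["time"])
```

The key point is that only the acquisition time and location are needed, both of which are in the image metadata, so every scene in a long time series can be labelled automatically.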
To choose a tidal model (there are a number of options out there, including TPXO9-atlas, FES2014 and DTU10) we teamed up with Chris Jones and Colin Shepherd from the UKHO Tides Team to assess various models against our own ADMIRALTY TotalTide software. ADMIRALTY TotalTide can be used to predict tidal heights and tidal streams at the locations of over 7,000 tidal stations, distributed globally (7,433 at the time of writing). All of these tidal stations will have been visited in order to take in situ observations of sea level, which are then analysed to derive the underlying data (i.e. the harmonic constituents and/or time and height differences from a reference port) needed to compute a predicted tidal curve.
We could use ADMIRALTY TotalTide to provide the satellite image tide predictions. However, for our purposes a gridded, global model is preferable to estimates from in situ stations because:
- In-situ observed data may not be available at the specific location of the satellite images, thus the ‘next best thing’ is a tidal prediction.
- There may also be a lack of ‘traditional’ tidal prediction stations in the region, which would mean relying on tidal stations that are potentially quite distant from the required area (and therefore potentially unsuitable).
- A tidal model (once validated, with assurance that its predictions are suitable) offers ‘seamless’ tidal predictions over the required region on a regular grid.
We assessed the tidal models against a number of metrics (including root mean square error against ADMIRALTY TotalTide, number of harmonics, ease of automation, resolution and others) and concluded that FES2014 performed best in our tests. FES is available after registration on the CNES data centre website, and comes with a handy Python package.
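The root mean square error metric mentioned above is simple to compute. The sketch below compares two placeholder models against a reference series; the heights and the names `model_a` and `model_b` are invented and do not represent any of the real models or our actual results.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length series of tide heights."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("series must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up tide heights in metres at four times for one station.
reference = [1.20, 0.40, -0.90, -1.50]
candidates = {
    "model_a": [1.10, 0.50, -0.80, -1.40],
    "model_b": [1.50, 0.10, -0.60, -1.20],
}

scores = {name: rmse(series, reference) for name, series in candidates.items()}
best = min(scores, key=scores.get)  # lowest error wins on this metric alone
```

In practice this is repeated across many stations and combined with the other, less easily quantified criteria such as ease of automation.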
To maximise the chance of capturing the tide at every stage of its range, we grab all the imagery for an area from the Landsat 7, Landsat 8 and Sentinel-2 satellites (available in Earth Engine), which returns a 20-year time series. Coastline is detected on each of these images using our original classifier and, following the methodology of Bishop-Taylor et al., we derive 10 coastline contours, covering the intertidal range visible in the satellite image series and attributed with their height relative to mean sea level.
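One plausible way to get from a tide-attributed image series to a fixed number of contours is to group the images into equal-width tide-height intervals and label each group with a representative height. The sketch below shows only that grouping step, under our own simplifying assumptions; the `group_by_tide` helper and the sample heights are hypothetical, and the published methodology should be consulted for the real details.

```python
from statistics import median


def group_by_tide(tides, n_intervals=10):
    """Group tide heights into equal-width intervals of the observed range.

    Returns (median_height, member_indices) for each non-empty interval,
    ordered from the lowest tides to the highest.
    """
    low, high = min(tides), max(tides)
    width = (high - low) / n_intervals or 1.0  # avoid zero width if all equal
    groups = {}
    for index, tide in enumerate(tides):
        # The highest tide sits on the top edge, so clamp it into the last bin.
        interval = min(int((tide - low) / width), n_intervals - 1)
        groups.setdefault(interval, []).append(index)
    return [
        (median(tides[i] for i in members), members)
        for _, members in sorted(groups.items())
    ]


# Made-up predicted tide heights (metres about mean sea level), one per image.
image_tides = [-2.0, -1.9, -1.2, -0.5, 0.0, 0.1, 0.9, 1.5, 2.0]
for height, members in group_by_tide(image_tides):
    print(f"{height:+.2f} m from images {members}")
```

Each group would then yield one coastline, attributed with the group's height. Intervals with no imagery simply produce no contour, which is why coverage depends on what the satellites happened to capture.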
Here are some of the results, overlaid on ADMIRALTY charts.
These data samples are available for free on the ADMIRALTY Marine Data Portal.
To conclude, out of the two tasks identified, we have now integrated tidal information into the coastline-derivation process and can answer two of the three questions:
- Does the data show mean sea level? YES — coastlines are attributed with estimated height relative to mean sea level.
- Can you classify coastline at high tide and low tide? YES — if it has been captured on satellite imagery in the past 20 years, we can classify it.
- Will this work for my local beach/town/country? We are working on it now 😊
So, what’s next?
Up next is the creation of a geo-generalised and temporally-generalised model that performs well on coastlines all around the globe. We have been working for a few months on gathering training data and developing a deep neural network to classify images that will plug into the pipeline, replacing the original classifier.
This work is part of a wider venture into detection of marine and coastal features visible on satellite imagery, such as mangrove forests, kelp and seagrass. We’ve found there are commonalities in these image segmentation tasks, such as the difficulty in creating training data for remote sensing data.
Part III (the classifier strikes back) coming soon!