iOS hardware accelerated 2D terrain visualisation in OzRunways

Rowan Willson
Jan 25, 2022

--

A look at Apple's Accelerate framework

Motivation: I wanted to write this article as there aren’t many resources on how to use the Accelerate framework and its vImage / vDSP libraries, especially regarding how they can be pieced together into a large data pipeline.

Our goal at OzRunways was to build a way of displaying terrain during flight and planning. This is important to a pilot who may be flying in cloud or at night. The terrain overlay should be placed over the map such that it’s intuitive to identify terrain features with minimal map obfuscation. Colours indicate the danger and should quickly fade out to fully transparent when the terrain is more than 1000ft below, which is typically a safe margin.

The problem with terrain is that it has a very large range (−20 to +30,000 ft), and you’re trying to represent this entire range with maybe 256 colours in a gradient. If we divided 30,000ft by 256 we would get around 120ft per (very slight) colour variation; not ideal, and difficult to see small changes in terrain relief. Luckily a pilot is only interested in ±2000ft of their current or planned altitude: primarily terrain immediately below (within 1000ft), plus anything above, which is all an equal threat.

We could try truncating values outside this range, but this would lose detail and make mountain tops appear sliced off. To achieve the desired result, we compress altitudes outside the range of interest using a sigmoid function, mapping all possible terrain values to 0–255 with 128 representing our current altitude. This maximises detail around our altitude while still displaying some relief, so mountain shapes significantly above us remain visible.

Adjust the parameters to change the slope as required. This takes a lot of testing to get right.
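The original mapping code isn't reproduced here; a minimal sketch of the idea, assuming a hypothetical `sigmoidIndex` function and an illustrative `slope` parameter (the real app's curve was tuned by testing, as noted above), might look like:

```swift
import Foundation

/// Maps an altitude difference (terrain elevation minus pilot altitude, in
/// feet) to a 0–255 palette index, with 128 at the pilot's altitude.
/// `slope` controls how quickly detail is compressed away from the pilot;
/// the value here is purely illustrative.
func sigmoidIndex(deltaFeet: Double, slope: Double = 0.002) -> UInt8 {
    let value = 255.0 / (1.0 + exp(-slope * deltaFeet))
    return UInt8(value.rounded())
}
```

Terrain at the pilot's altitude lands on 128, terrain far below approaches 0 (fully faded out) and terrain far above saturates towards 255.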

Another requirement is performance. For adequate real-time performance on the first rendering pass, we need to render a terrain tile in about 200ms. This allows the 5–10 tiles on-screen to render in around 1–2 seconds. For this we will need Apple's Accelerate framework. We later cache these tiles so they can be reloaded in ~10ms.

Initial Data

We start the rendering pipeline with raw SRTM terrain data from NASA. These tiles are 1201×1201 pixels of big-endian Int16 (2-byte) values in row-major order, representing elevation in metres. If you load up some NASA terrain as a grayscale bitmap it looks something like this:

SRTM source data

The first thing we need to do is convert the big-endian Int16 source data to host ordering (little-endian on arm64). Whilst here we also apply some fixed corrections.

As you can see, we really want to use .withUnsafeMutableBytes here to get a pointer and rapidly scan through the bytes. This part is not hardware-accelerated and is one of the slower passes, but is required once to set up the data in the correct endian format. It would be useful if Apple had a hardware-accelerated pass for endian swaps.
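A minimal sketch of such an endian-swap pass follows. The function name and the void-value correction are assumptions (SRTM marks voids with −32768); the app's actual fixed corrections will differ:

```swift
import Foundation

/// Converts raw SRTM bytes from big-endian Int16 to host order, in place.
/// The void-value handling is shown as an example of a "fixed correction";
/// treating voids as sea level is an assumption for illustration only.
func convertToHostOrder(_ data: inout Data) {
    data.withUnsafeMutableBytes { (raw: UnsafeMutableRawBufferPointer) in
        let pixels = raw.bindMemory(to: Int16.self)
        for i in 0..<pixels.count {
            var value = Int16(bigEndian: pixels[i]) // swap on little-endian hosts
            if value == .min { value = 0 }          // SRTM void marker -32768
            pixels[i] = value
        }
    }
}
```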

Hardware Accelerated Rendering pass

One of the most confusing parts of working with Swift pointers is understanding the different pointer types and how they interact. I find it useful to name the pointer variables to match their types.

First up, almost nothing in the Accelerate framework works with 16-bit values, so we need to map this to UInt8 Planar8 data as quickly as possible. We do this by converting to a buffer of UInt16, using a lookup table to map raw terrain values to values in the range 0–255.

The source buffer is defined to point directly at the base address of the original raw terrain data. This allows in-place updating of the data, avoiding unnecessary memory allocations. Next we create a 65k-entry array mapping every possible source value to a destination value, then run the actual lookup pass. There are some clever optimisations you can make with real-world data, especially if your source data does not contain all possible Int16 values. Also consider alignment in your for loop to help the compiler unroll it and use SIMD instructions here.
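A sketch of this table-driven pass, assuming hypothetical `makeLookupTable` and `applyLookup` helpers and a simple sigmoid for the table contents (the real app's curve and corrections differ):

```swift
import Foundation

/// Builds a 65,536-entry table mapping every possible Int16 terrain value
/// (reinterpreted as a UInt16 index) to a 0–255 value. The sigmoid constants
/// here are illustrative only.
func makeLookupTable(pilotAltitudeM: Double) -> [UInt16] {
    var table = [UInt16](repeating: 0, count: 65536)
    for i in 0..<65536 {
        let metres = Double(Int16(bitPattern: UInt16(i)))
        let v = 255.0 / (1.0 + exp(-0.005 * (metres - pilotAltitudeM)))
        table[i] = UInt16(v.rounded())
    }
    return table
}

/// The actual pass: replace each pixel with its table entry, in place.
func applyLookup(_ pixels: UnsafeMutableBufferPointer<UInt16>, table: [UInt16]) {
    for i in 0..<pixels.count {
        pixels[i] = table[Int(pixels[i])]
    }
}
```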

Next we want to reduce our data down to 0–255 UInt8 pixels. We are still operating on raw grayscale pixels, which have now been normalised and compressed to 8 bits per pixel, discarding a lot of information far from the pilot’s altitude via the sigmoid function. We first create a new image format describing our grayscale UInt8 layout (with no alpha channel), then allocate a new buffer to store the grayscale data, and finally vImage performs the hardware-accelerated conversion.
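The exact conversion call isn't shown in the text. One vImage option is vImageConvert_16UToPlanar8, which rescales the full 16-bit range down to 8 bits, so this sketch assumes the earlier lookup table wrote full-range 16-bit values (e.g. index × 257) rather than 0–255 directly; that scaling choice is an assumption:

```swift
import Accelerate

/// Converts a UInt16 buffer into a Planar8 (UInt8) buffer with vImage.
/// vImageConvert_16UToPlanar8 maps 0–65535 down to 0–255.
/// The caller must free(dest.data) when finished.
func makePlanar8(from src: inout vImage_Buffer, width: Int, height: Int) -> vImage_Buffer {
    var dest = vImage_Buffer()
    vImageBuffer_Init(&dest, vImagePixelCount(height), vImagePixelCount(width),
                      8, vImage_Flags(kvImageNoFlags))
    vImageConvert_16UToPlanar8(&src, &dest, vImage_Flags(kvImageNoFlags))
    return dest
}
```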

Progress — the grayscale buffer is looking something like this

Adding Colour

This is where we see some missing pieces of Apple’s Accelerate framework and how we deal with this.

Our desire is to end up with an ARGB8888 buffer where each pixel is 4 bytes, laid out [ARGB ARGB …].

However there is no conversion from Planar8 to ARGB8888 using a lookup table. There is, however, a vImage function for Planar8 to PlanarF (32-bit float). So we will trick it by creating a mapping from 8-bit values to 32-bit floats, where each float simply carries the bit pattern of the ARGB8888 value we need. We do this by extending Pixel_F with a custom initialiser taking four UInt8 RGBA values:
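The original gist is missing here; a minimal sketch of such an initialiser (Pixel_F is a typealias for Float) assuming a little-endian host, might look like:

```swift
import Accelerate

/// Packs four 8-bit channel values into the Float's bit pattern so that, in
/// memory on a little-endian host (arm64), the bytes land in A, R, G, B order.
extension Pixel_F {
    init(a: UInt8, r: UInt8, g: UInt8, b: UInt8) {
        // Little-endian: the lowest byte of the bit pattern is stored first,
        // so place A in the low byte and B in the high byte.
        let bits = UInt32(b) << 24 | UInt32(g) << 16 | UInt32(r) << 8 | UInt32(a)
        self = Pixel_F(bitPattern: bits)
    }
}
```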

We can then create a lookup table that maps from our normalised greyscale terrain pixel values (0–255) into a nice 32-bit ARGB colour pixel. The colorPallet is an array of 256 colours you have selected, e.g. a gradient from green to red.

To actually do the grayscale-to-colour conversion: we define the ARGB (alpha channel first) buffer format and make a new allocation for it; don’t forget to release this when finished. Then the lookup pass does the work, mapping our raw grayscale UInt8 pixels to colourful 32-bit ARGB pixels using the table of custom Pixel_F values.
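A sketch of that pass, using the real vImageLookupTable_Planar8toPlanarF function; the `colourise` wrapper and the assumption that `colorPallet` already holds bit-pattern-packed Pixel_F values are mine:

```swift
import Accelerate

/// Maps a Planar8 grayscale buffer to 32-bit ARGB pixels in one vImage pass.
/// vImage thinks the output is PlanarF, but each "float" really carries an
/// ARGB8888 bit pattern. Caller must free(returned.data).
func colourise(gray: inout vImage_Buffer, colorPallet: [Pixel_F]) -> vImage_Buffer {
    precondition(colorPallet.count == 256)
    var argb = vImage_Buffer()
    vImageBuffer_Init(&argb, gray.height, gray.width, 32, vImage_Flags(kvImageNoFlags))
    vImageLookupTable_Planar8toPlanarF(&gray, &argb, colorPallet,
                                       vImage_Flags(kvImageNoFlags))
    return argb
}
```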

Looking better so far, but not quite there yet

Full Pipeline

This is where things get tricky and we need to step away from code into the pipeline. By this stage you should be familiar with creating buffers, image formats and using vDSP and vImage on buffers. We are up to the first yellow box.

Full terrain image pipeline

The newly created [ARGB] buffer needs to be de-interleaved into its component A, R, G and B channels as separate planar buffers. The Alpha channel is set aside and merged back in later.

The RGB channels then need to be converted to floats so we can perform math on them (unfortunately you cannot subtract UInt8 buffers from one another), so they first get converted to Float32 using vDSP. At this point, a shadow buffer created elsewhere (more on this later) is subtracted from each R G B channel to darken edges. OzRunways actually does multiple subtraction phases on multiple shadow buffers here to produce hillshade edges in all directions, which makes the final image look a lot nicer.
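The widen-and-subtract step can be sketched with two vDSP calls; the `darken` wrapper is hypothetical, and row padding is ignored for clarity (assume rowBytes equals width):

```swift
import Accelerate

/// Widens a Planar8 channel to Float32, then subtracts the shadow buffer.
func darken(channel: UnsafePointer<UInt8>, shadow: UnsafePointer<Float>,
            count: Int, into result: UnsafeMutablePointer<Float>) {
    // UInt8 -> Float32, so the subtraction below can go negative safely.
    vDSP_vfltu8(channel, 1, result, 1, vDSP_Length(count))
    // Note vDSP_vsub computes C = B - A, so: result = result - shadow.
    vDSP_vsub(shadow, 1, result, 1, result, 1, vDSP_Length(count))
}
```

The reversed operand order of vDSP_vsub (C = B − A) is a classic trap worth double-checking.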

The R G B float buffers are then clipped to bring them back to 0–255 range (in case subtraction underflowed), converted back to Planar8 (UInt8) then re-interleaved back to CGImage supported ARGB format.
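The clip-and-narrow step might look like the sketch below (the `clipToPlanar8` wrapper is an assumption); re-interleaving the resulting A, R, G, B planes is then one call to vImageConvert_Planar8toARGB8888:

```swift
import Accelerate

/// Clips a float channel back into 0–255 (in case subtraction underflowed)
/// and narrows it to UInt8, ready for Planar8 re-interleaving.
func clipToPlanar8(_ floats: UnsafeMutablePointer<Float>, count: Int,
                   into bytes: UnsafeMutablePointer<UInt8>) {
    var lo: Float = 0, hi: Float = 255
    vDSP_vclip(floats, 1, &lo, &hi, floats, 1, vDSP_Length(count))
    vDSP_vfixu8(floats, 1, bytes, 1, vDSP_Length(count)) // truncates toward zero
}
```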

Optionally the image can be scaled (for example if zoomed out), then made into a CGImage which can be displayed on-screen or converted to PNG to cache to disk.

In total there might be around 30 vDSP and vImage passes through moderately sized buffers (1–6 MB) in memory, plus some CPU operations creating the mapping tables. The Apple Accelerate framework passes are VERY fast — mostly around 2–10ms each.

Shadow Buffer(s)

The actual hill-shading math is heavily inspired by this fantastic article, which I highly recommend you read:
https://www.staridasgeography.gr/kilauea-volcano-shaded-relief-or-edge-detection/

When we create a shadow buffer, we want it to be mostly black for flat terrain areas, and greyscale for steep slopes. These values will be subtracted from the source R G B channels to darken them.

To do this, we use a 3×3 convolution kernel that finds gradients. The kernel is passed over the image, and each source pixel is replaced by the sum of the kernel weights multiplied by the surrounding pixels. Because our kernel’s centre point is 0, the source pixel itself drops out of the output, and you’re left with only a slope value. Where the kernel overlaps the edge of the source data we specify kvImageEdgeExtend, which seems to be fast with nice results.
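A sketch of such a pass with vImageConvolve_Planar8; the kernel weights shown are illustrative (any gradient kernel with a zero centre works), not the app's actual values:

```swift
import Accelerate

/// Runs a 3x3 gradient kernel over a Planar8 height image to produce a shadow
/// buffer. Flat terrain sums to ~0 (black); negative results clamp to 0.
func makeShadowBuffer(src: inout vImage_Buffer, dest: inout vImage_Buffer) {
    let kernel: [Int16] = [-2, -1,  0,
                           -1,  0,  1,   // centre weight is 0: source pixel drops out
                            0,  1,  2]
    vImageConvolve_Planar8(&src, &dest, nil, 0, 0,
                           kernel, 3, 3,
                           1,   // divisor
                           0,   // background colour (unused with EdgeExtend)
                           vImage_Flags(kvImageEdgeExtend))
}
```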

The output from the convolution is mostly values close to zero, but where a steep edge gradient exists in the source data, it will have a larger integer value. This shadow buffer is subtracted from the RGB channels individually to darken those parts of the image. This creates a nice but computationally inexpensive hill-shade effect.

Final image with optimised colour map, alpha fade-out to the base map and 3-pass hill shading

Animation

In the production app, when the pilot altitude changes by 100ft, we render a new set of tiles for the new altitude. This is animated slightly to avoid a sharp change up/down to the next discrete step.

Once the app detects the pilot is aligned with a runway in the process of landing (or taking off), it freezes the terrain at a higher altitude to avoid turning the entire map into a sea of red, while still providing some useful context of surrounding terrain.

The animation below shows stepping from 3000ft to 2000ft in planned altitude. As the pilot changes their planned cruise altitude lower and lower towards the Adelaide hills, the terrain goes from ‘caution’ yellow (within 1000ft of the pilot) to red, meaning danger.

This is useful as it allows the pilot to see at a glance their critical point of interest, off-route terrain (“probably shouldn’t turn East if I have a problem…”) and the highest obstacle enroute.

The final result below is a real world plan from a coastal town (Moruya) to Canberra. The highest terrain point (and man-made obstacle) is displayed within a 5 Nautical Mile (NM) buffer of the planned track, and the terrain colour and transparency are carefully selected by the app to be able to visualise the highest critical ridge of interest in the flight path. A lowest safe altitude of 5900ft is presented as an option for the pilot to use (or adjust) after checking spot heights on the underlying topographical map. In reality a pilot would plan and fly this route at 6000ft.

Without OzRunways and this visualisation tool, a pilot would typically use the published GRID Lowest Safe altitude, which encompasses the higher mountains to the west, or try to manually find the critical spot height along the route, which is at risk of human error (the results of which can be catastrophic). Using the easier-to-find GRID safety height would result in the pilot flying at 8000ft instead of 6000ft. During winter months in Australia this puts the aircraft into the freezing level, at risk of icing and associated hazardous conditions, or simply unable to descend below cloud and see the ground, which is often a problem for emergency rescue helicopters operating into the mountains.

By giving the pilot a simple tool that assists them in calculating the lowest safe altitude, with beautiful hardware-accelerated visualisations, they are now able to plan faster, minimise human error, increase situational awareness and potentially fly below dangerous icing conditions.

Tips

  • Reuse buffers: don’t create a new buffer when you can reuse an old one
  • Free buffers using defer if you have multiple return points, otherwise you will leak memory
  • If you don’t need to create a new destination buffer, just use the source one and process in-place where possible (where output is same size as input)
  • Use the smallest data size possible. If you only need 0–255 values, then only use a UInt8. It’s less data to process.
  • Avoid CPU operations as much as possible. If you must, use raw pointers to traverse large data sets. You might think it’s faster to do 3 passes through the large data set and do all operations at once on the CPU, rather than 30 hardware passes (one per operation), but you’d (often) be wrong!
  • Profile, profile, profile in Time Profiler.
  • Be careful about memory layout of your data. Creating a buffer of 1201x1201 may allocate memory with junk at the end of each row out to the nearest alignment value. If you perform operations like vDSP math on this, you have some junk in there. You have been warned!
