Making Ghent Theft Auto, with heightmaps. 🗻

Lucas Selfslagh
Published in Ghent Theft Auto
Apr 30, 2020 · 11 min read
Process shot of 211 sq km in a Unity scene (yes, that is the Harbor DLC content you’re seeing up there)

It’s been a while since I’ve ranted about my monster pet project. For those of you who have been craving a new article, I’ve got a real doozy this time.

Today, we’re talking heightmaps and why they are the vastly superior choice when doing terrain in realtime engines. Fasten your seat belts and maybe even grab some snacks, because it’s gonna get real old skool up in here. 😎

Background

First things first. I know, I know: I’m a lucky guy. 🍀

I don’t have to work with 30 m accuracy Shuttle Radar Topography Mission data, not even with the 5 m per pixel/vertex Digital Elevation Model Flanders (a complete LIDAR scan of Flemish territory that will be used to extend the map even further later on).

God, how I miss the Space Shuttle.

No, I have the luxury to work with 4 m pixel accuracy Gent in 3D scans. Whilst I’m very thankful to have access to this data, that’s about where my luck runs out. ☘

The Gent in 3D dataset has a few peculiar design choices, causing the data to not actually be as open as it could be, and frankly: quite unusable in a realtime environment, such as Unity or the Unreal Engine.

Gent in 3D decided to share their dataset in a set of .dwg files. This format is proprietary to Autodesk software, and while it is arguably the absolute standard in urban and architectural disciplines, it is also kinda close to the least open-source file format in 3D.

You literally have to download an Autodesk product to be able to view and edit these files in a meaningful way. If you know me, that’s what I’d call a “big no-no”. 🙅‍♂️

Anyhow, I’m totally a student so let’s have a look inside these .dwg files with my trusted 3DS Max 2020 Student version!

The first caveat is one that’s pretty easy to spot with the naked eye: every file contains 64 meshes with no apparent benefit at all. That’s 13293 redundant OnTransformUpdate calls we can spot on file load. This is gonna be good.

Gross. 🤮

Go ahead, call me pedantic and argue that a few more transforms and vertices don’t incur that much of a cost.

Consider that any CPU/GPU call that is stripped out at the source level is a saving replicated across the board 211 times. Every time and everywhere the data is accessed. Ever.

Remove 240 faces? You’ve just wiped 50640 faces! Eliminate a transform, and it won’t ever be updated again, in any future client or editor session, at any given frame rate, literally: ever. 🤗
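For the skeptics, here is that multiplication as a tiny sketch. The 63-per-tile figure is my own reading of the numbers above (64 meshes merged into one leaves 63 redundant transforms), and the function name is made up for illustration.

```python
TILE_COUNT = 211        # terrain files in the dataset
MESHES_PER_TILE = 64    # meshes found in every .dwg file


def saved_across_dataset(per_tile_saving: int, tiles: int = TILE_COUNT) -> int:
    """Scale a saving made in one tile up to the whole dataset."""
    return per_tile_saving * tiles


# Assumption: merging 64 meshes into 1 leaves 63 redundant transforms per tile.
redundant_transforms = saved_across_dataset(MESHES_PER_TILE - 1)
wiped_faces = saved_across_dataset(240)

print(redundant_transforms)  # 13293
print(wiped_faces)           # 50640
```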

I believe this work will earn me the respect and love of many computers, especially: mine.

Disregarding the redundant object count, there are a few other issues preventing optimal use in real-time environments.

One of the 64 tiles (in one of the 211 tiles) being stripped of its redundant data.

The meshes lack consistent normal info, sometimes pointing down into the Earth’s crust. Smoothing groups have been applied to these meshes, further invisibly increasing an already intolerable vertex count (12 million vertices).

Hate to say it, but this is deeply uncool.

Getting into the bad stuff now: the meshes aren’t even sized to 1000 x 1000 meters, as advertised. As a direct result, my biggest pet peeve: all the origins are off in the range of multiple meters. 🤬

This stuff may seem trivial if you perhaps haven’t worked on this huge scale before, but trust me: you can’t count on it, let alone build applications on it.

Sorry, we’re just getting started: in my (strong) opinion, it doesn’t make a whole lot of sense to store terrain in mesh formats, even if the file format is completely open source.

Before everybody’s all, “Yeah, classic Lucas Selfslagh, again with the blind ripping on industry standards, what an obnoxious moron”, just hear me out for a second! 🙋‍♂️

The mesh data structure is a beautiful thing, which — like all beauty in the world — I know not nearly enough about. As far as this article is concerned, meshes are containers for various data types: positions, triangles, normals, uv coordinates, colors, etc.

They’re also extremely important for both rasterized and path-traced rendering. When you hear dudes and dudettes talking about computer graphics, they’re most of the time talking about rendering meshes. So, the mesh is a great and versatile way to describe the shape and appearance of something as basic as a simple cube or quad, to something as complex as a point cloud scan of a historical site. Why, even the lunar surface.

Cool thing about geodata skills: if humanity finally starts settling the Moon, just switch up the radius in your code et voila! You’re now a leodata engineer.

Meshes have also been around for a long time! I own a book about computer graphics that’s older than I am and even that thing has stuff about meshes in it.

I know: this is kinda intense, right? No sweat, we’re getting to the core of it: meshes are completely unnecessary for what we’re trying to do here, which is render beautiful, accurate, real-world terrain. Sure, they’re flexible but they also bring way too much overhead along with them.

Instead of wasting CPU cycles and precious build size on error-prone data + wildly conflicting behavior across various 3D environments (looking at you, left-handed vs. right-handed and Y-up vs. Z-up), we could also just cut the literal crap and start focusing on the only thing we care about:
the height, am I right?

How I often feel talking about these topics.

Those flipped normals I talked about earlier? Screw ‘em, already gone! Are you the one that fancied managing UV channels for 211 irregular shaped terrain meshes? Yeah, me neither. Do we need smoothing groups on flat Low Countries terrain? Should we continue to let 16-bit mesh restrictions dictate the 4 m resolution of our terrain? Let’s say we wanted to provide level of detail to these meshes. Yeah.
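About that 16-bit restriction: here is a back-of-the-envelope sketch of how such a limit could lead to roughly 4 m spacing. It assumes a regular square grid per 1000 m tile and a 65,535-vertex cap for meshes with 16-bit indices (the figure Unity documents). That is my reading of the situation, not something the dataset documentation confirms.

```python
import math

MAX_VERTICES_16BIT = 65_535   # assumed cap for a mesh with 16-bit vertex indices
TILE_SIZE_M = 1000.0          # advertised tile size


def finest_grid_spacing(tile_size_m: float, max_vertices: int) -> tuple[int, float]:
    """Return (vertices per side, spacing in metres) for the densest square grid."""
    side = math.isqrt(max_vertices)   # largest n with n * n <= max_vertices
    cells = side - 1                  # n vertices span n - 1 cells
    return side, tile_size_m / cells


side, spacing = finest_grid_spacing(TILE_SIZE_M, MAX_VERTICES_16BIT)
print(side, round(spacing, 2))  # 255 3.94
```

A heightmap has no such cap: a larger image simply means finer spacing.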

I guess what I’m saying is, fall in love with your hardware a little. Start caring about all the operations the things have to perform in order to draw your graphics and process your data.

Writing optimized code is one thing, but also pay attention to application load and think about storing and handling only the data you really need. Reserve as many resources as possible for doing meaningful crunch at runtime.

Because, between you and me, I have strong suspicions that all my computers are a lot better at crunching than I am. 🤫

Ok. We have total and complete knowledge about various aspects of the shape we want to represent, such as:

  • the size of our terrain slices
  • their real world locations and neighbors
  • the topology of every point in the two-dimensional grid
  • the minimum and maximum elevation
“Look ma, no mesh!”

As you see, we can already define a lot of terrain without having to deal with Autodesk products, closed-source proprietary files, redundant mesh data, vertex-duplicating smoothing groups, right vs left-handedness, up- and forward-axis, etc.
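To make that concrete, here is a minimal sketch of what a mesh-free tile description could look like. The class, its field names and the 257-pixel resolution are all hypothetical. The real pipeline lives in Unity, so treat this as an illustration of the idea rather than the actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerrainTile:
    """Everything we know about a terrain slice, without a single vertex."""
    col: int             # position of the tile in the 2D grid of tiles
    row: int
    size_m: float        # edge length of the (square) tile in metres
    resolution: int      # heightmap pixels per side (hypothetical value below)
    min_elev_m: float    # elevation represented by the darkest pixel
    max_elev_m: float    # elevation represented by the brightest pixel

    @property
    def origin(self) -> tuple[float, float]:
        """World position of the tile corner, derived from the grid index."""
        return (self.col * self.size_m, self.row * self.size_m)

    @property
    def pixel_size_m(self) -> float:
        """Distance between two neighbouring height samples."""
        return self.size_m / (self.resolution - 1)

    def neighbors(self) -> dict:
        """Grid indices of the four adjacent tiles."""
        return {
            "west": (self.col - 1, self.row),
            "east": (self.col + 1, self.row),
            "south": (self.col, self.row - 1),
            "north": (self.col, self.row + 1),
        }


tile = TerrainTile(col=3, row=2, size_m=1000.0, resolution=257,
                   min_elev_m=4.0, max_elev_m=29.5)
print(tile.origin, tile.pixel_size_m, tile.neighbors()["north"])
```

Note that origins and neighbors are computed from the grid index instead of being stored, so they can never drift by "multiple meters".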

We don’t need to store redundant state, we don’t need to waste cycles and bandwidth reading and writing the same numbers between hardware components. We don’t need more numbers, we need better logic. 🧠

Logic does need a data structure to operate on. Let’s piggyback on a better one: two-dimensional arrays!

I am of course talking about images. Images don’t have all the nasty numbers that I talked about above. What images do have is a certain size, readily accessible by the dimensions of the 2D pixel array.

Yo dawg.

So yo dawg, when you take images, which are two dimensional arrays, and place them inside two-dimensional arrays of their own: those two dimensions are offloaded to the computer, so we never have to think about that stuff for the rest of our lives. ✌

We can write logic that turns pixels and coordinates into interchangeable entities, merging their behavior. This allows us to solely worry about one thing: a single friggin’ 16-bit float describing the height at a certain coordinate, anywhere in Ghent. Uh I mean, in Belgium. Any pixel! The globe. Literally, whatever! Go talk to my computer! 😁
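A rough sketch of that pixel/coordinate equivalence, assuming square 1000 m tiles laid out on a regular grid starting at (0, 0) and a hypothetical 257 pixels per side. The function names are mine.

```python
import math

TILE_SIZE_M = 1000.0
RESOLUTION = 257  # hypothetical heightmap pixels per side


def world_to_pixel(x: float, y: float):
    """Map a world coordinate to ((tile col, tile row), (pixel x, pixel y))."""
    col = math.floor(x / TILE_SIZE_M)
    row = math.floor(y / TILE_SIZE_M)
    scale = (RESOLUTION - 1) / TILE_SIZE_M       # pixels per metre
    px = round((x - col * TILE_SIZE_M) * scale)  # nearest pixel inside the tile
    py = round((y - row * TILE_SIZE_M) * scale)
    return (col, row), (px, py)


def pixel_to_world(tile, pixel):
    """Inverse mapping: tile index plus pixel index back to a world coordinate."""
    (col, row), (px, py) = tile, pixel
    metres_per_pixel = TILE_SIZE_M / (RESOLUTION - 1)
    return (col * TILE_SIZE_M + px * metres_per_pixel,
            row * TILE_SIZE_M + py * metres_per_pixel)


tile, pixel = world_to_pixel(3500.0, 2250.0)
print(tile, pixel)                  # (3, 2) (128, 64)
print(pixel_to_world(tile, pixel))  # (3500.0, 2250.0)
```

Once both directions exist, "which pixel is this coordinate?" and "which coordinate is this pixel?" become the same question.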

Low-resolution color-corrected preview of one of the 211 .raw heightmaps.

(Do mind, both approaches ultimately end up as meshes when it’s rendering time. Game engines simply perform better and provide more flexibility when storing and handling terrain using heightmaps.)

Splatmaps

Another benefit of image-based terrain is the ability to mask your terrain using colors, resulting in things called splatmaps. Splatmaps enable you to control what layer is rendered where, through the use of a specialized shader that supports instanced drawing.

In traditional game design, these splatmaps are painted in a process which takes a lot of time, by skilled artists who in turn take a lot of moolah. 💸

There’s the option of masking the terrain by using slope-based and height-based rules, which you might know from software such as WorldMachine. Those geometry-based rules ultimately won’t account for every characteristic little patch of grass, wheat, forest, river bed, gravel, etc. in the Ghent area. A critical requirement for Ghent Theft Auto.
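For reference, this is roughly what such geometry-based rules boil down to. The thresholds and layer names are invented for the example and are not taken from WorldMachine or from this project. The heights grid must be at least 2 x 2.

```python
import math


def slope_degrees(heights, x, y, pixel_size_m):
    """Slope at a pixel from central differences, clamped at the grid edges."""
    h, w = len(heights), len(heights[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    dzdx = (heights[y][x1] - heights[y][x0]) / ((x1 - x0) * pixel_size_m)
    dzdy = (heights[y1][x] - heights[y0][x]) / ((y1 - y0) * pixel_size_m)
    return math.degrees(math.atan(math.hypot(dzdx, dzdy)))


def classify(height_m, slope_deg):
    """Pick a terrain layer from made-up slope and height thresholds."""
    if slope_deg > 30.0:
        return "rock"
    if height_m < 5.0:
        return "river_bed"
    return "grass"


def rule_based_splat(heights, pixel_size_m):
    """Assign a layer name to every pixel of a heightmap (heights in metres)."""
    return [
        [classify(heights[y][x], slope_degrees(heights, x, y, pixel_size_m))
         for x in range(len(heights[0]))]
        for y in range(len(heights))
    ]


ramp = [[10.0, 14.0, 18.0]] * 3   # rises 4 m per 4 m pixel, so a 45 degree slope
flat = [[10.0, 10.0], [10.0, 10.0]]
print(rule_based_splat(ramp, 4.0)[1][1], rule_based_splat(flat, 4.0)[0][0])
```

Rules like these only ever see the shape of the ground. They have no idea where the wheat field ends and the forest begins.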

“It’s all code. If you listen closely, you can hear the numbers.”

Since I’m quite fond of bolting tech and data together in hacky ways, as it has been my job for the last five years, I’ve devised a method that makes sure I never paint a single splatmap pixel in my life. 🙃

It wasn’t that hard to come up with, since we’ve just turned coordinates and pixels into basically the same thing, right? We can now request data using pixels and paint textures using coordinates!

Don’t mind the gap(s).

The resulting terrain layers can be linked to different materials, allowing wildly different surfaces to be rendered in a lightweight way (up to 8 different materials in HDRP, and I believe Unreal Engine can handle even more with DX12 iirc).

Both engines support the instancing of 3D objects, creating the appearance of dense forests and rocky deserts, without the need for placing every single 3D object in a natural way. Using brush tools, artists can paint 3D instances onto the terrain. But, again, you still have to paint every pixel.

Well, you don’t have to when you can just ping some server for a shape file containing all the ‘forest’ landuses, right? 💅
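A minimal sketch of the idea, assuming the landuse polygon has already been downloaded and converted to the same map coordinates as the terrain (the fetching and parsing is left out). It uses a plain even-odd point-in-polygon test per pixel, which is one simple way to do it and not necessarily what the project uses.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_landuse(polygon, origin, pixel_size_m, resolution):
    """Build a 0/1 splatmap channel: 1 where a pixel's coordinate lies in the polygon."""
    ox, oy = origin
    return [
        [1 if point_in_polygon(ox + px * pixel_size_m,
                               oy + py * pixel_size_m, polygon) else 0
         for px in range(resolution)]
        for py in range(resolution)
    ]


# Hypothetical 'forest' patch, in metres relative to the tile origin.
forest = [(2.0, 2.0), (10.0, 2.0), (10.0, 10.0), (2.0, 10.0)]
mask = rasterize_landuse(forest, origin=(0.0, 0.0), pixel_size_m=4.0, resolution=4)
for row in mask:
    print(row)
```

The same mask can drive both a terrain layer and the placement of instanced trees.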

feelsgoodmap_foliage_map.jpeg

Results

Back in the UE4 days, I was already using heightmaps but I didn’t know how to code. My terrain building process entailed manually loading the entire 2GB dataset into a 3D package, rendering a 16K pixel heightmap that I imported into WorldMachine, to slice into sections and color in using predefined masks.

A very labor-intensive and error-prone process. Nobody would want to keep track of an area that’s over 200 square kilometres for minute changes in grass patches, right? I kept scribbled notes of import settings and marked my progress on printed versions of the map. I wasn’t exactly looking forward to rebuilding the terrain and I kept thinking that there should be a way to automate it all, if only I knew how.

Even though building my terrain pipeline in Unity took me the better part of three (admittedly eventful) days, it has already saved the project thousands of hours.

The new visualization mode from Unity’s Terrain Tools package, still in preview.

Not only does Ghent Theft Auto now have a light-weight, fully automated terrain pipeline, all Cartesian coordinates in our map now have an elevation, wherever they’re coming from.

Sampling the terrain at a coordinate and interpolating the returned float between the minimum and maximum elevation is way cheaper than raycasting onto mesh colliders using non-deterministic real-time physics systems. You can even perform the math outside of a game engine.
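Here is what that math could look like outside of a game engine. This sketch assumes the heightmap pixels are stored as unsigned 16-bit integers (a common .raw convention) and uses bilinear interpolation between the four surrounding pixels. Both are assumptions on my part, and the coordinate is expected to lie inside the tile.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def sample_elevation(pixels, x_m, y_m, tile_size_m, min_elev_m, max_elev_m):
    """Elevation in metres at a tile-local coordinate (0 <= x_m, y_m <= tile_size_m).

    pixels is a square grid (list of rows) of 16-bit values, 0..65535.
    """
    res = len(pixels)
    fx = x_m / tile_size_m * (res - 1)   # continuous pixel coordinates
    fy = y_m / tile_size_m * (res - 1)
    x0 = min(int(fx), res - 2)           # clamp so x0 + 1 stays inside the grid
    y0 = min(int(fy), res - 2)
    tx, ty = fx - x0, fy - y0
    row0 = lerp(pixels[y0][x0], pixels[y0][x0 + 1], tx)
    row1 = lerp(pixels[y0 + 1][x0], pixels[y0 + 1][x0 + 1], tx)
    normalized = lerp(row0, row1, ty) / 65535.0
    return lerp(min_elev_m, max_elev_m, normalized)


# Tiny 2 x 2 tile: lowest elevation on the west edge, highest on the east edge.
pixels = [[0, 65535],
          [0, 65535]]
print(sample_elevation(pixels, 500.0, 500.0, 1000.0, 0.0, 30.0))  # 15.0
```

No physics scene, no colliders, no ray: a handful of multiplications per query.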

The new pipeline added an accurate and efficient way to query the elevation of millions of Cartesian coordinates, whether they’re points of interest or paths for the AI to navigate on. We can just throw 2D points at the map, and have them hug the terrain, adding so freaking much to the visual appearance of this project. 🥰

And now, the numbers!

Guess which one’s the heightmaps.

I’m very glad to say I was able to shrink the size on disk roughly fourfold, whilst correcting the entire dataset and enhancing it using data from various other sources.

The entirety of a Ghent Theft Auto build is still way below the 2 GB dataset of the terrain. 👏

Workflow

So far, you’ve seen me throwing shade at meshes and humble-bragging about my geospatial skills. Let’s go through the entire pipeline from start to finish!

  • batch download of all zips from Open Data Gent 💾
  • extract and gather all terrain .dwg files in a single directory 🛒
  • batch import .dwg files and convert into .fbx files 🦾
  • unify the normals, make them point up 🙆‍♂️
  • merge all duplicate vertices and drop vert count by 1 million 🙄
  • fix the pivot points with renewed 1 cm precision 🕵️‍♂️
  • start doing raycasts over the terrain to sample the mesh height 🛸
  • store the heights and inverse interpolate between the min/max height 📊
  • stitch prominent hole edges, by scanlining and backfilling black edges 🤷‍♂️
  • query geoservers for data pertaining to areas inside our map 🧐
  • project certain areas (e.g. forests) into the terrain data 🌳
  • project points of interest onto the terrain 📍
  • project the roads down onto the terrain 🛣

Add lots and lots of coffee in between the above steps and that’s it! 💃
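For the curious, the "store the heights" and "backfill" steps could be sketched like this. It assumes the raycasts return a height in metres per sample, or None where the ray fell through a hole, and that the output is a little-endian unsigned 16-bit .raw file. The function names and the nearest-neighbour fill strategy are mine, not a description of the actual Unity code.

```python
import struct


def inverse_lerp(a, b, value):
    """Where value sits between a and b, as a 0..1 fraction."""
    return 0.0 if b == a else (value - a) / (b - a)


def backfill_holes(row):
    """Fill None samples on one scanline with the nearest valid neighbour."""
    filled = list(row)
    last = None
    for i, h in enumerate(filled):              # left to right
        if h is None:
            filled[i] = last
        else:
            last = h
    nxt = None
    for i in range(len(filled) - 1, -1, -1):    # right to left, for leading holes
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled


def encode_raw(rows, min_elev_m, max_elev_m):
    """Normalize heights to 0..65535 and pack them as little-endian 16-bit values."""
    out = bytearray()
    for row in rows:
        for h in backfill_holes(row):
            if h is None:                       # the whole scanline was a hole
                h = min_elev_m
            t = min(max(inverse_lerp(min_elev_m, max_elev_m, h), 0.0), 1.0)
            out += struct.pack("<H", round(t * 65535))
    return bytes(out)


scanline = [None, 0.0, 10.0, None, 30.0]        # metres, with two holes
data = encode_raw([scanline], min_elev_m=0.0, max_elev_m=30.0)
print(struct.unpack("<5H", data))  # (0, 0, 21845, 21845, 65535)
```

The minimum and maximum elevation have to be stored next to the file, since they are the only way to turn the pixels back into metres.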

If you like what you’re seeing and would like a more steady stream of development progress and updates: I am actively debating whether I should bite the bullet and try to get funding for this project, given these exceptional times, where the citizens of Ghent, and Flanders really, could use a multiplayer digital twin to meet each other in.

If you made it to the bottom, and you think I should quit stalling and get balling: leave me a clap or send me a message! Anything to further coerce me into finally doing this surely helps at this point, believe you me. 😉

Any way the cookie crumbles, the next article will explain how you can combine methods for processing augmented reality trackables and compensating client-side lag visually, in order to have real airplanes flying over the game, based on realtime flight info! ✈

Yours truly,

that weird Ghent Theft Auto kid.
