Adventures In Procedural World Generation — Building a Spherical Streaming Viewer from the Ground Up

This piece focuses more on generic front-end programming, so before I lose those of you who are more interested in world generation itself: the results of my optimizations below can be found here, and a quick…

Note to the Procedural Generators. The Artists. The Mapmakers.

I have uploaded the simulation library (described in the previous posts ‘Concept Overview’ and ‘First Pass Model’) for which I built these visuals. It can be found on GitHub. I have also found myself wanting to sink more and more time into this project — to make it the most comprehensive world generator available — but unfortunately time is money, and hosting ain’t free. So if any of you are interested in supporting its continued development (here), I would be thrilled! Thanks!

But on to the technical stuff…

The Beginning

I threw together the first 3D viewer for the simulation simply to be able to watch it progress in real time. Since each time step of the simulation takes several seconds to process on my machine, efficient JavaScript and data transfer were not an issue. However, when attempting to watch already-completed simulations at a reasonable frame rate, I ran into a problem: the existing code couldn’t manage more than three frames per second. Additionally, it kept crashing due to memory use on the order of several gigabytes.

Fixing the JavaScript

For the 3D mesh:

My original algorithm for creating the triangular display mesh used several arrays of JavaScript objects describing faces, edges, and vertices. For the combined 1.5 million objects, this method amounted to ~1.6GB in memory (metrics mentioned throughout this post are all from Chrome running on my ancient 2011 iMac). To reduce this requirement, I redesigned my JavaScript implementation to use a single mesh-level object holding many typed arrays for what had previously been properties on each face, edge, and vertex object. This process was relatively quick, as I used a naming scheme that simply replaced the dot notation of property accessors with an underscore, so “mesh.edges[index].firstVertex” became “mesh.edges_firstVertex[index]”. After this replacement, and fixing the few errors that showed up, the memory footprint of the grid dropped below 100MB. Additionally, properties that are no longer needed after sending the mesh to the GPU can easily be dropped by setting the appropriate array to null.
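The struct-of-arrays layout above can be sketched roughly as follows. The array names follow the underscore convention from the post, but the specific properties and sizes are illustrative, not taken from the actual library:

```javascript
// Hypothetical sketch: one mesh object with a typed array per former
// per-object property, instead of millions of small JS objects.
function buildMesh(edgeCount, vertexCount) {
  return {
    // Was: mesh.edges[i].firstVertex / mesh.edges[i].secondVertex
    edges_firstVertex: new Uint32Array(edgeCount),
    edges_secondVertex: new Uint32Array(edgeCount),
    // Was: mesh.vertices[i].elevation (property name is illustrative)
    vertices_elevation: new Float32Array(vertexCount),
  };
}

const mesh = buildMesh(1000, 500);
mesh.edges_firstVertex[2] = 7; // was: mesh.edges[2].firstVertex = 7

// After uploading to the GPU, arrays that are no longer needed can be
// released for garbage collection:
mesh.vertices_elevation = null;
```

Typed arrays store their elements inline as raw numbers, which is what eliminates the per-object overhead that dominated the original 1.6GB footprint.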

Mesh creation still takes a noticeable amount of time from the user’s perspective (~2.5 seconds, compared to ~0.3 seconds for the Go implementation), but I haven’t yet identified any hotspots in the code worth optimizing further.

For frame rendering:

Up until this point, I had been using protocol buffers to transmit all my data, as they are blessedly easy to describe and set up in each of the languages I’m using (C++, Go, and JavaScript). Unfortunately, the JavaScript implementation presented a significant bottleneck when unmarshaling a half million integers of elevation data — accounting for 85% of the computation time behind those measly three frames per second. To alleviate this problem, I implemented the data format described below. The format also allows certain data frames (such as elevations) to be transmitted already rendered. The new format by itself decreased the workload per frame from 300ms to 140ms. Pre-rendering each frame allowed me to replace a set of range ifs with a switch statement when determining which color gradient a cell should fall under. This alteration saved much more time than I would have expected, reducing the workload per frame to 30ms. Further improvements could be made by moving the unmarshaling to a web worker.
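The range-ifs-to-switch change can be sketched like this. The gradient names, thresholds, and index values are all illustrative; the point is only that pre-rendered frames carry a small gradient index per cell, so the lookup becomes a switch on that index instead of a chain of comparisons against a raw elevation:

```javascript
// Before (hypothetical): classify each cell by comparing its raw
// 32-bit float elevation against a series of ranges.
function gradientByRange(elevation) {
  if (elevation < 0) return 'ocean';
  if (elevation < 1000) return 'lowland';
  if (elevation < 3000) return 'highland';
  return 'peak';
}

// After (hypothetical): pre-rendered frames already store the gradient
// index for each cell, so the per-frame work is a simple switch.
function gradientByIndex(index) {
  switch (index) {
    case 0: return 'ocean';
    case 1: return 'lowland';
    case 2: return 'highland';
    default: return 'peak';
  }
}
```

The per-cell comparisons move out of the hot render loop and into the encoder, which runs once per frame at write time rather than on every playback.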

Designing a Suitable Data Format

This was the most frustrating programming I have done in a long time. Every decision about how to lay out the data must be made so that it can be written and read consistently, but beyond that the decisions barely matter. Should I use 4-byte or 8-byte integers, or something else? Will cell counts exceed 2³² often enough to matter? In the end I created a simple format that contains probably twice as many header bytes as needed, but in total they make up less than 1/10,000 of the file size, so I’m happy to leave it as it is.

The overall structure consists of a set of frame-sets, where each frame-set can be read independently, but frames within a frame-set may be encoded such that they depend on each other.
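A reader for such a container might look like the sketch below. The post does not specify the header fields, so the two assumed here (a frame count and a byte length per frame-set, both little-endian Uint32) are purely illustrative of how independently readable frame-sets can be walked with a DataView:

```javascript
// Hypothetical sketch: walk a buffer containing consecutive frame-sets,
// each prefixed by an assumed 8-byte header (frame count + byte length).
function readFrameSets(buffer) {
  const view = new DataView(buffer);
  const frameSets = [];
  let offset = 0;
  while (offset < buffer.byteLength) {
    const frameCount = view.getUint32(offset, true);     // assumed field
    const byteLength = view.getUint32(offset + 4, true); // assumed field
    frameSets.push({
      frameCount,
      // Each frame-set is self-contained; frames inside it may still
      // depend on each other (e.g. delta-encoded frames).
      bytes: new Uint8Array(buffer, offset + 8, byteLength),
    });
    offset += 8 + byteLength;
  }
  return frameSets;
}
```

Because each frame-set carries its own length, a reader can skip directly to any frame-set without decoding the ones before it.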

Reducing Required Bandwidth

The baseline simulation (5000 time steps) produces 15GB of data, and looks best as a two-minute video. Since I wanted to be able to share results without mailing flash drives, that size had to shrink. Because the rendered color of any given elevation is static, a 32-bit float can be reduced to a 10-bit integer: two bits to select the color gradient, and 8 bits for the position along that gradient. As an easy addition, the majority of frames can be encoded as the difference between each frame and the previous one, resulting in the current file size of 500MB. Spatial statistical redundancy will be considered next, but that requires adding the display mesh to the data format library, since we don’t have the nice rectangular grid available in flat video. I will also try rendering to a texture to see whether the loss of information is worth a potentially smaller file size.
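The two compression steps can be sketched as follows. The bit layout (gradient in the top 2 bits, position in the low 8) and the element-wise delta step are my assumptions about a reasonable encoding; the post only states the 2-bit/8-bit split and the frame-differencing idea:

```javascript
// Hypothetical 10-bit cell encoding: 2 bits choose one of four color
// gradients, 8 bits give the position along that gradient.
function packCell(gradient, position) {
  // gradient: 0..3, position: 0..255 -> value in 0..1023
  return (gradient << 8) | position;
}

function unpackCell(packed) {
  return { gradient: packed >> 8, position: packed & 0xff };
}

// Most frames are stored as the element-wise difference from the
// previous frame, which compresses well when little changes per step.
function deltaEncode(prev, curr) {
  return curr.map((v, i) => v - prev[i]);
}

function deltaDecode(prev, delta) {
  return delta.map((d, i) => prev[i] + d);
}
```

Decoding a delta frame therefore requires the previous frame, which is why frames within a frame-set may depend on each other while frame-sets as a whole stay independent.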