Testing Your Luck: Avoiding Bugs in Procedural Generators

Robert Prehn
Published in acrossthegalaxy · 2 min read · Sep 28, 2016

I’ve been cleaning up the star system generator and adding little features here and there. I built the original version “write only”: I just kept adding code until it worked. As you might expect, this resulted in some unmaintainable code, including a few 200-line methods and a 900-line class.

Testing procedural generators is tricky. The results change from run to run, so how can you prove that you fixed a bug and didn’t just get lucky? In my case, I was concerned that I would alter the frequency of certain planet types without noticing. Not every planet type appears in every system, so a shift in frequency is hard to spot from a handful of runs.

To handle this, I built a stat collector script. This script runs the system generator 100 times and collects information about the results:

  • Frequency of each planet type (airless rock, venusian, martian, terrestrial, gas giant, etc.)
  • Frequency of each atmosphere type (none, gas giant, breathable, too thin, poisonous)
  • Which elements caused poisonous atmospheres
  • Frequency of habitable planets
  • Frequency of earth analogues
  • Average planets per system
  • Maximum moons per planet
  • Generator run time
  • Etc
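
The collection loop described above can be sketched in a few lines. This is a minimal, language-agnostic illustration in Python, not the actual Unity script: `generate_system`, the planet dictionary layout, and the `rng` parameter are all hypothetical stand-ins, and only a few of the listed statistics are gathered.

```python
import random
import time
from collections import Counter

# Planet types taken from the list above; the real generator has more.
PLANET_TYPES = ["airless rock", "venusian", "martian", "terrestrial", "gas giant"]


def generate_system(rng):
    """Hypothetical stand-in for the real generator: returns a list of planets."""
    planets = []
    for _ in range(rng.randint(0, 8)):
        planets.append({
            "type": rng.choice(PLANET_TYPES),
            "moons": rng.randint(0, 5),
        })
    return planets


def collect_stats(generate, runs=100, rng=None):
    """Run the generator `runs` times and tally a few summary statistics."""
    if rng is None:
        rng = random.Random()
    type_counts = Counter()
    total_planets = 0
    max_moons = 0
    start = time.perf_counter()
    for _ in range(runs):
        system = generate(rng)
        total_planets += len(system)
        for planet in system:
            type_counts[planet["type"]] += 1
            max_moons = max(max_moons, planet["moons"])
    elapsed = time.perf_counter() - start
    return {
        "runs": runs,
        "planet_type_counts": dict(type_counts),
        "average_planets_per_system": total_planets / runs,
        "max_moons_per_planet": max_moons,
        "run_time_seconds": elapsed,
    }


if __name__ == "__main__":
    stats = collect_stats(generate_system, runs=100)
    for key, value in stats.items():
        print(f"{key}: {value}")
```

Passing the generator in as a function keeps the collector separate from the thing it measures, so the same loop keeps working while the generator itself is being refactored.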

I use a small editor script in my test scene to run the stat collector each time Unity compiles my code. It writes the results to a file, then my text editor live reloads that file in a side panel. That way, I see the results right after compiling.
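
As a rough sketch of the "write the results to a file" step, the snippet below formats a stats dictionary as plain text with sorted keys, so the file layout stays the same from run to run and only the numbers move. The function names, the file name, and the example values are made up for illustration; the Unity-side hook that triggers on compile is not shown.

```python
import os
import tempfile


def format_report(stats):
    """Render a stats dict as plain text, with keys sorted for a stable layout."""
    lines = []
    for key in sorted(stats):
        value = stats[key]
        if isinstance(value, dict):
            # Nested tallies (e.g. counts per planet type) get an indented block.
            lines.append(f"{key}:")
            for sub_key in sorted(value):
                lines.append(f"  {sub_key}: {value[sub_key]}")
        elif isinstance(value, float):
            lines.append(f"{key}: {value:.2f}")
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


def write_report(stats, path):
    """Overwrite the report file so a live-reloading editor picks up the change."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_report(stats))


if __name__ == "__main__":
    # Placeholder numbers, not real generator output.
    example = {
        "average_planets_per_system": 4.37,
        "planet_type_counts": {"terrestrial": 52, "gas giant": 118},
    }
    path = os.path.join(tempfile.gettempdir(), "system_stats.txt")
    write_report(example, path)
    with open(path, encoding="utf-8") as handle:
        print(handle.read())
```

A fixed key order also keeps diffs between two versions of the file small: a changed line means a changed number, not a reshuffled report.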

Here’s what that looks like:

Figure: My text editor (Atom) with the code on the left and the system stats on the right

Here’s the code for that editor script:

I check the results file into version control. The downside is that I clutter each commit with a new version of the stats file. The upside is I can look back through my code’s history and see exactly where I introduced or fixed a generator bug.

The stat collector is not a replacement for unit testing: unit tests pinpoint specific sources of error. Nor is it a replacement for human testing, because the player’s perception of a generator matters more than whether it is statistically valid. What it does is let me tinker with the generator with confidence.
