Why Functional Programming Works for Games

Why I set out to prove that functional programming works for games, and what it might mean for modern game devs.

Screenshots for the world’s first pure functional game engine, Nu!

A fellow on www.fpchat.com slack channel #gamedev asked me the following question about my pure functional F# game engine, Nu (source available here) -

“I was curious of how this project started. Do you have previous experience in game/engines development?”

I think it was my very painful experience working with commercial game code bases — most recently on the Sims 4 — that finally convinced me of the fundamental flaws of the imperative / object-oriented approach to game development. Those conclusions kept being reconfirmed with my successive experiences with modern engines like Unreal, Unity, and other in-house game engines.

The game that convinced me to change my approach.

The bottom line was that, in these code bases, what should have taken 15–30 minutes would typically get estimated at - and actually take! - 3-5 hours. And let me assure you almost all that extra time spend was pure pain, most of which was spent in the debugger trying to figure out just how the hell the horrifically complex system got into the relevant part of its current state to begin with.

As an engineer, I became consistently frustrated due to the complexity that seemed unavoidable with current tools.

However, while working on Sims 4, I was privileged enough to undertake what would be an ongoing conversation with one of the principle engineers on the team. It was this months-long exchange that helped me shape some of my initial ideas of the Nu Game Engine — if only as a contrarian undertaking.

Let me note that my colleague was an awesome chap personally, and gave a great deal of time to these discussions that he did not have to, so even though he argued forcefully, he was one of the nicest and most open-minded people I’ve worked with. As our conversation proceeded, the arguments he gave as to why functional programming could not work for games kept returning to the following two points -

1) The GC would create too many pauses and affect the frame rate of games. This was from his experience of using C# in the Sims 3 engine, and his experience didn’t allow him to conclude otherwise — even though the modern .NET GC was very different than the one that shipped with Sims 3.

To invalidate his assertion, I did some research on modern GC technology, consuming several white papers and a couple of books along the way.

My favorite!

After spending several weeks doing my homework (it was not easy ramping up on such an unfamiliar topic!), I continued our conversation. I suggested that the design of at least some modern GCs —such as those considered ‘pauseless’ due to their iterative nature — would eliminate the issue in theory.

When I brought this to his attention, he didn’t seem to be able to give a concrete rebuttal, so I concluded the approach would, at least in theory, work. It would have been nicer had he been willing to concede the point outright, but fortunately I was able to make up for his lack of explicit concession with my own stubbornness.

Even better, as I prototyped the engine initially, it turned out that, In practice, the .NET’ 3 (and above) GC did not have the type of pauses he worried about — even without an incremental design! As far as bandwidth is concerned, the GC maxes out at 2% of CPU usage in Nu.

Add to that the fact that many engineers foresee GC becoming part of the computer’s hardware — at least, once we settle on a universal set of GC primitives. For all of these reasons, I’d consider this a non-issue both in theory and in practice.

Some people think GC should be at the hardware level anyways. I find it hard to disagree.

2) Pure functional programming would be too slow.

This is a common concern, and a bit more valid than the other. But is it to the extreme as being suggested by my colleague? And aren’t there some caching optimizations and other workarounds that can mostly negate this concern?

Currently, the Nu Game Engine can process up to 25,000 on-screen entities at 60 FPS before saturating the CPU. For perspective, consider that the modern CPU can only handle about 50,000 particles before they need to be placed on the GPU — and particles are much cheaper than entities in any game engine.

With this number of entities on the screen at once, the performance limitations depend entirely on the engine’s structure. Consider that in order for an entity to have its current state retrieved, it must be looked up from a map. And not just any map, a purely functional map! How can we process that many entities when we have to rely on this type of data structure?

There are three optimizations that make this fast in Nu. First, we use an innovative purely functional unidirectional map, Umap, rather than the normal F# Map. While the vanilla Map’s look-up time is O(log n), Umap’s look up time is O(1) in Nu’s use case!

(It’s called the ‘unidirectional map’ because its performance is near that of the .NET Dictionary’s so long as most past instances of it are discarded — just as they fortunately are in a game simulation such as Nu!)

Check out the timings -

Some timing comparisons for the various mapping primitives. Source at https://github.com/bryanedds/Nu/blob/master/Prime/Prime/Program.fs. Umap is almost as fast as the highly-optimized .NET Dictionary!

Second, the most recently-found entity is cached by the engine so that subsequent state retrievals on the same entity require no entity look-up at all! So as far as subsequent reads go, we’re as fast as we’d like to be.

Third, there is the ability to specialize types of entities as ‘imperative’. That is, operations on them mutate the state in-place rather than copying and updating. Because imperative entity operations are in-place, their data can be cached directly in the handle, requiring no look-ups even when dealing with different entities! The above 25,000 number is for imperative entities. If you want your entities to be purely functional and work with systems such as undo / redo in the editor, you can only have about 7,000.

250 simultaneous physics boxes, and the bottlenecks are in the imperative portions of Nu, such as the Farseer Physics system and the SDL2 renderer that I have yet had time to optimize.

So, as it stands, we can say the following today with certainty -

Pure functional game programming will assuredly work for typical, non-AAA games. Concerns 1 and 2 have been demonstrated to be non-blocking both in theory and in practice (although with concern 2 — you still have to be careful at times).

The open question is — do these types of techniques — when used at the core of a game engine — work in the context of AAA games like Uncharted 4?

The static environments won’t be a problem as it’s nearly all on the GPU. It’s the dynamics that are intimidating.

I cannot answer this with full certainty. The number of things going on at once in Uncharted 4 is staggering. And yes, much of that can be done by having imperative subsystems below the purely functional engine API using data-oriented design. Data orientation is very complementary to the functional approach, just so long as its mutable parts stay below the engine-level. Nu’s design is, in fact, intended to be ‘functional at the top and data-oriented at the bottom’.

But in truth, while I am optimistic, I’m not sure what standard of proof my engine would have to satisfy in order to work for such demanding simulations. For Sims 4? I think so. For Uncharted 4 with that ridiculously under-clocked CPU on the PS4? I just don’t know… yet.

That all said, like as was done with the initial prototype of Nu, I will assert this much:

There’s only one way I know to prove that pure functional programming can work for modern AAA games — and that is to try it and see.