CreaturePack: High Performance 2D WebGL Character Animation with WebAssembly

I am the author + creator of the Creature Animation Tool. Creature is a 100% mesh-based high quality Procedural 2D Character Animation solution that exports results for use in game engines, film and other forms of digital media. This article is about the process I went through porting the core part of the CreaturePack 2D Animation Runtimes into WebAssembly using C/C++.

Live 200 Dinosaur WebAssembly + WebGL Demo

Of course, before we begin, I am sure you want to see some results, so here they are.

This demo shows 200 mesh deforming raptor dinosaurs running across the screen. :) I do recommend you get the latest copy of Chrome or Firefox to view this demo optimally. WebAssembly requires the latest + greatest browser support, so check your browser first.

Some statistics running on Chrome:

  • Each Dinosaur Mesh has 1800 points
  • 200 Dinosaurs = 1800 x 200 = 360000 points
  • i7 6700K, iGPU: ~60 FPS
  • i5 6400, iGPU: ~60 FPS
  • MacBook Air 2014: ~60 FPS
  • Sony Android Xperia F5121: ~35–45 FPS

Introduction + Motivation

Creature exports to a wide variety of Game Engines, including the popular ones like UE4 and Unity, as well as a variety of WebGL frameworks ( PixiJS, Phaser, BabylonJS and Three.JS ). Creature accomplishes this by exporting out a number of asset data formats which can be read in by Creature Runtimes available in the various frameworks. For a full list of runtimes supported by Creature, please head over to this site.

One of the often used runtimes for Creature is the PixiJS WebGL runtime. PixiJS is a high performance 2D WebGL-based Game Engine that is popular for authoring beautiful, responsive web apps, games or any form of web-based media that requires great looking animated visuals.

For the WebGL runtimes like PixiJS, Creature exports to a format called CreaturePack. CreaturePack is a fast, compact binary mesh-based export file format specially designed for deployment on web, mobile and any platform where memory, file size and performance are critical. CreaturePack enables very fast loading of animated deforming meshes via lossy animated mesh compression schemes, which is very useful for web deployments. Of course, load times are just half the story. The other half is real-time playback performance on different devices. This is where the story gets more interesting.

The initial development for the Creature WebGL runtimes was all done in Javascript. Performance, especially for a 100% deforming mesh-based 2D animation tool, was actually pretty decent. Of course, frame rates tend to drop as more and more characters are put into a scene. It has always been a dream of mine to bring the performance of Creature WebGL runtimes on par with its equivalents on native Game Engines like UE4 and Unity. For example, here you can see 200+ Deforming Dinosaurs animated and exported out with Creature running live in UE4 ( complete with lighting effects like Global Illumination, Shadows and Normal Mapping ) :

Another example, again from the UE4 runtime, shows a full game scene complete with effects like fire, water, smoke etc.:

Here are more examples running in Unity and UE4:

The Solution: WebAssembly

So the real question is: Can we have the same amount of complexity, or at least try to match near native Game Engine performance for a WebGL runtime?

I had been thinking about this issue for a while and then stumbled across WebAssembly. WebAssembly is a new portable, size- and load-time-efficient format suitable for compilation to the web. Coming from a C/C++ background ( I prefer coding in C++ over other languages given its power and performance ), this was very appealing to me. WebAssembly is being designed to support C and C++ code well, right from the start in the MVP. Very exciting indeed! With that piece of information, I started to port the core part of the CreaturePack framework into WebAssembly.

Since WebAssembly is rather new, the documentation around the web is still a bit sparse ( compared to the more mature frameworks out there ). Having said that, the documentation on their official website is more than enough to get you started. The first thing one has to do is to grab and build the Emscripten Compiler from source. Emscripten is an LLVM-to-JavaScript compiler. It takes LLVM bitcode, which can be generated from C/C++ using llvm-gcc (DragonEgg) or clang, or from any other language that can be converted into LLVM, and compiles that into JavaScript, which can be run on the web (or anywhere else JavaScript can run). Of course, with the advent of WebAssembly, Emscripten now also supports compiling into WebAssembly via a compiler flag :)

Tools

I work predominantly in the Windows environment since most of the Game Engines ( UE4, Unity ) are more convenient to work with on Windows. Here are the tools I recommend and find useful when developing for WebAssembly:

Of course, your personal choices of tools will vary but the above I find are quite productive for me.

Framework/Design

For CreaturePack, the aim of the WebAssembly portion was to accelerate the core computation/processing portion of the framework. The idea was to let native C++ code do the math intensive computation and still play nicely with the Javascript based renderer in PixiJS. The PixiJS layer will simply be calling into the WebAssembly layer for computation operations, get the results back and render them as per normal like other PixiJS objects. This enables seamless operation(s) as far as users of Creature and PixiJS runtimes are concerned.

Because I was solely focused on accelerating the core framework, the number of files that actually needed compiling was relatively small. I opted to go with a much simpler Makefile setup ( as opposed to a more general but more involved setup like CMake ). In any case, I think showcasing how a simple Makefile project is set up for WebAssembly compilation will be useful for anybody starting out. Without further ado, here is what the CreaturePack WebAssembly Makefile looks like:

CC = emcc
MP_CPP_FILES = $(wildcard mp.cpp)
PACK_CPP_FILES = $(wildcard CreaturePackManager.cpp)
INCLUDES = -I$(CURDIR)
EOPT = WASM=1 ALLOW_MEMORY_GROWTH=1 TOTAL_MEMORY=524288000
EOPTS = $(addprefix -s $(EMPTY), $(EOPT))   # Add '-s ' to each option
OBJS = mpcppfiles.o packcppfiles.o
CXX_OPTS = -std=c++11 -O2 --bind
LINKER_OPTS = -O2 --bind

mpcppfiles.o: $(MP_CPP_FILES)
	$(CC) $(MP_CPP_FILES) $(EOPTS) $(INCLUDES) $(DEFINES) $(CXX_OPTS) -o mpcppfiles.o

packcppfiles.o: $(PACK_CPP_FILES)
	$(CC) $(PACK_CPP_FILES) $(EOPTS) $(INCLUDES) $(DEFINES) $(CXX_OPTS) -o packcppfiles.o

all: $(OBJS)
	$(CC) $(OBJS) $(EOPTS) $(LINKER_OPTS) -g -o creaturepack-wasm.html

# Cleans up object files and build directory
clean:
	rm *.o
	rm *.wast

I assume that you are already familiar with most of the basic Makefile concepts, but I will highlight a couple of areas that are worth paying attention to.

  • Firstly, make sure the compiler ( CC ) is set to emcc, which is your emscripten compiler.
  • Make sure WASM=1 is set in the options; this makes emcc emit WebAssembly code
  • If you foresee memory growing quite often in your application, you should set the ALLOW_MEMORY_GROWTH=1 option
  • Use the --bind option to make sure you can employ the Embind framework to expose your C++ classes into the Javascript world ( more on this later )

The full contents of the WebAssembly project for CreaturePack can be found here. The first step was to get most of the core C/C++ files compiling and that turned out to be very easy: it just worked out of the box :) In order for Javascript to interface with the C/C++ code, I wrote an extra layer called CreaturePackManager. CreaturePackManager is responsible for marshalling data between Javascript and C/C++. It also exposes the core animation functions to Javascript and allows data to be passed in a more efficient way between the 2 worlds.

Dealing with arrays/raw pointers

One of the first things the CreaturePackManager needs to be able to do is instantiate a new Creature animated character from an array of bytes read in from some source. That array of bytes represents the exported CreaturePack character asset. This is where things got tricky starting out. I had chosen to use the Embind framework to expose my C/C++ classes + functions to Javascript. Embind reminds me of Boost.Python. Indeed, the Embind documentation mentions that it was heavily influenced by the design of Boost.Python. Overall, it is a pretty decent framework to use if you want to expose your C/C++ classes to Javascript. However, the current Embind implementation makes it quite difficult to perform the relatively simple task of letting your method/function take in a raw pointer. I needed to do this because one of the arguments to the load character method was an input byte array. After much searching on the internet, I found that the solution was to do something like this. First, declare an integer representation of a pointer:

typedef uint32_t rawPtr_t;

Next, for my load function, I had something like this:

// Adds a new CreaturePack Loader given a typed byte array
bool addPackLoader(const std::string& name_in, rawPtr_t data, int length);

The actual method then reinterpret casts the integer back into a pointer like this and I am off to the races:

const uint8_t * raw_data = reinterpret_cast<const uint8_t*>(data);

While this seemed like a somewhat roundabout way of doing things ( since raw pointer parameters are not yet supported ), it worked for all intents and purposes.
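Why does a plain 32-bit integer work as a pointer here? In a 32-bit WebAssembly build, a C/C++ pointer is just a byte offset into the module's linear memory, which Javascript sees as one large ArrayBuffer. The sketch below imitates that with an ordinary typed array standing in for the heap. The names `fakeHeap` and `readBytesAt` and the address 16 are made up for illustration; they are not part of Creature or Emscripten.

```javascript
// Stand-in for the module's linear memory (with Emscripten this role is
// played by Module.HEAPU8). Purely illustrative.
const fakeHeap = new Uint8Array(new ArrayBuffer(64));

// Pretend some asset bytes were written at "address" 16.
const address = 16; // the value C++ would receive as rawPtr_t
fakeHeap.set([0xde, 0xad, 0xbe, 0xef], address);

// Javascript-side analogue of reinterpret_cast<const uint8_t*>(data):
// treat the integer as an offset and view `length` bytes starting there.
function readBytesAt(heap, ptr, length) {
  return new Uint8Array(heap.buffer, heap.byteOffset + ptr, length);
}

const bytes = readBytesAt(fakeHeap, address, 4);
console.log(Array.from(bytes, (b) => b.toString(16)).join(" ")); // de ad be ef
```

Seen this way, the integer typedef is less of a hack than it first appears: the cast on the C++ side simply restores the type information that the integer could not carry across the binding.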

The implementation above takes care of processing an input representation of a byte array using a raw pointer on the C/C++ end. It turns out some additional work needed to be done on the Javascript layer as well. In particular, the typical scenario starting out was to receive the character asset data as a Uint8Array:

var byte_array = new Uint8Array(response);

I had initially wanted to pass this directly into my C/C++ routine and that did not quite work out. It turned out you need to allocate heap memory through the Module object that Emscripten generates before a byte array can be passed into C/C++. I wrote a useful utility function to accomplish that:

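CreatureASMUtils.heapBytes = function(typedArray) {
  // Size of the input in bytes, whatever its element type
  var numBytes = typedArray.length * typedArray.BYTES_PER_ELEMENT;
  // Reserve space on the module heap; ptr is a byte offset into it
  var ptr = Module._malloc(numBytes);
  // View exactly that region of the heap and copy the input bytes into it
  var heapBytes = new Uint8Array(Module.HEAPU8.buffer, ptr, numBytes);
  heapBytes.set(new Uint8Array(typedArray.buffer, typedArray.byteOffset, numBytes));
  return heapBytes;
};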

This takes a regular typed array and returns another typed array whose data lives in heap memory allocated through Emscripten's Module API. Because the returned array is a view onto the module heap, its byteOffset is the pointer value, so you can then pass that offset and the byte length over into C/C++.
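Putting the pieces together, a call might look roughly like the sketch below. To keep it runnable outside the browser, `Module` here is a tiny stand-in with a bump allocator rather than the real Emscripten runtime, and `fakeManager.addPackLoader` is a hypothetical stub that only mimics the C++ signature shown earlier. The asset name "raptor" is also made up. The part worth noticing is that the pointer handed across is the `byteOffset` of the heap-backed array.

```javascript
// Stand-in for the Emscripten-generated Module (assumption: the real one
// exposes HEAPU8, _malloc and _free backed by the wasm heap).
const Module = {
  HEAPU8: new Uint8Array(1024),
  _next: 8,
  _malloc(n) { const p = this._next; this._next += (n + 7) & ~7; return p; },
  _free(ptr) { /* no-op in this stub */ },
};

// The helper from the article, restated so the sketch is self-contained.
function heapBytes(typedArray) {
  const numBytes = typedArray.length * typedArray.BYTES_PER_ELEMENT;
  const ptr = Module._malloc(numBytes);
  const out = new Uint8Array(Module.HEAPU8.buffer, ptr, numBytes);
  out.set(new Uint8Array(typedArray.buffer, typedArray.byteOffset, numBytes));
  return out;
}

// Hypothetical stub mirroring addPackLoader(name, rawPtr_t data, int length).
const fakeManager = {
  addPackLoader(name, ptr, length) {
    // The native side would read `length` bytes starting at `ptr`.
    const seen = Module.HEAPU8.subarray(ptr, ptr + length);
    return seen.length === length;
  },
};

const asset = new Uint8Array([1, 2, 3, 4, 5]);
const onHeap = heapBytes(asset);
const ok = fakeManager.addPackLoader("raptor", onHeap.byteOffset, onHeap.byteLength);
// Assumes the native loader has copied whatever it needs by this point.
Module._free(onHeap.byteOffset);
console.log(ok, onHeap.byteOffset);
```

Whether the buffer can be freed right after the call depends on whether the native loader keeps a reference to it, so treat the `_free` line as something to check against the actual C++ code.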

Reducing Data Copying Overhead

The other key part to the whole project was of course runtime playback performance. In particular, I wanted to avoid as much data copying overhead as possible. The idea was to keep the computed data ( points, indices, uvs etc. ) within the native layer and if possible, just return a reference/pointer back into Javascript.

If we take a look at the Javascript layer, we will notice that the rendered vertices are typed arrays. In particular, points are represented as a Float32Array. With that in mind, I proceeded with my plan to reduce the impact of data transfer, post computation, for the CreaturePack runtimes.

The key to the whole idea was to write a wrapper method called getPlayerPoints. The actual implementation looked like this:

emscripten::val
CreaturePack::PackManager::getPlayerPoints(int handle)
{
    const auto& player = pack_players.at(handle);
    return emscripten::val(
        emscripten::typed_memory_view(
            player->getRenderPointsLength(),
            player->render_2d_points.get()));
}

In this method, you will notice that it returns a typed_memory_view pointing to the actual 2D point data. This allows the calling client to peek at and retrieve the required mesh vertices without any copying. At the Javascript level, the implementation is just this single line:

this.vertices = manager_in.getPlayerPoints(this.playerId);

After that, no other assignments or data copying are required! As the animated meshes of the characters are deformed over time, the WebGL Javascript portion always reads from the most up to date data streamed in from the C/C++ layer :) This is an important bit with regard to performance optimization.
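The reason one assignment is enough is that a typed array is a view, not a copy: writes made through any other view of the same buffer show up immediately. The sketch below demonstrates this with a bare `WebAssembly.Memory` standing in for the module heap (the names `memory`, `nativeSide` and `vertices` are illustrative only). It also shows a caveat that seems relevant when ALLOW_MEMORY_GROWTH=1 is set: growing the memory detaches the old buffer, so a view obtained before the growth would have to be fetched again.

```javascript
// One 64 KiB page of wasm linear memory, allowed to grow to two pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

// The "native" writer and the "renderer" look at the same four floats.
const nativeSide = new Float32Array(memory.buffer, 0, 4);
const vertices = new Float32Array(memory.buffer, 0, 4);

nativeSide.set([1.5, 2.5, 3.5, 4.5]); // e.g. one deformation step
console.log(Array.from(vertices)); // [ 1.5, 2.5, 3.5, 4.5 ] with no copy made

// Growing the memory detaches the old ArrayBuffer...
memory.grow(1);
const lengthAfterGrow = vertices.length; // 0: the old view is no longer usable

// ...so a fresh view has to be created over the new buffer.
const refreshed = new Float32Array(memory.buffer, 0, 4);
console.log(lengthAfterGrow, Array.from(refreshed));
```

In a real runtime, one plausible way to handle this would be to call getPlayerPoints again whenever the heap may have grown, for example after loading additional characters.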

Conclusion

Overall, I am quite happy with the results of moving the core CreaturePack computation over into WebAssembly. The demo provided puts 200 fully deforming animated dinosaur meshes onto the screen at once. Even on a moderately equipped modern laptop PC, a decent 60 FPS is possible. I can imagine most games will not be attempting to put 200 complex fully deforming meshes on screen so this demo is definitely a decent performance test of the new CreaturePack WebAssembly runtimes.