Mana Engine: Dynamic Tasks

Timothy Lochner
7 min read · Sep 5, 2018

--

In the very first post I wrote about the Mana Engine, I talked all about Systems and how the engine uses a system’s read and write access to components in order to guarantee thread safety.

These systems ultimately generate a graph. In that graph, every node will only execute as soon as all nodes it depends on are complete. Here’s a graph of a test game we wrote that simulated Boids.

Just to pick one example out, look at DebugUI. That system won’t run until Input and PerfStatRecorder have both completed.

In reality, DebugUI may also have other dependencies. For instance, DebugUI uses the Time singleton component, which stores the deltaTime. That component is written to by TimeKeeper (hence why it’s at the beginning of the frame). Because DebugUI also depends on TimeKeeper through PerfStatRecorder, the graph is simplified and DebugUI’s dependency on TimeKeeper is removed.
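The simplification described above amounts to dropping any direct dependency that is already implied through another one. Here is a minimal sketch of that idea, assuming the graph is stored as a map from each system’s name to the names it directly depends on. `DependencyGraph`, `DependsOn`, and `Simplify` are hypothetical names for illustration, not Mana Engine’s API, and the sketch assumes the graph has no cycles.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>

// Hypothetical representation: each system maps to the systems it directly depends on.
using DependencyGraph = std::map<std::string, std::set<std::string>>;

// True if `target` can be reached from `from` by following dependency edges.
// Assumes the graph has no cycles.
bool DependsOn(const DependencyGraph& graph, const std::string& from, const std::string& target)
{
    const auto found = graph.find(from);
    if (found == graph.end())
    {
        return false;
    }
    for (const std::string& dependency : found->second)
    {
        if (dependency == target || DependsOn(graph, dependency, target))
        {
            return true;
        }
    }
    return false;
}

// Drops every direct dependency that is already implied through another direct dependency.
DependencyGraph Simplify(const DependencyGraph& graph)
{
    DependencyGraph simplified = graph;
    for (auto& [system, dependencies] : simplified)
    {
        assert(dependencies.count(system) == 0); // A system cannot depend on itself.
        const std::set<std::string>& original = graph.at(system);
        for (auto it = dependencies.begin(); it != dependencies.end();)
        {
            bool implied = false;
            for (const std::string& other : original)
            {
                if (other != *it && DependsOn(graph, other, *it))
                {
                    implied = true;
                    break;
                }
            }
            it = implied ? dependencies.erase(it) : std::next(it);
        }
    }
    return simplified;
}
```

Given DebugUI depending on Input, PerfStatRecorder, and TimeKeeper, with PerfStatRecorder depending on TimeKeeper, `Simplify` keeps DebugUI’s edges to Input and PerfStatRecorder and drops the direct edge to TimeKeeper.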

But the reality is that you can’t always have a fixed number of tasks per frame. Especially for systems that work over lots of components, such as all of the Boid systems, you need a way to go wide on all those boids. And when you’re done going wide, you may have some other tasks to complete.

That’s what this post is about. Tasks that get added to the graph dynamically, and removed at the end of the frame.

Remember, a tenet of the Mana Engine is that multi-threading needs to be easy. Easier than falling on your face.

So here’s the easiest it gets to spawn a task.

TaskDelegate work = []()
{
    DoSomeWork();
};
AddTask("My Task Name", work);

When this task spawns, it maintains system execution as well. What that means is that for any given system, any task spawned by that system will prevent any down-stream systems from running until those tasks are done.

This is important because whether an engineer writes code in-line or as a spawned task, I know that its order of execution is maintained.

To give some visual representation of this, here’s just a small portion of a full frame graph.

And here’s what it looks like after we spawn two tasks from SystemA.

You will notice that SystemA_TaskA and SystemA_TaskB run in parallel with both each other and with SystemA. So if you spawn two tasks at the top of a system, they spawn immediately, and you can continue to do work inline in the system.

Let’s say we wanted SystemA_TaskB to run after SystemA_TaskA, not in parallel. Whenever you call AddTask(), it returns a TaskId that can be passed into other calls to AddTask().

TaskId taskA = AddTask("TaskA", TaskDelegate([]()
{
    //Do TaskA's work.
}));
TaskId taskB = AddTask("TaskB", taskA, TaskDelegate([]()
{
    //Do TaskB's work.
}));

And now the graph will look something like this:

Pardon WebGraphViz’s visual generation. It’s a great tool, but I use it quite primitively. The point is that SystemA is running in parallel with TaskA and TaskB still, but TaskB runs after TaskA.

Because we don’t make any attempts to simplify dynamic tasks, you can see redundancies in the graph — notably, where SystemB depends on SystemA_TaskA both directly and indirectly through SystemA_TaskB.
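To make the bookkeeping concrete, here is a rough model of what spawning a task does to the graph: every node already waiting on the spawning system also starts waiting on the new task. This is only a sketch under my own assumptions. `FrameGraph`, `Node`, and the index-based ids are hypothetical, the real AddTask() does not take the owning system as a parameter, and nothing here actually schedules or runs work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a frame graph. Nodes are systems or dynamic tasks.
struct FrameGraph
{
    struct Node
    {
        std::string name;
        std::vector<std::size_t> dependencies; // Nodes that must finish first.
    };

    std::vector<Node> nodes;

    // Registers a system that runs after every node in `dependencies`.
    std::size_t AddSystem(std::string name, std::vector<std::size_t> dependencies = {})
    {
        nodes.push_back(Node{std::move(name), std::move(dependencies)});
        return nodes.size() - 1;
    }

    // Spawns a task owned by `system` that runs after every task in `dependencies`.
    // Every node already waiting on `system` now also waits on the new task,
    // which keeps down-stream systems blocked until the task is done.
    std::size_t AddTask(std::string name, std::size_t system, std::vector<std::size_t> dependencies = {})
    {
        assert(system < nodes.size());
        const std::size_t task = nodes.size();
        for (Node& node : nodes)
        {
            if (DependsDirectlyOn(node, system))
            {
                node.dependencies.push_back(task);
            }
        }
        nodes.push_back(Node{std::move(name), std::move(dependencies)});
        return task;
    }

    // True if `node` lists `dependency` as a direct dependency.
    static bool DependsDirectlyOn(const Node& node, std::size_t dependency)
    {
        return std::find(node.dependencies.begin(), node.dependencies.end(), dependency)
            != node.dependencies.end();
    }
};
```

Spawning TaskA and then TaskB (dependent on TaskA) from SystemA leaves SystemB with direct edges to both tasks. No simplification pass runs over them, which is the redundancy mentioned above.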

Tasks can also depend on more than one other task, which gives us the last overload of AddTask(): one that takes an array of TaskIds.

//Spawn 32 tasks.
Array<TaskId, 32> tasks;
for(uint32 i = 0; i < 32; i++)
{
    TaskDelegate wideWork = [i]()
    {
        DoWideWork(i);
    };
    tasks.InsertLast(AddTask("Wide Tasks", wideWork));
}
//Some work we're going to execute after finishing going wide.
TaskDelegate finalWork = []()
{
    DoWork();
};
//Kick off that task.
AddTask("Final Work", tasks, finalWork);

Showing what this graph looks like would visually be a bit much, but you can see what’s happening. Ultimately SystemB and SystemC won’t run until after the “Final Work” task, and that won’t run until all “Wide Tasks” are done.
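If you’re wondering how a task like “Final Work” can wait on many others, one common technique is a countdown: each dependency decrements a shared counter when it finishes, and whichever one brings it to zero releases the dependent task. The sketch below shows that general technique. I’m not claiming it’s how Mana Engine does it, and `JoinCounter` and `SignalOne` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical join node: runs `continuation` once `dependencyCount`
// dependencies have signaled completion.
class JoinCounter
{
public:
    JoinCounter(std::uint32_t dependencyCount, std::function<void()> continuation)
        : m_remaining(dependencyCount)
        , m_continuation(std::move(continuation))
    {
        assert(dependencyCount > 0);
    }

    // Called once by each dependency when it finishes. The decrement is atomic,
    // so exactly one caller observes the final count and runs the continuation.
    void SignalOne()
    {
        // fetch_sub returns the previous value, so 1 means this was the last dependency.
        if (m_remaining.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            m_continuation();
        }
    }

private:
    std::atomic<std::uint32_t> m_remaining;
    std::function<void()> m_continuation;
};
```

In a real scheduler the last signal would more likely queue the dependent task for a worker thread than run it inline, but the counting is the same.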

The example of going wide over 32 items is overly simplistic compared to a practical one. We don’t want to go wide on 32 items. We want to go wide on 32 thousand items. You can’t be adding 32,000 tasks. It would create way too much overhead for the work being done. Instead, you want to batch the work up, doing say… maybe 500 of them in a single task, giving you 64 tasks.

Depending on the work being done, the number of cores you have, and the number of iterations to make, you’ll want to tweak how many go into each batch. But to write all that code for managing the batching would be a pain.

That brings us to the ParallelFor helper function.

In reality, the ParallelFor is nothing special. In fact, it’s not even really a thing at all. It’s just a function that does all the management of the tasks for you. Here’s how you’d use it, adapting the above loop to work over 32,000 items.

ParallelForDelegate wideWork = [](uint32 i)
{
    DoWideWork(i);
};
TaskId parallelTask = ParallelFor("Wide Tasks", 32'000, 500, wideWork);
TaskDelegate finalWork = []()
{
    DoWork();
};
//Kick off that task.
AddTask("Final Work", parallelTask, finalWork);

Woah. That got really straightforward. Even if we weren’t managing batching up 32,000 iterations, this is still more straightforward than writing the for loop and managing the tasks ourselves in an array on the stack.
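For a sense of the management being hidden, here is a sketch of just the batching half: splitting the iteration range into chunks and wrapping each chunk in one task. It is an illustration under assumptions, not the engine’s code. `MakeBatchedTasks` is a hypothetical name, the two delegate types are stand-ins built on `std::function`, and the tasks are returned to the caller where the real helper would hand them to the scheduler and return a TaskId.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-ins for the engine's delegate types; the real definitions are not shown in this post.
using TaskDelegate = std::function<void()>;
using ParallelForDelegate = std::function<void(std::uint32_t)>;

// Splits the range [0, count) into batches of at most `batchSize` iterations
// and returns one task per batch. Each task calls `work` for every index in its batch.
std::vector<TaskDelegate> MakeBatchedTasks(std::uint32_t count, std::uint32_t batchSize,
                                           const ParallelForDelegate& work)
{
    assert(batchSize > 0);
    std::vector<TaskDelegate> tasks;
    std::uint32_t begin = 0;
    while (begin < count)
    {
        // The last batch may be smaller than batchSize.
        const std::uint32_t end = begin + std::min(batchSize, count - begin);
        tasks.push_back([begin, end, work]()
        {
            for (std::uint32_t i = begin; i < end; ++i)
            {
                work(i);
            }
        });
        begin = end;
    }
    return tasks;
}
```

With 32,000 items and a batch size of 500, this produces the 64 tasks mentioned earlier. A count that doesn’t divide evenly just gets a shorter final batch.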

One thing that might seem weird — why do all tasks need a name? The answer is debugging!

In Mana Engine, we have profile hooks on every task, and because all tasks have a name, those profile hooks all automatically come with a name attached to them. That allows us to use Google Tracing to do things like this:

The above picture is a single frame in our test Boids project. It’s also running tasks for updating the transforms of tens of thousands of objects.

I think this is the first time I’ve shown a Google Tracing graph of Mana Engine, so let me point out a few things, even though they’re a bit off topic:

  1. In this particular profile trace, I’ve limited the game to only using 4 cores, even though I have 6 on this machine. That limit is a runtime variable you can edit from something like an options menu.
  2. There is no “main” thread. Of course the program has a main thread, as all Windows apps do, but the main thread just checks for input from the Windows message pump and then works on tasks, just like any other worker thread. The present call just happened to show up on thread 3 this frame, but next frame it may have been picked up by the “main” thread and it largely wouldn’t matter.
  3. Notice the Present call is at the beginning of the frame. That’s actually last frame’s present call. Because we use graphics command buffers, we can begin working on the next frame while still presenting last frame’s rendering data. Right now we’re using D3D11, but if you’re familiar with some of the key features in D3D12 and Vulkan, you’ll know why being able to do this is important.
  4. There are a lot of tasks with the same name. That’s where we’ve gone wide to do work. Notably, updating the boids and tens of thousands of rotating objects.
  5. There are still some gaps in the frame, especially at the end.
    The pinkish calls toward the end of the frame are updating some graphics buffers and they could do a better job of going wide in order to better fill out the gaps.
    The mint green task at the end is the task that takes ImGui’s data and pushes it to the GPU buffers. I’m not a fan of how long it’s taking, but I also haven’t looked into why it’s taking so long. The important thing is that I can easily see where the gaps are and what might be done to address them.
  6. One of the gaps at the beginning of the frame is caused by that deep blue task. That’s cleaning up last frame’s dynamic tasks. If you ask me, it’s taking too long. There are some optimizations to be had there.

I had to bring up Google Tracing to justify the names on dynamic tasks, but I also want to point out what a touchstone it is. I am constantly looking at traces of the game. It gives you such insight into what’s going on with the application that it changes the way you think about your program. If you have a project that isn’t already using Google Tracing (or something like it), do yourself a favor and implement it. It doesn’t take long and will do wonders for you.
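To show roughly how little it takes, here is a minimal sketch of a named profile hook. I’m assuming “Google Tracing” here means the Chrome trace viewer and its JSON Trace Event Format, where a complete event uses `"ph":"X"` with `ts` and `dur` in microseconds. `TraceEvent`, `ScopedProfile`, and `ToTraceJson` are hypothetical names, not Mana Engine’s hooks. The sink is not thread-safe, and names are written without JSON escaping, so they must not contain quotes or backslashes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One finished, named span of work.
struct TraceEvent
{
    std::string name;
    std::int64_t startMicros;
    std::int64_t durationMicros;
    std::uint32_t threadId;
};

inline std::int64_t NowMicros()
{
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}

// Records a TraceEvent covering its own lifetime. Put one at the top of a task.
class ScopedProfile
{
public:
    ScopedProfile(std::vector<TraceEvent>& sink, std::string name, std::uint32_t threadId)
        : m_sink(sink), m_name(std::move(name)), m_threadId(threadId), m_start(NowMicros())
    {
        assert(!m_name.empty()); // Every task needs a name.
    }

    ~ScopedProfile()
    {
        m_sink.push_back(TraceEvent{m_name, m_start, NowMicros() - m_start, m_threadId});
    }

    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;

private:
    std::vector<TraceEvent>& m_sink; // Not thread-safe; use one sink per thread.
    std::string m_name;
    std::uint32_t m_threadId;
    std::int64_t m_start;
};

// Writes the events as a JSON array of complete ("X") events.
std::string ToTraceJson(const std::vector<TraceEvent>& events)
{
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < events.size(); ++i)
    {
        const TraceEvent& e = events[i];
        out << (i == 0 ? "" : ",")
            << "{\"name\":\"" << e.name << "\",\"ph\":\"X\",\"pid\":0"
            << ",\"tid\":" << e.threadId
            << ",\"ts\":" << e.startMicros
            << ",\"dur\":" << e.durationMicros << "}";
    }
    out << "]";
    return out.str();
}
```

Saving the returned string to a file and loading it in the trace viewer should show one bar per task, labeled with the task’s name and grouped by thread id.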

There you have it. Dynamic task creation.

I think the next post is going to cover one of the newer features put into Mana Engine that builds on top of the concepts in this post. Be sure to watch this space or my Twitter account (@tloch14) for when that post is published.
