Data-oriented design is more than just performance
Ever since DOTS (Data-Oriented Technology Stack) was introduced in Unity last year, there has been a lot of talk about how it enables a level of performance that was previously very hard, or simply impossible, to achieve in the engine. That is true but, after using it in a few personal projects, I believe data-oriented design is more than that. The paradigm also brings several benefits that have more to do with the software engineering side of our craft than with raw performance.
It is easy to add new systems and replace existing ones
With data-oriented design, it is trivial to add a new system without touching existing code and it is very easy to replace an existing system with a new one by disabling the original and making sure the inputs and outputs of the new system match those of the original. This has clear benefits in terms of testing, maintainability and moddability. Doing this in traditional, object-oriented codebases is not impossible but tends to be rare in practice.
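A minimal sketch of the swap described above, written in plain C# so it runs outside Unity. Every name here (Body, IBodySystem, MovementSystem, ClampedMovementSystem, SystemRunner) is hypothetical and only illustrates the idea: the replacement reads and writes the same data as the original, so the rest of the code does not change.

```csharp
using System.Collections.Generic;

// Plain data shared by both systems (hypothetical component).
public struct Body { public float Position; public float Velocity; }

// Hypothetical system contract: input and output are the Body array only.
public interface IBodySystem
{
    bool Enabled { get; set; }
    void Update(Body[] bodies, float dt);
}

// Original system: moves every body by its velocity.
public sealed class MovementSystem : IBodySystem
{
    public bool Enabled { get; set; } = true;
    public void Update(Body[] bodies, float dt)
    {
        for (int i = 0; i < bodies.Length; i++)
            bodies[i].Position += bodies[i].Velocity * dt;
    }
}

// Replacement with the same inputs and outputs: also clamps the position.
public sealed class ClampedMovementSystem : IBodySystem
{
    public bool Enabled { get; set; } = true;
    public float MaxPosition = 10f;
    public void Update(Body[] bodies, float dt)
    {
        for (int i = 0; i < bodies.Length; i++)
        {
            float next = bodies[i].Position + bodies[i].Velocity * dt;
            bodies[i].Position = next > MaxPosition ? MaxPosition : next;
        }
    }
}

public static class SystemRunner
{
    // Runs every enabled system in order; disabled systems are skipped.
    public static void Tick(IEnumerable<IBodySystem> systems, Body[] bodies, float dt)
    {
        foreach (var system in systems)
            if (system.Enabled) system.Update(bodies, dt);
    }
}
```

Disabling the original (Enabled = false) and registering the replacement is the whole migration in this sketch; nothing else needs to know which of the two is running.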
I believe a big part of the problem with object-oriented languages is that they do not promote a clear separation between data and behaviour (quite the opposite). Codebases that are not maintained diligently tend to grow into a tangle of implicit dependencies that are hard to track, understand and maintain. Data-oriented programming forces you to think about your data first and foremost: what it is, how it is processed and how it flows between the different stages of your program.
At its core, data-oriented programming seems to be about re-discovering some of the fundamentals of computer science.
Composition over inheritance by default
It is well known that abusing inheritance in traditional object-oriented languages yields rigid designs that are hard to modify and extend, and that it tends to produce God objects. This is the reason behind the standard recommendation to favour composition as a saner alternative.
Data-oriented design is inherently all about composition. The paradigm forces you to think about the entities living in your world as sets of components. With components being value types, inheritance is simply not allowed for them to begin with. Is this limiting? Not at all. You can still have a form of ad-hoc polymorphism. Let’s consider a stat system for a role-playing game. You could have a generic Stat component.
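A minimal sketch of such a component, in plain C#. The IComponentData interface declared here is a local stand-in for Unity's marker interface of the same name so the snippet compiles outside Unity, and the Value field is an assumption about what a stat would hold.

```csharp
// Local stand-in for Unity.Entities.IComponentData (assumption: used only as a marker).
public interface IComponentData { }

// Hypothetical generic stat: a value type holding data and no behaviour.
public struct Stat : IComponentData
{
    public float Value;
}
```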
You could then add tag components for specific stats.
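Tag components could look like the following sketch. The tag names (HealthTag, ManaTag, StaminaTag) are hypothetical, and IComponentData is again a local stand-in for Unity's marker interface.

```csharp
// Local stand-in for Unity.Entities.IComponentData (assumption: used only as a marker).
public interface IComponentData { }

// Hypothetical tag components: empty structs whose only job is to mark an entity.
public struct HealthTag : IComponentData { }
public struct ManaTag : IComponentData { }
public struct StaminaTag : IComponentData { }
```

A tag carries no data; its presence on an entity is the information.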
This way you can have generic systems operating on all stats (meaning all entities with the Stat component) and specific systems operating on specific stats. Note that you could also express this by making all your stats implement an IStat interface.
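The split between generic and specific systems can be sketched as follows. This is plain C# rather than the Unity API: MiniWorld is a hypothetical stand-in for a world, and the two systems (ClampAll, RegenerateMana) are made-up examples of what each kind of system might do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Stat { public float Value; }
public struct HealthTag { }
public struct ManaTag { }

// Hypothetical stand-in for a world: entity id -> components keyed by type.
public sealed class MiniWorld
{
    private readonly Dictionary<int, Dictionary<Type, object>> _entities =
        new Dictionary<int, Dictionary<Type, object>>();
    private int _nextId;

    public int Create(params object[] components)
    {
        _entities[_nextId] = components.ToDictionary(c => c.GetType());
        return _nextId++;
    }

    // Ids of all entities that own every requested component type.
    public List<int> Query(params Type[] types) =>
        _entities.Where(e => types.All(e.Value.ContainsKey)).Select(e => e.Key).ToList();

    public T Get<T>(int id) => (T)_entities[id][typeof(T)];
    public void Set<T>(int id, T value) => _entities[id][typeof(T)] = value;
}

public static class StatSystems
{
    // Generic system: keeps every stat non-negative, whatever it represents.
    public static void ClampAll(MiniWorld world)
    {
        foreach (int id in world.Query(typeof(Stat)))
            world.Set(id, new Stat { Value = Math.Max(0f, world.Get<Stat>(id).Value) });
    }

    // Specific system: only touches stats that also carry the mana tag.
    public static void RegenerateMana(MiniWorld world, float amount)
    {
        foreach (int id in world.Query(typeof(Stat), typeof(ManaTag)))
            world.Set(id, new Stat { Value = world.Get<Stat>(id).Value + amount });
    }
}
```

The polymorphism lives in the queries: which entities a system sees depends on which components they carry, not on a class hierarchy.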
Does this mean inheritance is forbidden? No; you can still use it where it makes more sense to do so. Consider systems, where it is useful to have a base system class containing some common functionality (e.g., Unity’s ComponentSystem and JobComponentSystem types). You will probably end up with some form of base system in your game; e.g., one that loads a game configuration asset at initialization time.
Built-in event system
If you have ever worked on a game, you know every codebase ends up implementing its own unique version of an event system. With data-oriented design, you do not really need to do that, as you already have a built-in event system (sort of). To create an event, just create an entity with an associated component type containing the appropriate payload (if any).
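A minimal sketch of raising an event this way, in plain C#. MiniWorld, DamageEvent and Combat.RaiseDamage are hypothetical names; in Unity the equivalent step would go through the entity manager or a command buffer.

```csharp
using System.Collections.Generic;

// Hypothetical event payload: plain data, no behaviour.
public struct DamageEvent { public int Target; public float Amount; }

// Hypothetical stand-in for a world: entity id -> attached components.
public sealed class MiniWorld
{
    private int _nextId;
    public readonly Dictionary<int, List<object>> Entities = new Dictionary<int, List<object>>();

    public int CreateEntity(params object[] components)
    {
        Entities[_nextId] = new List<object>(components);
        return _nextId++;
    }
}

public static class Combat
{
    // Raising the event is just creating an entity that carries the payload.
    public static int RaiseDamage(MiniWorld world, int target, float amount) =>
        world.CreateEntity(new DamageEvent { Target = target, Amount = amount });
}
```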
Systems interested in handling this event just need to iterate over entities with the custom event component.
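A sketch of a listener, again in plain C# with hypothetical names (MiniWorld, Health, DamageEvent, DamageSystem). The system has no subscription step: it queries for entities that carry the event component and processes whatever it finds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Health { public float Value; }
public struct DamageEvent { public int Target; public float Amount; }

// Hypothetical stand-in for a world: entity id -> components keyed by type.
public sealed class MiniWorld
{
    private readonly Dictionary<int, Dictionary<Type, object>> _entities =
        new Dictionary<int, Dictionary<Type, object>>();
    private int _nextId;

    public int Create(params object[] components)
    {
        _entities[_nextId] = components.ToDictionary(c => c.GetType());
        return _nextId++;
    }

    // Ids of all entities that own a component of type T.
    public List<int> Query<T>() =>
        _entities.Where(e => e.Value.ContainsKey(typeof(T))).Select(e => e.Key).ToList();

    public T Get<T>(int id) => (T)_entities[id][typeof(T)];
    public void Set<T>(int id, T value) => _entities[id][typeof(T)] = value;
}

public static class DamageSystem
{
    // Handles every pending DamageEvent by subtracting it from the target's health.
    public static void Update(MiniWorld world)
    {
        foreach (int id in world.Query<DamageEvent>())
        {
            var evt = world.Get<DamageEvent>(id);
            var health = world.Get<Health>(evt.Target);
            world.Set(evt.Target, new Health { Value = health.Value - evt.Amount });
        }
    }
}
```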
Simple, easy and it just works. No need to define yet another event system. You do need to remember to destroy the event entity after it has been processed:
- For events with only one listener, you can simply destroy it right after (usually via PostUpdateCommands or your command buffer of choice in jobified systems).
- For events with multiple listeners, you may want to create an independent system that runs after all your game systems and destroys all processed events. You could leverage the ad-hoc polymorphism idea we discussed earlier, by having all event entities have a generic Event component in addition to their specific event type component. These entities would then be picked up by the destruction system.
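The multi-listener case can be sketched as a cleanup system that runs last. This is plain C# with hypothetical names (MiniWorld, DamageEvent, LevelUpEvent, EventCleanupSystem); Event is the generic marker component mentioned above, added next to each specific event component.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Generic marker carried by every event entity, next to its specific payload.
public struct Event { }
public struct DamageEvent { public float Amount; }
public struct LevelUpEvent { public int NewLevel; }

// Hypothetical stand-in for a world: entity id -> components keyed by type.
public sealed class MiniWorld
{
    private readonly Dictionary<int, Dictionary<Type, object>> _entities =
        new Dictionary<int, Dictionary<Type, object>>();
    private int _nextId;

    public int Count => _entities.Count;

    public int Create(params object[] components)
    {
        _entities[_nextId] = components.ToDictionary(c => c.GetType());
        return _nextId++;
    }

    public void Destroy(int id) => _entities.Remove(id);

    // Ids of all entities that own a component of type T (copied, so callers may destroy).
    public List<int> Query<T>() =>
        _entities.Where(e => e.Value.ContainsKey(typeof(T))).Select(e => e.Key).ToList();
}

public static class EventCleanupSystem
{
    // Meant to run after all game systems: destroys every event entity, whatever its type.
    public static int Update(MiniWorld world)
    {
        List<int> processed = world.Query<Event>();
        foreach (int id in processed) world.Destroy(id);
        return processed.Count;
    }
}
```

Because the cleanup system queries only the generic marker, new event types need no changes to it.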
Another choice that may make more sense in some situations is to add an event component type to an existing entity, as opposed to creating event-only entities. But the important idea here is that data-oriented programming is “event-friendly”.
Multiple worlds by default
The “entities live in a world” abstraction is far more powerful than it may initially seem. You can have separate worlds for the server and the clients in a networked game, for your game’s replay system, for the different screens in your game, for procedurally generating and streaming a level, and so on. The possibilities are endless in that department.
With worlds being siloed from each other and the programmer needing to invoke an explicit API to move entities from one world to another, you get a high degree of modularity by default.
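A minimal sketch of siloed worlds with an explicit transfer call, in plain C#. MiniWorld, MoveTo and Chunk are hypothetical names, not Unity's API; the point is only that an entity never ends up in another world by accident.

```csharp
using System.Collections.Generic;

// Hypothetical payload for a streamed piece of level.
public struct Chunk { public int X; public int Y; }

// Hypothetical stand-in for a world that owns its entities exclusively.
public sealed class MiniWorld
{
    private readonly Dictionary<int, object[]> _entities = new Dictionary<int, object[]>();
    private int _nextId;

    public int Count => _entities.Count;

    public int Create(params object[] components)
    {
        _entities[_nextId] = components;
        return _nextId++;
    }

    // The only way for an entity to cross worlds: an explicit call that removes it
    // here and recreates it in the destination, returning its id there.
    public int MoveTo(MiniWorld destination, int id)
    {
        object[] components = _entities[id];
        _entities.Remove(id);
        return destination.Create(components);
    }
}
```

A streaming world could, for example, build chunks in isolation and hand each one over to the game world only once it is ready.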
No need for singletons and/or managers
We have all fallen into the trap of singletons at one point or another, haven’t we? I certainly have!
With data-oriented design, the traditional advice of avoiding singletons and manager classes becomes moot: you have systems running in a world, and retrieving a system is always possible via world.GetExistingSystem<>(). No additional cruft needed.
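A sketch of why this removes the need for a global singleton, in plain C#. MiniWorld, GameSystem and ScoreSystem are hypothetical; GetExistingSystem only mirrors the name of the Unity call mentioned above and is not its implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class for systems.
public abstract class GameSystem
{
    public abstract void Update();
}

public sealed class ScoreSystem : GameSystem
{
    public int Score { get; private set; }
    public void Add(int points) => Score += points;
    public override void Update() { }
}

// Hypothetical stand-in for a world that owns one instance of each system type.
public sealed class MiniWorld
{
    private readonly Dictionary<Type, GameSystem> _systems = new Dictionary<Type, GameSystem>();

    public T CreateSystem<T>() where T : GameSystem, new()
    {
        var system = new T();
        _systems[typeof(T)] = system;
        return system;
    }

    // Returns the system owned by this world, or null if it was never created.
    public T GetExistingSystem<T>() where T : GameSystem =>
        _systems.TryGetValue(typeof(T), out var system) ? (T)system : null;
}
```

Each world owns its own instance, so there is no static state: two worlds can each run a ScoreSystem without interfering with one another.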
Do you see a common pattern in many of the characteristics discussed so far? If I were to put it in the form of a corollary, it would be something like:
Data-oriented design is abstraction-free.
This statement may be seen as contentious, so allow me to explain in more detail what I mean by it. As I gain more experience with data-oriented programming, I get the feeling that it promotes a certain sense of simplicity, clarity and explicitness that you do not usually find in object-oriented codebases, which tend to be very abstraction-heavy. Of course, in data-oriented design you still have some abstractions: the world, the entities, the components and the systems. They are just conceptually leaner, and the paradigm invites you to solve your specific problem as opposed to building layer upon layer of general-purpose code.
Does this mean data-oriented design is perfect and the cure to all our problems? Certainly not. There is no such thing as a perfect solution in the real world, and hard problems are hard. There are also several common misconceptions about the paradigm:
- Data-oriented design is all about performance. Hopefully this post shows a few ways in which there is more to it than that in terms of code quality and robustness.
- DOTS is all about ECS. It also has advancements such as Burst and a safe job system runtime that have profound implications in the way we write game code.
- Jobify everything. A lot of the current advice revolves around using jobified systems for everything, without considering whether it makes sense to do so in the first place. Part of the reasoning is that, even if you do not need the parallelism, you still get to Burst-compile your code. That is very nice indeed, but I believe it is always worth checking whether a given task really needs to run off the main thread, as the cost of running the code on the main thread may turn out to be lower than the cost of preparing a job for it. On the other hand, mixing main-threaded component systems with jobified component systems may result in unnecessary stalls if done indiscriminately. On a semi-related note: it would be really cool if at some point in the future we got out-of-the-box support for Burst outside of jobs.
Similarly, the paradigm has several challenges to overcome:
- It is new and different, and initially seems overwhelming.
- Not all problems in game development are about linearly iterating big arrays of similar data. We also have complex data relationships, graphs, etc.
- There is still a ton of work to do, particularly editor-wise, to achieve a workflow that is as intuitive and convenient as the classic, GameObject-based one. Project Tiny is the first step towards improving this area.
In all fairness, many of these have to do with DOTS still being in preview. We need more examples that are not focused only on performance, but also on how to write games in this new way. Really looking forward to the exciting future ahead of us!