Introduction to DirectX 12 Graphics Engine : Part 1
Over time, many game engines have evolved into complex systems with numerous components such as physics, audio, graphics, input, AI, networking, and more. As a result, it has become incredibly difficult for newcomers to dive into this area and gain experience. They are often intimidated by C++, millions of lines of code, and the overwhelming debugging processes typically involved in development.
Graphics engines extend beyond games. They’re widely used in animations (like Disney and Pixar), AI/VR frameworks (e.g., Nvidia Omniverse), and scientific visualizations (e.g., the Visualization Toolkit). To create virtual worlds that feel realistic, several components, APIs, and libraries come together to form what is known as a Graphics or Rendering Engine.
In this article, I will provide a basic overview of some key features of DirectX 12 for rendering. This is useful for beginners wanting to learn the API as well as those familiar with other graphics APIs who need to delve into Microsoft’s native API.
You can watch a demonstration of the engine I will often be referring to on YouTube and find the source code on GitHub.
Graphics APIs
OpenGL, Vulkan, Metal, Direct3D, and native PS4/5 interfaces are all considered Graphics APIs. Technically, they aren’t just APIs; they are dynamic libraries (.dll / .so) installed on your device that let any application call a function or use a class from the API to draw something on the screen.
Here are some of the most popular APIs for Nvidia graphics cards for Windows:
DirectX
- DirectX 9: `nvumdshim.dll` (user-mode driver shim layer)
- DirectX 10/11: `nvd3dum.dll` (32-bit user-mode driver) and `nvd3dumx.dll` (64-bit user-mode driver)
- DirectX 12: `nvldumdx.dll` (generic user-mode driver for DirectX 10 and above, including DirectX 12)
Vulkan
- 32-bit systems: `nvoglv32.dll`
- 64-bit systems: `nvoglv64.dll`
These DLLs are used to handle Vulkan API calls.
OpenGL
- 32-bit systems: `nvoglv32.dll`
- 64-bit systems: `nvoglv64.dll`
The Vulkan and OpenGL files are similarly named because Nvidia often uses the same driver to handle both APIs.
Direct3D 12 and Vulkan are both considered very complex APIs that share a lot of concepts. You can even find articles highlighting that.
However, with this complexity comes great responsibility.
In both of these APIs, you must manually control synchronization between command executions, precompile root signatures and pipeline states, and manage resources and descriptors with fine-grained control to optimize memory usage.
In contrast, OpenGL and DirectX 11 simplify much of this process by abstracting away the complexity, although command recording and submission in those APIs are essentially single-threaded.
Operating system interface
Before starting, we must consider which operating system the engine will run on, as different systems (Linux, Windows, MacOS) have different interfaces, functions, and libraries for creating windows and managing rendering contexts.
There are two main approaches:
- Using libraries like GLFW: This abstracts the differences between operating systems, allowing the same code to work on any platform supported by the library.
- Native API calls: This involves directly using Windows or Linux APIs, sacrificing portability but offering greater control.
The first approach may not support specific graphics APIs like DirectX, or may not offer versatile control over your window, so let’s go with the second one: the Windows operating system with DirectX 12 as the rendering backend.
Handling the window
In all APIs, the process of creating a window may come in different forms, but the fundamental point remains the same.
After specifying its parameters and creating the window, we receive a pointer to the window object itself, or a handle that identifies this specific window, such as `HWND` in the Win32 API used by DirectX applications. This handle allows us to receive events, or messages, from the window.
Keyboard inputs, mouse clicks, touchpad movements, close/collapse window buttons, etc., are all initially handled by the operating system and then transferred to the application that is awaiting these events.
On Windows, for example, we create the window through the Win32 API and register a function (the window procedure) that will be called on each input event:
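A minimal sketch of that setup, assuming a Unicode Win32 application; the class and window names are placeholders and error handling is omitted:

```cpp
#include <windows.h>

// Window procedure: the OS calls this for every message sent to the window.
LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch (msg)
    {
    case WM_KEYDOWN:  /* forward the key press to the input system */ return 0;
    case WM_DESTROY:  PostQuitMessage(0);                             return 0;
    }
    return DefWindowProcW(hwnd, msg, wParam, lParam); // default handling
}

HWND CreateEngineWindow(HINSTANCE instance, int width, int height)
{
    WNDCLASSEXW wc = {};
    wc.cbSize        = sizeof(wc);
    wc.lpfnWndProc   = WndProc;               // our message callback
    wc.hInstance     = instance;
    wc.lpszClassName = L"EngineWindowClass";  // placeholder class name
    RegisterClassExW(&wc);

    return CreateWindowExW(0, wc.lpszClassName, L"DirectX 12 Engine",
                           WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT,
                           width, height, nullptr, nullptr, instance, nullptr);
}
```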
Input
Even though this component is usually a separate, sizable chunk of code with its own managers and classes, standing apart from graphics and audio, it must be present in even the smallest graphics engine that expects user input.
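A minimal sketch of the message pump that hands those OS events to the window procedure shown above; `Update` and `Render` are placeholders for the engine’s per-frame work:

```cpp
MSG msg = {};
while (msg.message != WM_QUIT)
{
    if (PeekMessageW(&msg, nullptr, 0, 0, PM_REMOVE))
    {
        TranslateMessage(&msg);   // e.g. turns key presses into character messages
        DispatchMessageW(&msg);   // routes the message to WndProc
    }
    else
    {
        // Update();  // advance the simulation
        // Render();  // record and submit GPU work for the frame
    }
}
```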
Graphics card interface
Every API must interact with your graphics card for tasks such as resource allocation, debug layers, pipeline execution, or asynchronous computation. On top of that, a machine may have several graphics cards and monitors that you need to enumerate and choose between.
In DirectX 12, several key components make this possible:
- DXGI Factory: An abstraction interface that helps you enumerate your adapters (i.e., graphics cards and monitors). It also supports transitions between fullscreen and windowed modes, and, importantly, binds your window to a swap chain.
- Adapter: Represents your physical graphics card. Through this interface, you can retrieve information such as memory, device ID, feature support, performance characteristics, and more. In practice, you’ll use the Adapter interface to create another object that is used far more often — the Device.
- Device: A logical interface that, unlike the Adapter, allows you to directly operate on your graphics card. It enables you to allocate and manage resources through descriptors, create pipeline states, root signatures, and perform various other tasks.
Note: In the source code, you can refer to the `ODevice` class, which utilizes all of the abstractions described above.
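A minimal sketch of how these pieces fit together, roughly what `ODevice` wraps in the repository; error handling is omitted:

```cpp
#include <d3d12.h>
#include <dxgi1_6.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12Device> CreateDevice()
{
    // The factory enumerates adapters and later creates the swap chain.
    ComPtr<IDXGIFactory6> factory;
    CreateDXGIFactory2(0, IID_PPV_ARGS(&factory));

    // Walk the adapters by performance and take the first one
    // that supports Direct3D 12.
    ComPtr<IDXGIAdapter1> adapter;
    ComPtr<ID3D12Device>  device;
    for (UINT i = 0;
         factory->EnumAdapterByGpuPreference(
             i, DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE,
             IID_PPV_ARGS(&adapter)) != DXGI_ERROR_NOT_FOUND;
         ++i)
    {
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_12_0,
                                        IID_PPV_ARGS(&device))))
            break;
    }
    return device;
}
```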
Command queues
Once your adapters are ready, the next step is to determine what you need to execute on the graphics card (specifically, the device).
In DirectX, this process is handled by separate objects called the Command Queue and Command List. These objects are responsible for writing commands into GPU memory and executing them one after another.
A command queue executes command lists like this:

After writing all the necessary instructions to your command list and preparing it for execution, you call `Close` on it and then pass it to the queue.
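A minimal sketch of that flow, assuming the `device` created in the previous section; error checking is omitted:

```cpp
// Create a direct (graphics) queue, an allocator and a command list,
// record commands, close the list and submit it to the queue.
ComPtr<ID3D12CommandQueue>        queue;
ComPtr<ID3D12CommandAllocator>    allocator;
ComPtr<ID3D12GraphicsCommandList> cmdList;

D3D12_COMMAND_QUEUE_DESC queueDesc = {};
queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(&queue));
device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                               IID_PPV_ARGS(&allocator));
device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                          allocator.Get(), nullptr, IID_PPV_ARGS(&cmdList));

// ... record draw / copy / dispatch commands here ...

cmdList->Close();                              // finish recording
ID3D12CommandList* lists[] = { cmdList.Get() };
queue->ExecuteCommandLists(1, lists);          // submit to the GPU
```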
GPUs support parallel execution relative to different queues, which means you can use multiple queues simultaneously. For example, you can dedicate one queue for graphics pipeline execution and another for compute pipeline operations.
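For instance, a second queue created with `D3D12_COMMAND_LIST_TYPE_COMPUTE` can run independently of the direct (graphics) queue above; a minimal sketch:

```cpp
// A separate compute queue that can execute compute command lists
// in parallel with the graphics queue.
D3D12_COMMAND_QUEUE_DESC computeDesc = {};
computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
ComPtr<ID3D12CommandQueue> computeQueue;
device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
```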
But you may ask:
Why do we need them? Why not execute commands directly on the device?
The primary challenge with GPU execution is synchronization between the GPU and CPU. By maintaining a sequence (or queue) of instructions in GPU memory, you can mark a point in the command sequence and make the CPU wait until the GPU has reached it. This synchronization object is called a Fence.
A fence is also useful when you need to synchronize queues with each other on the GPU.
Once the GPU has executed all instructions up to that point, you can resume execution on the CPU side.
For example, suppose our command queue contains a set of instructions to render a scene from start to finish. We may need to wait for execution to complete before moving to the next frame.
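A minimal sketch of that per-frame wait, assuming the `device` and `queue` from the previous snippets:

```cpp
// Create a fence once at startup.
ComPtr<ID3D12Fence> fence;
UINT64 fenceValue = 0;
device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
HANDLE fenceEvent = CreateEventW(nullptr, FALSE, FALSE, nullptr);

// After submitting the frame's command lists:
++fenceValue;
queue->Signal(fence.Get(), fenceValue);        // GPU writes fenceValue when done

if (fence->GetCompletedValue() < fenceValue)   // GPU has not finished yet
{
    fence->SetEventOnCompletion(fenceValue, fenceEvent);
    WaitForSingleObject(fenceEvent, INFINITE); // block the CPU until it catches up
}
```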
You can see the code in the engine’s repository here.
Conclusion
DirectX 12 is not as complicated as it may seem at first glance. Like any graphics API, it defines a way to interact with a graphics card, but the underlying principles remain the same across different APIs.
You may even notice similarities between how CUDA and DirectX’s compute shaders operate.
In this article, we covered the basics of window setup, input handling, graphics card interaction, and command queues in DirectX 12. By mastering these core concepts, developers can efficiently harness GPU power for realistic rendering.