Thirty Days of Metal — Day 13: Depth

Warren Moore
5 min read · Apr 14, 2022

This series of posts is my attempt to present the Metal graphics programming framework in small, bite-sized chunks for Swift app developers who haven’t done GPU programming before.

If you want to work through this series in order, start here.

In the previous article, we looked at how to generate and draw 3D meshes with Model I/O and MetalKit. At long last, we’re ready to start drawing 3D objects in earnest.

Thinking Like a Painter

Up to this point, we haven’t thought much about what happens along the z axis. When we start drawing overlapping geometry, though, we have to consider how things are ordered along the line of sight.

Consider how an oil painter builds up a scene, first painting the visible portions of the background, then painting objects nearer the point of view. We could take a similar approach when drawing our virtual 3D scenes: make a list of the triangles in the scene, sort them from far to near, then draw them in order, replacing farther pixels with nearer pixels.

This approach, called the painter’s algorithm, works as long as there are no ambiguities caused by intersecting or overlapping triangles. But there are two problems. First, there often isn’t a way to unambiguously sort the triangles in a scene. Second, we might need to re-sort the triangle list every time an object or the point of view moves, and this is expensive.

Depth Buffering

The problems inherent in the painter’s algorithm are resolved by considering the question of depth at each pixel. We use a separate texture called the depth texture (commonly called the depth buffer) to store the depth of the closest fragment we’ve seen so far when drawing. When a fragment is rasterized, we check its corresponding pixel in the depth buffer and replace the values in both the depth texture and color texture if the current fragment is closer to the point of view.
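Conceptually, the test performed for each fragment looks something like the Swift sketch below. This is purely illustrative — the real work happens in fixed-function GPU hardware, not in code we write — and the names here are hypothetical:

import simd

// Illustrative only: roughly what the GPU does per fragment when depth
// testing is enabled and closer fragments should win.
func resolveFragment(depth: Float,
                     color: SIMD4<Float>,
                     storedDepth: inout Float,
                     storedColor: inout SIMD4<Float>) {
    // Keep the incoming fragment only if it is nearer than anything
    // previously drawn at this pixel.
    if depth < storedDepth {
        storedDepth = depth   // update the depth buffer
        storedColor = color   // update the color buffer
    }
}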

The Depth Buffer in Metal

Metal has several possible pixel formats for depth textures. We will look at pixel formats in more detail when we discuss texturing. For the moment, we can ask our MTKView to create a depth texture for us by setting its depthStencilPixelFormat property to MTLPixelFormat.depth32Float.

view.depthStencilPixelFormat = .depth32Float

Each pixel of the resulting depth texture will store a 32-bit floating-point value between 0 and 1 representing the relative distance to the nearest object. The depth texture will be cleared to 1.0 each frame by default, so this value represents the absence of objects (the nearest object is “infinitely far away”).
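If you ever want a different clear value, MTKView exposes a clearDepth property. Setting it is optional; the line below simply restates the default:

// Optional — MTKView already clears the depth texture to 1.0 each frame.
view.clearDepth = 1.0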

As we draw objects, they replace the values in the depth buffer. Below is a visualization of a depth texture after a sphere has been drawn. Note the contrast between the center and edges of the sphere: darker values are closer.

Depth-Stencil States

Configuring an MTKView to produce a depth texture causes the view to populate the depth attachment of the render pass descriptors it vends. This makes a depth texture available, but it is not sufficient on its own for the texture to be used during rendering. We need to tell Metal explicitly to use the depth texture.
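If you're curious, you can peek at the descriptor the view vends to see what it set up on our behalf. The sketch below is for illustration only; the exact load and store actions depend on the view's configuration:

// Inspecting the depth attachment MTKView configures for us.
if let renderPassDescriptor = view.currentRenderPassDescriptor {
    let depthAttachment = renderPassDescriptor.depthAttachment
    print(depthAttachment.texture?.pixelFormat as Any) // depth32Float
    print(depthAttachment.clearDepth)                  // 1.0 by default
}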

To do this, we create a depth-stencil state object. The depth-stencil state contains settings for configuring the GPU with our desired way of handling the depth buffer. There are two relevant bits of state we will consider here: enabling depth-write, and the depth comparison function.

To create a depth-stencil state, we first configure a depth-stencil descriptor:

let depthStencilDescriptor = MTLDepthStencilDescriptor()

We enable depth writing by setting isDepthWriteEnabled to true:

depthStencilDescriptor.isDepthWriteEnabled = true

We then configure the comparison function that should be used to determine whether a given fragment should overwrite the value already in the depth buffer. We already know that the depth buffer is cleared to 1.0, and that smaller values are closer, so we use MTLCompareFunction.less:

depthStencilDescriptor.depthCompareFunction = .less

Now that we have built our depth-stencil descriptor, we can hand it to a device and get back a depth-stencil state:

depthStencilState = device.makeDepthStencilState(descriptor: depthStencilDescriptor)!

When rendering, we use our depth-stencil state by setting it on the render command encoder prior to issuing any draw calls:

renderCommandEncoder.setDepthStencilState(depthStencilState)

Winding, Facing, and Culling

There are two other bits of render state that become relevant when we start drawing in 3D: front-face winding and cull mode.

Given three vertices connected into a triangle, it is ambiguous which side is the “front” without more context. Rotating the triangle 180 degrees around the y axis shows a different side, but are we looking at the front or back now? We resolve this by saying that the front face of a triangle is determined by the order in which its vertices are specified.

The choice of ordering is called the winding order, either clockwise or counterclockwise. If we use clockwise winding (the default in Metal), a triangle is facing us if its vertices appear in clockwise order from our perspective. The side that is not facing us is called the back face.

To override the default front-facing winding of clockwise, we call the setFrontFacing(_:) method on our render encoder:

renderCommandEncoder.setFrontFacing(.counterClockwise)
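As an example, a triangle whose vertices are listed in the order below appears counterclockwise when viewed from the +z side of the xy-plane, so with the setting above that side is its front face. The vertex data here is hypothetical, not taken from our Model I/O mesh:

import simd

// A single triangle in the xy-plane, wound counterclockwise when viewed
// from the +z side.
let positions: [SIMD3<Float>] = [
    SIMD3( 0.0,  0.5, 0.0), // top
    SIMD3(-0.5, -0.5, 0.0), // bottom left
    SIMD3( 0.5, -0.5, 0.0)  // bottom right
]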

Now that we have a notion of front and back for triangles, we can save some processing time by telling Metal to ignore triangles that are facing away from the virtual camera. We call this process back-face culling.

We can tell Metal to cull back faces, cull front faces, or disable culling by calling setCullMode(_:) on the render command encoder:

renderCommandEncoder.setCullMode(.back)
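Putting the pieces from this article together, the encoder configuration for a depth-tested draw might look roughly like this. The pipeline state and the draw calls themselves come from the previous articles and are assumed to exist here:

// Configure the encoder before issuing any draw calls.
renderCommandEncoder.setRenderPipelineState(renderPipelineState)
renderCommandEncoder.setDepthStencilState(depthStencilState)
renderCommandEncoder.setFrontFacing(.counterClockwise)
renderCommandEncoder.setCullMode(.back)
// ...bind vertex buffers and draw the mesh as before...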

And that’s all you need to know to get started using the depth buffer in Metal.

To visualize the 3D surface normals of the sphere, I have replaced the basic lighting fragment function from last time with the following even simpler function:

fragment float4 fragment_main(VertexOut in [[stage_in]]) {
    // Renormalize the interpolated surface normal.
    float3 N = normalize(in.normal);
    // Remap the normal's components from [-1, 1] to [0, 1] so they can be
    // displayed as an RGB color.
    float3 color = N * float3(0.5) + float3(0.5);
    return float4(color, 1);
}

Next time we will introduce perspective projection, an essential tool for creating a more convincing illusion of depth in larger 3D scenes.
