Making Your First Circle Using Metal Shaders

Alex Barbulescu
Apr 17, 2019 · 19 min read

Metal Shaders? Render Pipeline? Vertex Shaders? Fragment Shaders? If you are anything like me, these words and phrases are meaningless or confusing at first. This tutorial is meant to give you an easy footing on how it all works so that you can build from there.

Setup

We’re going to start off with a macOS application so that we can run directly on our Mac’s GPU. If you want to do this in an iOS application you’ll have to run it on a physical device, as the iOS simulators did not support Metal at the time of writing.

We’ll now be adding in our NSView subclass (essentially a UIView for Mac) called “MetalCircleView”. This is where we’ll be doing the heavy lifting. The application’s root view controller (called ViewController) will just be displaying this NSView.

The first thing we want to do is set up our init functions. We’ll be ignoring the draw function that the class template came with and going with our own instead.
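A minimal sketch of what those init functions could look like. The class name MetalCircleView comes from the text above; the setupView() helper is a hypothetical name used only for illustration, and the original project may structure this differently.

```swift
import AppKit

final class MetalCircleView: NSView {

    // Called when the view is created in code.
    override init(frame frameRect: NSRect) {
        super.init(frame: frameRect)
        setupView()
    }

    // Called when the view is loaded from a storyboard or nib.
    required init?(coder: NSCoder) {
        super.init(coder: coder)
        setupView()
    }

    // Hypothetical helper: setup shared by both init paths.
    private func setupView() {
        // The view is positioned with Auto Layout, so opt out of autoresizing masks.
        translatesAutoresizingMaskIntoConstraints = false
    }
}
```

Routing both initializers through one helper keeps the later Metal setup in a single place.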

We’ll now add the view to the window from within our ViewController, using Auto Layout constraints.
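One way to express those constraints is a small helper that pins a child view to all four edges of its parent. The function name embed(_:in:) is an assumption for this sketch, not something from the original project.

```swift
import AppKit

/// Hypothetical helper: adds `child` to `parent` and pins all four edges with Auto Layout.
func embed(_ child: NSView, in parent: NSView) {
    child.translatesAutoresizingMaskIntoConstraints = false
    parent.addSubview(child)

    // With every edge pinned, the child always fills the parent, even when the window resizes.
    NSLayoutConstraint.activate([
        child.leadingAnchor.constraint(equalTo: parent.leadingAnchor),
        child.trailingAnchor.constraint(equalTo: parent.trailingAnchor),
        child.topAnchor.constraint(equalTo: parent.topAnchor),
        child.bottomAnchor.constraint(equalTo: parent.bottomAnchor)
    ])
}
```

Inside the ViewController’s viewDidLoad() this would be called with the MetalCircleView instance as the child and the controller’s view as the parent.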

Now if we run it, we should get an empty window!

Setting Up Our MetalKit View

  1. Import MetalKit into your MetalCircleView file
  2. Declare our MTKView (Metal-Kit-View) as a class instance variable
  3. Constrain it to our view
  4. Set ourselves as its delegate by conforming to MTKViewDelegate

We now have optional fields to set up for our MetalView which are discussed here in the documentation.

Telling the MTKView how/when to “update”

We need to tell our view how and when it should redraw itself. We have 3 options:

  1. We let it redraw itself based on its internal timer (continuous)
  2. We tell it when to redraw itself using a setter which will happen based on its internal timer (initiated by us)
  3. We directly tell it to draw ignoring its internal timer (initiated by us)

We’ll be going with number 2, as we’re only drawing once and will be relying on the view’s currentRenderPassDescriptor (more on this later). As per the documentation, we need to pause it using metalView.isPaused = true and enable set-needs-display using metalView.enableSetNeedsDisplay = true. This tells the view that it should stay paused and wait for us to tell it when it needs to display something.
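The three options map onto two MTKView properties. The sketch below spells out that mapping; the RedrawMode enum and the configure(_:for:) function are hypothetical names introduced for illustration.

```swift
import MetalKit

/// Hypothetical enum naming the three redraw options listed above.
enum RedrawMode {
    case continuous        // 1. the view's internal timer drives every frame
    case onSetNeedsDisplay // 2. redraw after we mark the view as needing display
    case explicit          // 3. redraw only when we call draw() ourselves
}

/// Applies the isPaused / enableSetNeedsDisplay combination for the chosen mode.
func configure(_ metalView: MTKView, for mode: RedrawMode) {
    switch mode {
    case .continuous:
        metalView.isPaused = false
        metalView.enableSetNeedsDisplay = false
    case .onSetNeedsDisplay:
        metalView.isPaused = true
        metalView.enableSetNeedsDisplay = true
    case .explicit:
        metalView.isPaused = true
        metalView.enableSetNeedsDisplay = false
    }
}
```

For this tutorial only the .onSetNeedsDisplay case is needed; the other two are shown for comparison.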

Connecting it to the device’s GPU

Our MTKView needs to be connected to a device which is of type MTLDevice. You can essentially think of this device as the GPU itself.

The MTLDevice protocol defines the interface to a GPU

We can fetch the GPU at run time using MTLCreateSystemDefaultDevice() in iOS, tvOS, and macOS. There is another option available for fetching a specific GPU (useful if you want to target a Mac’s dedicated GPU or integrated GPU) but that’s beyond the scope of this tutorial.

We want this Metal device to be available throughout the class, so we declare it as a class instance variable, initialize it in our setupMetal() function and set it as our metalView’s device.

metalDevice = MTLCreateSystemDefaultDevice()

Our final product now looks like this and we’re ready to get started with setting up our rendering functionality!

Setting up our Rendering Functionality

Creating the Command Queue

The first thing we need to do is make a command queue, an MTLCommandQueue. This queue is tied to our device (our GPU interface), and we use it to send instructions to our GPU. The instructions are represented by an MTLCommandBuffer, which is created from the command queue for it to execute.

Knowing this information, at initialization time we want to create the command queue and keep reference to it as an instance variable. Then every time we want to render something we need to create a command buffer object to hold our instructions.

Since the command queue is unique to our device, we use our device to create it! We’ll want to add this as part of the setupMetal() function.

metalCommandQueue = metalDevice.makeCommandQueue()!

(At this point you should be wondering why I’m force unwrapping. Make sure you handle your optionals properly!) After setting up the command queue our code should look like this.
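For reference, here is one way to do the same setup without force unwrapping. The MetalSetupError type and the makeDeviceAndQueue() function are assumptions made for this sketch; the original code force unwraps instead.

```swift
import Metal

/// Hypothetical error type for this sketch.
enum MetalSetupError: Error {
    case noDevice
    case noCommandQueue
}

/// Returns the default GPU and a command queue created from it,
/// throwing instead of crashing when either one is unavailable.
func makeDeviceAndQueue() throws -> (device: MTLDevice, queue: MTLCommandQueue) {
    guard let device = MTLCreateSystemDefaultDevice() else {
        throw MetalSetupError.noDevice
    }
    guard let queue = device.makeCommandQueue() else {
        throw MetalSetupError.noCommandQueue
    }
    return (device, queue)
}
```

The caller (for example setupMetal()) can then decide how to react, such as showing a fallback view on machines without Metal support.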

Issuing Our First GPU Command!

We now have the basic setup and knowledge to issue our first GPU command. We’ll be rendering an RGBA color value to our MTKView.

First thing we want to do inside draw is to create our commandBuffer. This will contain the instructions we need to execute our commands!

Creating the pipeline

Our command buffer needs a pipeline to be fed through. The pipeline needs internal information and interface information. We use an MTLRenderPassDescriptor to configure the interface information. For this tutorial we don’t need to create our own; we can fetch the default one from the MTKView using .currentRenderPassDescriptor.

Now, accessing our render pass descriptor’s colorAttachments array property, we can set the clearColor of its 0th entry, which describes the color data for the texture assigned to the view’s current drawable. More simply, this can be thought of as the “background color” for our Metal view.

Next we need an MTLRenderCommandEncoder to configure the inside of the pipeline. It’s created from our commandBuffer using the renderDescriptor.

From here we can start inputting vertex data and drawing commands to be drawn on the GPU, or better thought of as “encoding” commands for the GPU to run. For now we’re not ready to encode any real drawing commands so we’ll leave that for later (I lied to you in the section title :p). We want to see that beautiful blue background color in our MTKView!

We need to do 4 things to end the encoding and fire off the commandBuffer to be executed on the GPU and displayed in our view!

  1. End the encoding.

renderEncoder.endEncoding()

2. Tell the GPU where to send the rendered result.

commandBuffer.present(view.currentDrawable!)

We use the MTKView’s currentDrawable, a drawable representing the current frame. An MTLDrawable is a “displayable resource that can be rendered or written to.”

3. Commit the command buffer so that it is scheduled for execution on our metalCommandQueue

commandBuffer.commit()

4. Tell our Metal view to draw, which triggers the draw method. We’ll be adding this at the end of our setupMetal() function but you can call it anywhere you’d like (after you’ve set up the Metal components of course).

metalView.needsDisplay = true

Our draw function should now look like this.
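A sketch of a draw function that follows the steps above, with the optionals handled by guard statements. In the tutorial the MetalCircleView itself is the delegate; the separate ClearColorRenderer class here is an assumption that keeps the example self-contained, and the exact shade of blue is arbitrary.

```swift
import MetalKit

final class ClearColorRenderer: NSObject, MTKViewDelegate {
    private let commandQueue: MTLCommandQueue

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        commandQueue = queue
        super.init()
    }

    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
        // Nothing to update when only clearing the view.
    }

    func draw(in view: MTKView) {
        // Skip this frame if any per-frame object is unavailable.
        guard let commandBuffer = commandQueue.makeCommandBuffer(),
              let renderDescriptor = view.currentRenderPassDescriptor,
              let drawable = view.currentDrawable else { return }

        // The "background color" of the view: opaque blue.
        renderDescriptor.colorAttachments[0].clearColor =
            MTLClearColor(red: 0, green: 0, blue: 1, alpha: 1)

        guard let renderEncoder =
            commandBuffer.makeRenderCommandEncoder(descriptor: renderDescriptor) else { return }
        // Drawing commands will be encoded here later.
        renderEncoder.endEncoding()

        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}
```

Note that MTKView holds its delegate weakly, so a separate renderer object like this one has to be kept alive by a property elsewhere.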

If you hit run you should see a blue screen!

Note:

Earlier, when we were choosing how to update our MTKView, I mentioned we went with triggering the redraw ourselves through the view’s internal timer because of our reliance on the currentRenderPassDescriptor. If we had gone with issuing the draw() command manually, ignoring its timer, we would have had to call it twice, as the first time the view would not have had a currentRenderPassDescriptor.

We now need to encode commands into the renderEncoder to let it know what to draw from vertex points passed in. We also need a way to represent this information such that we can create it in our view and the metal shader can use it properly as well. But first we need to go through a high level overview of how the GPU actually draws stuff.


Pipeline Stages

Encoding Drawing Commands / Vertex Data: The data that the GPU receives, and that must be processed in the pipeline.

Vertex Shader: Converts the 3D vertex locations into 2D screen coordinates. It also passes vertex data down the pipeline.

Tessellation: Subdivides triangles into further triangles to provide higher-quality results.

Rasterization: Discretizes the 2D geometric data into 2D discrete pixels. This will also take data attached to each vertex and interpolate it over the whole shape to every rasterized pixel.

Fragment Shader: Given the interpolated pixel data from the rasterizer, the fragment shader determines the final color of each pixel.

Full Credit for the Pipeline Stages section goes to Donald Pinckney (source)

Shaders

Metal supports 3 types of shader functions: vertex, fragment, and compute (kernel). The vertex and fragment functions describe stages of the render pipeline.

(Source)

Vertex Shaders: A function used to manipulate the vertex points of a polygon. It runs on each vertex point we pass in. Here we can manipulate the position of the vertex points and other properties such as the color.

Ex. In the vertex shader I can manipulate each vertex position, so if I wanted to I could pass in points to make a circle and then manipulate them into a square. I can also pass in a color for each vertex point and then change it inside the function as well.

Fragment Shaders: A function used to manipulate how the pixels between vertices look. It runs on each pixel between a set of vertex points. Here we can return the color information of each pixel.

Uniform Scalars: At this point you might wonder, what about passing in scalars? Let’s say we have a constant Float that represents the position multiplier of our object; by changing this constant we can make our polygon bigger or smaller. This is called a uniform because the same value is applied uniformly to all the points, i.e. it doesn’t change from vertex to vertex within a draw call.

Primitives

At the lowest levels, GPUs are designed to render triangles. Triangles are the easiest and most versatile objects for it to work with and that is what today’s hardware focuses on doing (StackOverflow explanation here). That doesn’t mean we can only tell the GPU to draw triangles. For example if you’re using a framework that supports quads (rectangles) you could pass it 4 points and tell it to draw a rectangle. This makes it easier on the programmer but in reality the GPU still breaks down that instruction into 2 triangle instructions. Think of it like writing a complicated line of code in a high level language. When the code gets compiled into assembly that “one” instruction gets broken down into a series of multiple instructions that the CPU can actually execute.

MTLPrimitiveType The geometric primitive type for drawing commands.

  1. point: rasterizes a point at each vertex
  2. line: rasterizes a line between each separate pair of vertices (makes unconnected lines)
  3. lineStrip: rasterizes a line between each pair of adjacent vertices (makes a series of connected lines)
  4. triangle: rasterizes a triangle for every separate triplet of vertices
  5. triangleStrip: rasterizes a triangle for every three adjacent vertices
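The difference between triangle and triangleStrip is easiest to see by counting how many triangles each one assembles from the same vertices. The triangleCount(for:vertexCount:) helper below is a hypothetical function written for this comparison, not part of Metal.

```swift
import Metal

/// Number of triangles assembled from `vertexCount` vertices,
/// or nil for the primitive types that do not produce triangles.
func triangleCount(for type: MTLPrimitiveType, vertexCount: Int) -> Int? {
    switch type {
    case .triangle:
        // Every separate triplet forms one triangle; leftover vertices are unused.
        return vertexCount / 3
    case .triangleStrip:
        // After the first two vertices, every additional vertex adds one triangle.
        return max(0, vertexCount - 2)
    default:
        return nil
    }
}
```

For six vertices, .triangle yields 2 triangles while .triangleStrip yields 4, because the strip reuses the previous two vertices for each new triangle.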

To summarize we need 3 high level steps to make a circle.

  1. Create the vertex points on the CPU
  2. Send the vertex points to the vertex shader
  3. Apply the color in the fragment shader

Setting Up Our Metal File

Now there’s multiple ways to do this. Essentially what we need is to specify a library for our render encoder to use. This metal library is constructed from .metal files. In the .metal file we can specify the shader functions. Fun fact: you can also build the library at run time from a string.

First thing we need to do is create our metal file in our project folder. This is just like adding a new file except we select “Metal” instead of “Cocoa Class” or “Swift”

Go ahead and name it CircleShader.metal

Opening it up, we see that we’re including the Metal Standard Library and using the metal namespace. The language used here is the Metal Shading Language (MSL). If you’ve ever worked with C++ you’ll notice it looks familiar; this is because MSL is based on C++.

Creating a Data Structure to communicate our Vertex Points to the GPU

We need a common language between our swift file for our vertex points and our metal files. We need to be able to create our vertex points in swift (CPU side) then read them in Metal (GPU side). Container types for the data need to be consistent.

First let’s look at what we need to represent a vertex point, we need a position variable holding 2 coordinates and we need a color variable holding the color information for the point.

If we want to carry 2 sets of information for 1 vertex point then we’ll go with a struct.

struct VertexIn {
    var position: vector_float2 // <x, y>
    var color: vector_float4    // <R, G, B, A>
}

If we want to carry only 1 set of information then we don’t need a struct.

var verticesForCircle = [vector_float2]() //array of <x,y>

Let’s say we want our circle to be a solid color. Then it makes no sense to pass in a color with our vertex data, as we can just hardcode it in the shader functions. For this reason we’ll be going with just a vector float rather than a struct.

You might also notice that the examples above contain vector_floats. Alongside the Accelerate framework, Apple provides the simd library for vectors. It was built for C and C++ and is also available in Swift, so we’ll use it to represent our values.

Importing simd into our .metal and .swift files

simd C++ (metal)

#include <simd/simd.h>

Declaring a vector:

vector_float2 varName;

simd Swift

import simd

Declaring a vector:

let varName : simd_float2

Using the SIMD library we ensure that our data is being represented consistently in memory across the CPU and the GPU.

Creating the Vertex Points For Our Circle

We can now create our vertex points for our circle! Our first step is to think of how the GPU draws primitives. The more triangles we render, the smoother the circle.

There’s 2 options here.

  1. Calculate all the points around the perimeter of the circle and insert the origin point after every 2 of them. When we only have a few triangles you can easily see that we’re really just trying to make enough triangles to hide the flat outer edges.
  2. Don’t use the origin and instead make all the vertices of the triangle touch the perimeter.

There’s no right or wrong answer here so we’ll go with the easier option (option 1)

We’re going to create an instance variable called circleVertices and a function called createVertexPoints(). Inside the createVertexPoints() function we’ll want a helper function to convert degrees to radians, as we’ll be using Swift’s trigonometry functions.

Our MetalCircleView class should now look like this:

Since there are 360 degrees in a circle we can make n*360 perimeter points (where n represents non-zero, positive integers) with (n*360)/2 origin points. Essentially the bigger n is the more triangles we render and the smoother the circle is. Fortunately, n=2 is good enough for us.

I’ll skimp on the trigonometry lesson but here’s how we get 720 perimeter points.

Now in between every 2 perimeter points we need to form a triangle with the origin.
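The two steps above can be sketched as a single function. This is an illustration under stated assumptions rather than the original implementation: the function name makeCircleVertices(pointsPerDegree:) is hypothetical, and it returns the array instead of filling the circleVertices instance variable so that it can be run on its own.

```swift
import Foundation
import simd

/// Builds the vertex list for a unit circle: two perimeter points followed by the origin, repeated.
/// `pointsPerDegree` plays the role of n in the text (n = 2 gives 720 perimeter points).
func makeCircleVertices(pointsPerDegree n: Int = 2) -> [simd_float2] {
    func radians(fromDegrees degrees: Float) -> Float {
        return degrees * Float.pi / 180
    }

    let perimeterCount = n * 360
    let origin = simd_float2(0, 0)
    var vertices = [simd_float2]()
    vertices.reserveCapacity(perimeterCount + perimeterCount / 2)

    for i in 0..<perimeterCount {
        // Step 1: a point on the perimeter, every 1/n of a degree.
        let angle = radians(fromDegrees: Float(i) / Float(n))
        vertices.append(simd_float2(cos(angle), sin(angle)))

        // Step 2: after every second perimeter point, add the origin to complete a triangle.
        if i % 2 == 1 {
            vertices.append(origin)
        }
    }
    return vertices
}
```

With n = 2 this produces 720 perimeter points plus 360 origin points, 1080 vertices in total.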

It’s worth noting that the points we’re creating are normalized to the screen. In Apple’s Hello Triangle Example it’s defined as

The vertex function translates arbitrary vertex coordinates into normalized device coordinates, also known as clip-space coordinates. Clip space is a 2D coordinate system that maps the viewport area to a [-1.0, 1.0] range along both the x and y axes.

What this means is that the area we can render points in goes from -1.0 to 1.0 on both the x and y axes and that this coordinate system maps to a viewport area. In our case, we haven’t touched the viewport area, so the viewport area is our entire MTKView.

We’re now ready to send this data to the GPU and create our shader functions :)

Setting up the shader functions

Pointers and memory

In Metal Shading Language Specification Chapter 4

Arguments to Metal graphics and kernel functions declared in a program that are pointers must be declared with the Metal device, threadgroup, threadgroup_imageblock, or constant address space attribute.

These specify which address space on the GPU the array should be stored in. The device attribute specifies a read-write address space and constant specifies a read-only address space.

Program scope function constants

Program scope variables declared with (or initialized with) the following attribute are function constants: [[function_constant(index)]]

Attributes like these are usually applied to parameters to let Metal know where to pass in specific data.

First I’ll show you the template and then explain what’s going on.

vertex function

const constant vector_float2 *vertexArray [[buffer(0)]]

The first parameter takes in the array of vertex points that we’ll be passing in. Breaking down the syntax, we see we have a pointer to an array of vector floats. The vertex data, as you will see soon, needs to be passed in as “buffer data”. The [[buffer(0)]] attribute specifies that we want the first (and our only) buffer to be passed into this parameter. The constant attribute tells Metal to store the vertex data in a read-only memory space.

unsigned int vid [[vertex_id]]

The second parameter, vid, stands for “vertex id”. It uniquely identifies which vertex we’re currently on and is used as the index into our vertexArray. Just as we told Metal what to pass into the vertexArray parameter, we tell Metal to pass the vertex id into the vid parameter using [[vertex_id]].

VertexOut

The output is of type VertexOut, which holds a position vector and a color vector. The output first goes through tessellation/rasterization, so the [[position]] attribute tells Metal to use the position field of the struct as the normalized screen position. You may have noticed by now that this is a 4D field instead of the 2D value we pass in for position. The 3rd and 4th coordinates represent depth and the homogeneous coordinate, something we don’t have to worry about here. The VertexOut struct is then fed into the input of our fragment function, where we’ll use the color field.

fragment function

VertexOut interpolated [[stage_in]]

We have only one input parameter here, of type VertexOut, called interpolated. The [[stage_in]] attribute tells Metal that this parameter should be fed the interpolated result of the rasterizer.

The output is just an <R,G,B,A> color that we fetch from the VertexOut struct that was passed through from the vertexShader function.

Populating The Shader Functions

  1. We get the current vertex from the buffer using the vertex id
  2. We initialize the output of type VertexOut
  3. We set the output’s 4D position information with just the 2D position from our currentVertex point
  4. We return the output to be rasterized and then passed into our fragment shader
  5. In our fragment shader we just return the color
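Put together, the five steps could look like the sketch below. To keep to one file, the Metal Shading Language source is held in a Swift string and compiled at run time with makeLibrary(source:options:); in the tutorial’s project the same functions would live in CircleShader.metal. The sketch uses Metal’s built-in float2 and float4 types in place of the vector_float2 and vector_float4 spellings from simd, and the white color is an arbitrary choice.

```swift
import Metal

// MSL source for the two shader functions, numbered to match the steps above.
let circleShaderSource = """
#include <metal_stdlib>
using namespace metal;

struct VertexOut {
    float4 position [[position]];
    float4 color;
};

vertex VertexOut vertexShader(const constant float2 *vertexArray [[buffer(0)]],
                              unsigned int vid [[vertex_id]]) {
    // 1. Fetch the current vertex from the buffer using the vertex id.
    float2 currentVertex = vertexArray[vid];
    // 2. Initialize the output.
    VertexOut output;
    // 3. Expand the 2D position to 4D (z = 0, w = 1) and set a solid color.
    output.position = float4(currentVertex.x, currentVertex.y, 0, 1);
    output.color = float4(1, 1, 1, 1);
    // 4. Return the output to be rasterized.
    return output;
}

fragment float4 fragmentShader(VertexOut interpolated [[stage_in]]) {
    // 5. Return the interpolated color.
    return interpolated.color;
}
"""

/// Compiles the source above at run time; throws if the shader does not compile.
func makeCircleLibrary(device: MTLDevice) throws -> MTLLibrary {
    return try device.makeLibrary(source: circleShaderSource, options: nil)
}
```

Compiling from a string is convenient for experiments; shipping a .metal file lets Xcode compile the shaders at build time instead.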

Some interesting notes:

  • If you don’t include the [[position]] attribute in the struct then you’ll get a compile error telling you that VertexOut is an invalid return type.
  • If you’re only returning a vector_float4 with no struct, Metal will automatically infer that it holds the position coordinates.

Optimizations Here

You may have noticed that instead of passing through the color we can just hardcode its return value in the fragmentShader itself. This is a good optimization for us (having a solid color for the circle) but it’s not a scalable solution for anything else.

Setting up our rendering pipeline

This is our last step! Hooray. Now that we have the vertex points to make the circle and the metal shaders to render it, all we have to do is to use our metal shaders as part of our pipeline and feed it in the vertex points as buffer data!

Here’s where we left off last time in our draw function in the MetalCircleView class.

We created a command buffer to be added to our commandQueue, which was created for our GPU interface. We set up the input and output of the pipeline. Now all that’s left is to tie the renderEncoder (or the “inside of our pipeline”) to our shader functions and pass it our vertex points as buffer data!

Tying in our metal functions into our renderEncoder

The first step is to create an MTLRenderPipelineState.

To use MTLRenderCommandEncoder to encode commands for a rendering pass, specify a MTLRenderPipelineState object that defines the graphics state, including vertex and fragment shader functions, before issuing any draw calls.

To create the pipeline state we need an MTLRenderPipelineDescriptor.

An argument of options you pass to a device to get a render pipeline state object.

So we’re going to create a new class instance variable for the MTLRenderPipelineState and a function to create the MTLRenderPipelineState, which we’ll call in the setupMetal() function right before making our view draw.

To create the pipeline state we need to

  1. Create the pipeline descriptor
  2. Find our metal files using the GPU Interface
  3. Tell the pipeline descriptor what our vertex and fragment functions are called
  4. Tell the pipeline descriptor in what format to store the pixel data
  5. Create the pipeline state from the pipeline descriptor

As usual, make sure to handle throws and optionals properly (do as I say not as I do :p )
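Here is a sketch of those five steps with the throws and optionals handled. The PipelineError type and the makeCirclePipelineState(device:pixelFormat:) function are hypothetical names; the shader function names vertexShader and fragmentShader follow the ones used in this tutorial. It assumes the .metal file is compiled into the app’s default library.

```swift
import Metal

/// Hypothetical error type for this sketch.
enum PipelineError: Error {
    case missingLibrary
    case missingFunction(String)
}

func makeCirclePipelineState(device: MTLDevice,
                             pixelFormat: MTLPixelFormat) throws -> MTLRenderPipelineState {
    // 1. Create the pipeline descriptor.
    let descriptor = MTLRenderPipelineDescriptor()

    // 2. Find the compiled .metal files bundled with the app.
    guard let library = device.makeDefaultLibrary() else {
        throw PipelineError.missingLibrary
    }

    // 3. Tell the descriptor which vertex and fragment functions to use.
    guard let vertexFunction = library.makeFunction(name: "vertexShader") else {
        throw PipelineError.missingFunction("vertexShader")
    }
    guard let fragmentFunction = library.makeFunction(name: "fragmentShader") else {
        throw PipelineError.missingFunction("fragmentShader")
    }
    descriptor.vertexFunction = vertexFunction
    descriptor.fragmentFunction = fragmentFunction

    // 4. Store pixel data in the same format as the view we render into.
    descriptor.colorAttachments[0].pixelFormat = pixelFormat

    // 5. Create the pipeline state from the descriptor.
    return try device.makeRenderPipelineState(descriptor: descriptor)
}
```

Passing the view’s colorPixelFormat as the pixelFormat argument keeps the pipeline and the MTKView in agreement.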

Now to connect it to our render encoder all we have to do is use its setRenderPipelineState function.

We’re now ready to draw primitives from our vertex points!

Turning the vertex points into buffer data

First we need to create the buffer data which is of type MTLBuffer . The documentation on this is worth the read to understand what’s going on.

A MTLBuffer object can be used only with the MTLDevice that created it. Don’t implement this protocol yourself; instead, use the following MTLDevice methods to create MTLBuffer objects:

  1. makeBuffer(length:options:) creates a MTLBuffer object with a new storage allocation.
  2. makeBuffer(bytes:length:options:) creates a MTLBuffer object by copying data from an existing storage allocation into a new allocation.
  3. makeBuffer(bytesNoCopy:length:options:deallocator:) creates a MTLBuffer object that reuses an existing storage allocation and does not allocate any new storage.

We want to go with 2, as we already have the data stored in our circleVertices array.

We declare our vertexBuffer at the top as an instance variable

private var vertexBuffer : MTLBuffer!

and then populate it inside of our setupMetal() function

vertexBuffer = metalDevice.makeBuffer(bytes: circleVertices, length: circleVertices.count * MemoryLayout<simd_float2>.stride, options: [])!

The makeBuffer function takes “length” number of bytes from our circleVertices and stores it into GPU/CPU accessible memory. For the length we get the stride (The number of bytes from the start of one instance of T to the start of the next when stored in contiguous memory or in an Array<T> ) from the MemoryLayout of the data’s type (in our case a simd_float2) and multiply it by the number of entries of that type we have in the array.
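To make the length calculation concrete, here is a small sketch. The byteLength(of:count:) and makeVertexBuffer(device:vertices:) helpers are hypothetical names; the second returns an optional rather than force unwrapping.

```swift
import Metal
import simd

/// Bytes needed to store `count` contiguous elements of type T.
func byteLength<T>(of type: T.Type, count: Int) -> Int {
    return count * MemoryLayout<T>.stride
}

/// Copies the vertices into a new MTLBuffer.
/// Returns nil for an empty array or if the allocation fails.
func makeVertexBuffer(device: MTLDevice, vertices: [simd_float2]) -> MTLBuffer? {
    guard !vertices.isEmpty else { return nil }
    let length = byteLength(of: simd_float2.self, count: vertices.count)
    return device.makeBuffer(bytes: vertices, length: length, options: [])
}
```

A simd_float2 holds two 4-byte floats, so its stride is 8 bytes and our 1080 vertices need 8640 bytes.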

Tying it all together we’re left with this

Drawing our first primitive (the circle!)

We’re almost there! We have everything in place to issue our draw command on the render encoder. Here is where the documentation for the MTLRenderCommandEncoder really becomes important. There are 2 notable sections:

  1. Specifying Resources For A Vertex Function (Buffer data)

func setVertexBuffer(MTLBuffer?, offset: Int, index: Int)

Sets a buffer for the vertex function.

Remember how we used the [[buffer(some index)]] attribute for our vertexArray parameter in our vertex shader function? Well in our draw function we can set the vertexBuffer at a specific index such that metal knows which input parameter to pass it to.

renderEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)

Here setting the index to 0 corresponds to the [[buffer(0)]] attribute. The offset specifies the starting point of our buffer data that we want to assign to that index. Since we care about all of our vertex points we set the offset to 0.

2. Drawing Geometric Primitives

func drawPrimitives(type: MTLPrimitiveType, vertexStart: Int, vertexCount: Int)

Encodes a command to render one instance of primitives using vertex data in contiguous array elements.

This is what triggers our vertexShader function to run. Everything we’ve done so far has been for this moment. We tell our renderEncoder to draw a specific primitive (remember when we went over the MTLPrimitiveTypes), what vertex to start from and the vertexCount.

You may be wondering why we need to specify vertexStart and vertexCount. This is needed when you want to draw different primitive types in the same render pass. If your first 1000 vertices are for triangles and the next 1000 are for lines, you need to specify which vertex the next primitive type starts from.

renderEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 1080)

We have 1080 vertex points and we want to render triangles from the very first point.

Finally our draw function should look like this
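The part that is new in the draw function can be sketched as a helper that runs between creating the render encoder and calling endEncoding(). Both function names below are assumptions for illustration; the vertex count is derived from the buffer length instead of being hardcoded to 1080.

```swift
import Metal
import simd

/// Number of simd_float2 vertices stored in a buffer of `length` bytes.
func vertexCount(inBufferOfLength length: Int) -> Int {
    return length / MemoryLayout<simd_float2>.stride
}

/// Hypothetical helper: encodes the pipeline state, the vertex data and the draw call.
func encodeCircle(with renderEncoder: MTLRenderCommandEncoder,
                  pipelineState: MTLRenderPipelineState,
                  vertexBuffer: MTLBuffer) {
    // Use our vertex and fragment shader functions for this pass.
    renderEncoder.setRenderPipelineState(pipelineState)

    // index 0 matches [[buffer(0)]] in the vertex function.
    renderEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)

    // Draw separate triangles starting from the very first vertex.
    renderEncoder.drawPrimitives(type: .triangle,
                                 vertexStart: 0,
                                 vertexCount: vertexCount(inBufferOfLength: vertexBuffer.length))
}
```

Everything else in the draw function (command buffer, render pass descriptor, present and commit) stays as it was for the blue background.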

All we need to do is press run and we should see our first circle!

Wait a minute…. Now that doesn’t look like a complete circle. We can clearly see the distinction between the triangles from the origin. Not to mention these weird artifacts from the rendering. Looks like our original idea for the triangles doesn’t make sense.

Let’s look back at the triangle primitive options, we have 2:

  1. triangle: rasterizes a triangle for every separate triplet of vertices
  2. triangleStrip: rasterizes a triangle for every three adjacent vertices

What if we change the primitive type from triangle to triangleStrip?

We now have a full circle, hooray! We’ve essentially closed the gaps by drawing more triangles with the points we’ve created.

A visual representation of how the gaps were filled with more triangles using another color

To tie it all together our MetalCircleView class should look like this:

Full source code on my github here


Remaining Question: The Viewport

At this point you should be wondering why the circle scales and stretches with the window. Keep in mind we’ve constrained our Metal view to our window, so resizing the window stretches our “normalized” 2D coordinate space. If you remember, in the “Creating the Vertex Points For Our Circle” section we saw that the normalized coordinate space is mapped to our viewport (an MTLViewport).

There’s 2 ways to handle this:

  1. Constrain the MTKView such that its width == height (either a ratio or a hardcoded value)
  2. Set the viewport on the renderEncoder in the draw function
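As a starting point for the second option, the sketch below computes the largest centered square that fits the drawable. The squareViewport(fitting:) function is a hypothetical helper; it is one possible approach, not the one from the original project.

```swift
import Metal
import CoreGraphics

/// Largest centered square viewport that fits in a drawable of the given size (in pixels).
func squareViewport(fitting drawableSize: CGSize) -> MTLViewport {
    let width = Double(drawableSize.width)
    let height = Double(drawableSize.height)
    let side = min(width, height)

    // Center the square along the longer axis so the circle stays in the middle.
    return MTLViewport(originX: (width - side) / 2,
                       originY: (height - side) / 2,
                       width: side,
                       height: side,
                       znear: 0,
                       zfar: 1)
}
```

In the draw function the result would be passed to renderEncoder.setViewport(_:) using the view’s drawableSize, so the [-1.0, 1.0] range maps to a square and the circle keeps its shape.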

Which leads perfectly into the last section of this tutorial :)

Where To Go From Here

We’ve just created our very first circle in metal! We learned how to use the basics of metal (setting up our rendering pipeline), use a shading language (The Metal Shading Language), how a GPU draws, and drawing our first primitives to make a circle!

The next steps I would suggest are:

  1. Passing more fields into the vertexArray in the metal function. Think back to when we chose to represent our vertices using only one field. Try passing in the vertices as a struct with a color field as well.
  2. Pass in buffer data to the fragment shader function.
  3. Drawing more shapes in one render pass.
  4. Setting the viewport area in the drawableSizeWillChange delegate method of the MTKView and making the view redraw itself.

I hope you’ve enjoyed this not-so-brief introduction into metal :). The complete project can be found on my GitHub page here. I’m planning an extension to this (more metal shaders) and signal processing using the Accelerate framework!

If this helped you out feel free to leave a clap and a star!

Alex Barbulescu

Written by

Creating experiences in iOS | alexs.ca
