What’s Metal Shading Language (MSL)?
Code that runs in parallel on the GPU is written in Metal’s own shading language, the Metal Shading Language (MSL).
The Metal shading language is a unified language that allows tighter integration between the graphics and compute programs.
MSL Filename Extension
The MSL filename extension is .metal. You can create a new Metal file using the Metal File template in Xcode.
Using the .metal extension allows Xcode to recognize MSL files in your project, automatically build a default library at build time, and help you profile and debug source code with specialized Metal tools.
The built default library (default.metallib) is added to the application bundle. You can get the default library as follows:
let library = device.makeDefaultLibrary()

// makeDefaultLibrary() is equivalent to:
let filePath = Bundle.main.path(forResource: "default", ofType: "metallib")!
let library = try! device.makeLibrary(filepath: filePath)
Metal files can also be built manually from the command line (see “Building a Library with Metal’s Command-Line Tools” for details):
xcrun -sdk macosx metal -c MyLibrary.metal -o MyLibrary.air
xcrun -sdk macosx metallib MyLibrary.air -o MyLibrary.metallib
Metal Graphics Rendering Pipeline
A render pipeline processes drawing commands and writes data into a render pass’s targets. A render pipeline has many stages, some of which are programmed using shaders. The vertex function and the fragment function of the render pipeline must be written in MSL. The main stages, from primitives to the fragment function, are as follows:
- Primitives, which are basic shapes built from vertices such as points, lines, and triangles, are passed to the vertex shader as a group of vertices.
- The vertex shader performs calculations such as coordinate transformations. The transformed vertices are passed to the rasterizer.
- The rasterizer converts the transformed primitives into fragments and passes them to the fragment shader.
- The fragment shader determines the color of each pixel.
Vertex Function and Fragment Function are described in detail in the Function section.
Metal Coordinate Systems
Metal defines several standard coordinate systems to represent transformed graphics data at different stages along the rendering pipeline.
Clip-space coordinates
A vertex shader generates positions in clip-space coordinates. A 3D point in clip-space coordinates is specified by a 4D homogeneous vector (x, y, z, w).
Metal divides the x, y, and z values by w to convert clip-space coordinates into normalized device coordinates. This step is called perspective division. The following equality defines the relationship between normalized device coordinates and clip coordinates:
$$(x_{ndc},\ y_{ndc},\ z_{ndc}) = \left(\frac{x_c}{w_c},\ \frac{y_c}{w_c},\ \frac{z_c}{w_c}\right)$$
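Perspective division can be sketched on the CPU in plain C++. This is only an illustration of the arithmetic: on the GPU the division is performed by fixed-function hardware after the vertex function returns. The `ClipPos` and `NdcPos` structs and the `clipToNdc` function are hypothetical names introduced for this example, standing in for MSL’s `float4` and `float3`.

```cpp
#include <cassert>

// Hypothetical stand-ins for MSL's float4 / float3 vector types.
struct ClipPos { float x, y, z, w; };
struct NdcPos  { float x, y, z; };

// Perspective division: clip-space position -> normalized device coordinates.
// Each of x, y, and z is divided by the homogeneous w component.
NdcPos clipToNdc(ClipPos clip) {
    assert(clip.w != 0.0f); // the division is undefined for w == 0
    return NdcPos{clip.x / clip.w, clip.y / clip.w, clip.z / clip.w};
}
```

For example, the clip-space position (2, −4, 1, 4) maps to the NDC position (0.5, −1, 0.25).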
Normalized device coordinates (NDC)
Normalized device coordinates (x, y, z) use a left-handed coordinate system and map to positions in the viewport. The x and y values range from −1 to 1, and the z value ranges from 0 to 1. Positive z values point away from the camera.
Viewport coordinates
The rasterizer stage transforms NDC into viewport coordinates. A viewport is an area displayed on the screen. The (x, y) coordinates in this space are measured in pixels, with the origin in the top-left corner of the viewport and positive values going to the right and down.
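The x/y part of this transform can be sketched in plain C++ as follows. This is a minimal CPU-side illustration that ignores depth; the `ViewportRect` and `PixelPos` structs and the `ndcToViewport` function are hypothetical names, not part of the Metal API. Because NDC y points up while viewport y points down, the y axis is flipped.

```cpp
#include <cassert>

// Hypothetical viewport rectangle in pixels; the origin is its top-left corner.
struct ViewportRect { float originX, originY, width, height; };
struct PixelPos { float x, y; };

// Map NDC x/y in [-1, 1] to pixel coordinates inside the viewport.
// x grows to the right in both spaces; y is flipped (up in NDC, down in pixels).
PixelPos ndcToViewport(float ndcX, float ndcY, ViewportRect vp) {
    assert(vp.width > 0.0f && vp.height > 0.0f);
    float px = vp.originX + (ndcX + 1.0f) * 0.5f * vp.width;
    float py = vp.originY + (1.0f - ndcY) * 0.5f * vp.height;
    return PixelPos{px, py};
}
```

With an 800×600 viewport at the origin, NDC (−1, 1) lands on the top-left pixel corner (0, 0) and NDC (0, 0) lands on the center (400, 300).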
Texture coordinates
Texture coordinates are floating-point positions that map locations on a texture image to locations on a geometric surface. They are represented as 2D or 3D vectors. Texture coordinates can also be specified as normalized texture coordinates. For 2D textures, normalized texture coordinates are values from 0.0 to 1.0 in both the x and y directions.
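To make the normalized range concrete, here is a small C++ sketch that maps a normalized coordinate to the index of the texel containing it. It assumes one simple policy (pick the containing texel, clamp to the edge); actual lookups on the GPU depend on how the sampler is configured. `TexelIndex` and `normalizedToTexel` are hypothetical names used only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical integer texel position inside a 2D texture.
struct TexelIndex { int x, y; };

// Convert normalized coordinates in [0, 1] to the texel that contains them.
// Results are clamped so that u == 1.0 or v == 1.0 maps to the last texel.
TexelIndex normalizedToTexel(float u, float v, int width, int height) {
    assert(width > 0 && height > 0);
    int x = static_cast<int>(std::floor(u * static_cast<float>(width)));
    int y = static_cast<int>(std::floor(v * static_cast<float>(height)));
    x = std::min(std::max(x, 0), width - 1);
    y = std::min(std::max(y, 0), height - 1);
    return TexelIndex{x, y};
}
```

On a 256×128 texture, (0.5, 0.5) maps to texel (128, 64) and (1.0, 1.0) is clamped to the last texel, (255, 127).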
Function
Metal supports the following function attributes that specify how to use a function: vertex, fragment, and kernel. These function attributes are used at the start of a function, before its return type.
vertex void my_vertex_func(…) {…}
fragment void my_fragment_func(…) {…}
kernel void my_kernel(…) {…}
graphics function
- vertex: Metal executes a vertex function for each vertex in the vertex stream and generates per-vertex output.
- fragment: Metal executes a fragment function for each fragment in the fragment stream, together with its associated data, and generates per-fragment output.
compute function
- kernel: A compute function, called a “kernel”, is a data-parallel function that is executed over a 1-, 2-, or 3-dimensional grid.
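The idea of running one function invocation per grid position can be emulated on the CPU. The C++ sketch below is not MSL and not how the GPU schedules work: the GPU runs the invocations in parallel and in no guaranteed order, while this loop runs them one after another. `GridPos`, `doubleElement`, and `dispatch2D` are hypothetical names; `GridPos` plays the role of the thread’s position in the grid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a thread's 2D position in the grid.
struct GridPos { unsigned x, y; };

// "Kernel" body: each invocation handles exactly one grid position.
// Here it doubles one element of a row-major 2D array.
void doubleElement(std::vector<float>& data, unsigned width, GridPos gid) {
    data[static_cast<std::size_t>(gid.y) * width + gid.x] *= 2.0f;
}

// CPU emulation of dispatching the kernel over a width x height grid.
void dispatch2D(std::vector<float>& data, unsigned width, unsigned height) {
    assert(data.size() == static_cast<std::size_t>(width) * height);
    for (unsigned y = 0; y < height; ++y) {
        for (unsigned x = 0; x < width; ++x) {
            doubleElement(data, width, GridPos{x, y});
        }
    }
}
```

Because every invocation touches only its own element, the result does not depend on the order in which grid positions are visited, which is what makes this kind of function safe to run data-parallel.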
Host Name Attribute
Starting from Metal 2.2, the [[host_name(name)]] attribute may be used with vertex, fragment, and kernel functions to override the default name of the function, which is the name the Metal API uses to look it up. Two distinct functions cannot have the same host name, or the compiler raises a compile-time error.
[[host_name("foo")]] kernel void foo() {} // Metal API name is foo
[[host_name("foo2")]] kernel void baz() {} // Metal API name is foo2 instead of the default, baz
Templated functions
Starting from Metal 2.2, it is possible to define C++-like templates for vertex, fragment, and kernel functions. Since these functions cannot be called directly from shader code, users must explicitly instantiate the template to force the compiler to emit code for a given specialization.
template<typename T>
kernel void bar(device T *x) { … }

// Explicit instantiation of `bar` with [T = int]
template kernel void bar(device int *);
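The same mechanism exists in standard C++, where it is called an explicit instantiation definition. The sketch below is plain C++ rather than MSL, so it has no `kernel` keyword or address-space qualifiers; `fillWith` is a hypothetical function used only to show the syntax.

```cpp
#include <cassert>
#include <cstddef>

// A function template: the compiler emits no code for it until it is instantiated.
template <typename T>
void fillWith(T* out, std::size_t count, T value) {
    assert(out != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = value;
    }
}

// Explicit instantiation definitions: force code generation for T = int and
// T = float, even if nothing in this translation unit calls them.
template void fillWith<int>(int*, std::size_t, int);
template void fillWith<float>(float*, std::size_t, float);
```

In MSL the explicit instantiation matters more than in ordinary C++, because an entry-point function is never called from other shader code, so nothing else would trigger an implicit instantiation.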
Attribute Qualifier for Arguments and Variables
The attribute qualifiers for function arguments and variables are as follows:
- [[position]]: The type is float4 and indicates coordinates (x, y, z, w).
- [[vertex_id]]: ushort or uint; indicates the vertex index.
- [[stage_in]]: Marks the per-vertex or per-fragment input of a function. As an argument of the fragment shader, it holds the fragment data, that is, the structure used to draw one pixel of the display.
- [[buffer(index)]]: Specifies the buffer location for a function argument. A vertex function can read per-vertex inputs by indexing into one or more buffers passed as arguments, using the vertex and instance IDs.
- [[texture(index)]]: Textures (including texture buffers).
- [[sampler(index)]]: Samplers that define how to access texture data.
- [[thread_position_in_grid]]: The position of the thread in the grid.
Address Space Qualifiers
Address space qualifiers specify the memory region in which a variable or argument is allocated. They must be specified for pointer-type and reference-type arguments and variables.
device and constant are typically used for buffer arguments of graphics functions; they are also valid in compute functions.
- device: read/write
- constant: read-only
threadgroup and thread are typically used in compute functions; thread is also the address space of ordinary local variables in any function.
- threadgroup: Shared by all threads in a threadgroup.
- thread: Private to a single thread; can’t be referenced from other threads.