Would you like some Shaders?

Alexey Tukalo
Published in Arction Ltd
5 min read · Sep 8, 2017


Shaders are an essential part of our workflow at Arction. We use them to achieve outstanding performance by delegating computations to the graphics card. They also let us implement custom visualization algorithms at a close-to-hardware level.

Paletted Wire-frame rendering of 3D MeshModel with LightningChart

A GPU is an example of extremely parallel hardware. Using its power efficiently requires parallel data computation, and a computation pipeline is a very good fit for this scenario. A pipeline can be used when parallel computation is applied to a specific class of problems: the problem is generalized and broken into steps, each with a well-defined set of valid inputs and outputs. In such pipelines, data travels from stage to stage sequentially. This rigid data flow makes it possible to parallelize the computation automatically and efficiently.

Comparison of CPU and GPU cores

A computation pipeline has programmable and unprogrammable parts. Programmable steps are represented by a kind of pure function which has access only to an isolated part of the entire dataset, provided as its input. In addition, they can read a set of immutable objects supplied at the initialization step. Unprogrammable stages implement the logic which is generic for the pipeline. For example, they manage data distribution across computational units.

Example of computational pipeline for parallel applications
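The split between programmable and unprogrammable parts can be sketched in plain Python. This is only an illustration of the idea, not shader code: `run_stage`, `run_pipeline`, `scale` and `offset` are hypothetical names, and a thread pool stands in for the computational units (Python threads mainly show the structure, they do not give GPU-like speed).

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def run_stage(kernel, items, constants, workers=4):
    """Unprogrammable part (hypothetical): distributes items across workers.

    Each kernel call sees exactly one item plus a read-only view of the
    constants supplied at initialization.
    """
    frozen = MappingProxyType(dict(constants))  # kernels cannot modify it
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() returns results in the same order as the inputs
        return list(pool.map(lambda item: kernel(item, frozen), items))


def scale(value, constants):
    """Programmable part: a pure function of one item and the constants."""
    return value * constants["factor"]


def offset(value, constants):
    """Another programmable step, executed after `scale`."""
    return value + constants["bias"]


def run_pipeline(items, stages, constants):
    """Data travels from stage to stage sequentially."""
    for kernel in stages:
        items = run_stage(kernel, items, constants)
    return items


result = run_pipeline([1, 2, 3, 4], [scale, offset], {"factor": 10, "bias": 1})
print(result)  # [11, 21, 31, 41]
```

Because `scale` and `offset` never touch shared state, the fixed part is free to decide how many workers to use and which item goes where.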

The pipeline approach is very common nowadays. Pipelines built from chained higher-order functions are used in a number of tools, such as Apache Spark, Apache Storm, Apache Kafka (there are too many wigwams in the village), Akka Streams, ReactiveX and many others. They use pipelining for a wide range of tasks, from asynchronous data streams and distributed event-based systems up to highly scalable, fault-tolerant real-time data processing systems. Apache Hadoop (MapReduce) uses it to perform Big Data analysis on computer clusters. It breaks the computation into three steps: Map (sorting, filtering), Shuffle (automatic redistribution of the data set according to the Map results) and Reduce (a summary operation). Pipeline-based systems have some design limitations which make them a very good abstraction for encapsulating concurrent or parallel operations. The Graphics Pipeline is an application of this concept to GPU-accelerated computer graphics.

Example of computational pipeline for concurrent applications
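The Map, Shuffle and Reduce steps mentioned above can be sketched as a tiny single-process word count in Python. It does not use Hadoop or any of its APIs; the function names are hypothetical and only mirror the three steps.

```python
from collections import defaultdict


def map_step(line):
    """Map: emit a (key, value) pair for every word of one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_step(pairs):
    """Shuffle: group values by key so every key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: summarize all values collected for one key."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_step(line)]
    groups = shuffle_step(pairs)
    return dict(reduce_step(key, values) for key, values in groups.items())


print(word_count(["vertex pixel vertex", "Pixel geometry"]))
# {'vertex': 2, 'pixel': 2, 'geometry': 1}
```

Every `map_step` call depends only on its own line, and every `reduce_step` call only on its own key, which is what lets a real cluster run them on different machines.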

Graphics Pipeline

Every graphics library has its own implementation of the Graphics Pipeline. Their common feature is the fixed order of computational stages. The unprogrammable parts of the pipeline are called fixed functions. The programmable parts are called Shaders. At Arction, we use DirectX 9 and 11, so some of the terminology used later in the article may be specific to their implementation of the Graphics Pipeline.

The computational pipeline of a modern graphics library contains four types of Shaders: Vertex, Tessellation, Geometry and Pixel (Fragment). Shaders are usually written in specially designed C-like languages with a very simple syntax, such as HLSL for DirectX. The main aim of these languages is a clear and efficient declaration of the computations needed for realistic 3D visualization.

The Graphics Pipeline receives a collection of vertices as input. For a successful invocation, it also needs options for the fixed functions, the code of the corresponding shaders and constant values for them. The output of the pipeline is displayed on the screen or returned as a texture.

DirectX 11 rendering pipeline

Types of Shaders

Vertex Shader

At this stage, vertices are processed one by one. It covers the operations which have to be performed on every vertex separately, and its results are typically just intermediate values needed by later stages. Rotating, scaling and positioning a vertex in the scene are good examples of such operations. Usually, this is done by multiplying the vertex’s coordinates by three matrices (commonly called world, view and projection), which hold information about the object’s placement in the scene, the camera position and the perspective of the world. Vertex Shaders are usually executed less often than Pixel Shaders, because there are typically far fewer vertices than pixels, so it is faster to do as much work as possible there. For that reason, they are sometimes also used for more sophisticated tasks such as lighting calculation. Such a decision has a positive impact on performance but can cause unpleasant artifacts in the final picture, which is why this approach is not a good idea for most 3D applications.
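The matrix multiplication described above can be sketched in plain Python. This is not HLSL and not our production code: the helper names are hypothetical, the row-vector convention and the left-handed projection with depth mapped to [0, 1] are assumptions chosen to resemble common DirectX practice, and the camera is reduced to a simple translation.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    """Multiply the row vector v = (x, y, z, w) by the 4x4 matrix m."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    """Translation matrix for row vectors (offsets live in the last row)."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [tx, ty, tz, 1]]


def perspective(fov_y, aspect, near, far):
    """Left-handed perspective projection, depth mapped to [0, 1]."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    q = far / (far - near)
    return [[x_scale, 0, 0, 0],
            [0, y_scale, 0, 0],
            [0, 0, q, 1],
            [0, 0, -near * q, 0]]


def vertex_shader(position, wvp):
    """Per-vertex work: model-space position -> clip-space position."""
    x, y, z = position
    return transform([x, y, z, 1.0], wvp)


world = translation(1.0, 0.0, 0.0)   # place the object in the scene
view = translation(0.0, 0.0, 5.0)    # camera sits 5 units back along -z
projection = perspective(math.pi / 2.0, 1.0, 1.0, 100.0)

# The combined matrix is a constant for the whole draw call,
# so it is computed once and reused for every vertex.
wvp = mat_mul(mat_mul(world, view), projection)

clip = vertex_shader((0.0, 0.0, 0.0), wvp)
ndc = [c / clip[3] for c in clip[:3]]  # perspective divide, done by fixed functions
print(clip, ndc)
```

For the vertex at the model origin, the clip-space `w` component equals its distance from the camera along z (5 in this setup), which is what the later perspective divide uses to make distant objects smaller.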

Tessellation

It is a relatively new feature of the Graphics Pipeline (it appeared in DirectX 11), used for smoothing geometrical shapes by dividing primitives into smaller parts. The Tessellation stage contains two shaders and one fixed function: the Hull Shader, the Tessellator and the Domain Shader. To achieve the desired result, all three should be used together. Tessellation is an advanced topic which requires deeper discussion, so it is not fully covered in this article.
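To give a rough feeling for the idea, here is a heavily simplified Python sketch. It is not how the DirectX stages actually work: `subdivide` merely stands in for the Tessellator, `push_to_unit_sphere` for a Domain Shader, and the `levels` argument for the decision a Hull Shader would make. All three names are hypothetical.

```python
import math


def midpoint(a, b):
    """Point halfway between two 3D points."""
    return tuple((a[i] + b[i]) / 2.0 for i in range(3))


def subdivide(triangle):
    """Stand-in for the Tessellator: split one triangle into four."""
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def push_to_unit_sphere(p):
    """Stand-in for a Domain Shader: move a vertex onto the target surface."""
    length = math.sqrt(sum(x * x for x in p))
    return tuple(x / length for x in p)


def tessellate(triangles, levels):
    """`levels` plays the role of the factor a Hull Shader would choose."""
    for _ in range(levels):
        triangles = [tuple(push_to_unit_sphere(v) for v in small)
                     for tri in triangles
                     for small in subdivide(tri)]
    return triangles


# One flat triangle spanning an octant of the unit sphere.
octant = [((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))]
smooth = tessellate(octant, levels=2)
print(len(smooth))  # 16
```

The flat triangle becomes 16 smaller ones whose vertices all lie on the sphere, so the shape looks rounder without the application sending more geometry.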

Geometry Shader

It is also quite a new stage of the rendering pipeline. It is not required for simple 3D applications, but it opens unique possibilities for advanced cases. In contrast to the Vertex Shader, it operates on complete geometry primitives represented by several vertices. In other words, it can modify, add and remove vertices in the pipeline, converting the input geometry into a different output geometry. As a result, a single vertex can be replaced by a line, a triangle or even a complex shape constructed from multiple polygons. This feature opens great possibilities for drawing things like grass, leaves, rain, sprays and other particle-based effects.
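A particle effect of this kind can be sketched in Python as follows. The functions are hypothetical and only imitate the behaviour on the CPU: every input point is expanded into a quad made of two triangles, and points below a threshold are dropped to show that the stage may also remove geometry.

```python
def point_to_quad(center, half_size):
    """Geometry-shader-like step: one input vertex -> two output triangles."""
    x, y, z = center
    h = half_size
    top_left = (x - h, y + h, z)
    top_right = (x + h, y + h, z)
    bottom_left = (x - h, y - h, z)
    bottom_right = (x + h, y - h, z)
    return [(top_left, top_right, bottom_left),
            (bottom_left, top_right, bottom_right)]


def expand_particles(points, half_size, min_height=0.0):
    """Emit a quad per particle; particles below min_height emit nothing."""
    triangles = []
    for point in points:
        if point[1] < min_height:
            continue  # the primitive is removed from the pipeline
        triangles.extend(point_to_quad(point, half_size))
    return triangles


particles = [(0.0, 1.0, 0.0), (2.0, -1.0, 0.0), (4.0, 3.0, 0.0)]
quads = expand_particles(particles, 0.5)
print(len(quads))  # 4
```

The application uploads only three points, while the pipeline ends up drawing four triangles; this is the saving that makes the stage attractive for particle systems.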

Pixel Shader

The Pixel (or Fragment) Shader is the last programmable part of the rendering pipeline. It is executed after rasterization of the geometry and receives specific information for each pixel of each polygon. These values are calculated by interpolating data from the corresponding vertices of the triangle which covers the area. It is the best place for accurate calculation of a pixel’s color. Usually, it is used for precise lighting calculation, but sometimes it is also used for image processing. Due to the large number of pixels on modern screens, the per-pixel part of the computation tends to be the most expensive one. In addition to color, the output of the stage can also contain the depth of the corresponding point of the triangle. The depth value is needed to assemble the final image correctly at the Output Merger stage.
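The interpolation and the per-pixel lighting can be sketched in Python like this. It is an illustration under simplifying assumptions: screen-space barycentric weights without perspective correction, and a basic diffuse (Lambert) term as the lighting model. All function names are hypothetical.

```python
import math


def barycentric(p, a, b, c):
    """Weights of the 2D point p relative to the triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate(weights, values):
    """Blend three per-vertex attributes (done by the rasterizer on a GPU)."""
    size = len(values[0])
    return tuple(sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(size))


def pixel_shader(normal, base_color, light_dir):
    """Per-pixel work: diffuse lighting from the interpolated inputs."""
    length = math.sqrt(sum(n * n for n in normal))
    unit = [n / length for n in normal]  # interpolation may change the length
    intensity = max(0.0, sum(n * l for n, l in zip(unit, light_dir)))
    return tuple(channel * intensity for channel in base_color)


# Screen-space triangle with a color and a normal stored at every vertex.
corners = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
normals = [(0.0, 0.0, 1.0)] * 3
light_dir = (0.0, math.sqrt(0.75), 0.5)  # unit vector pointing to the light

weights = barycentric((1.0, 1.0), *corners)
shaded = pixel_shader(interpolate(weights, normals),
                      interpolate(weights, colors),
                      light_dir)
print(weights, shaded)
```

For the pixel at (1, 1) the weights are 0.5, 0.25 and 0.25, so the pixel receives half of the first vertex's color and a quarter of each of the others before the lighting term darkens it.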

Summary

Shaders are subprograms which contain the part of an application delegated to the GPU. There are different types of shaders for different goals. All of them have certain design limitations which allow them to do their job efficiently. They are mainly used to accelerate real-time computer graphics by delegating the most computationally intensive operations to the graphics accelerator. Graphics cards are specifically designed to handle such operations in parallel. They can perform an enormous amount of computation, so using them for more general tasks is one of the main trends today. But this is another story…


Alexey Tukalo
Arction Ltd

Young language-agnostic software developer interested in functional programming, software design, web development and computer graphics.