INTRODUCTION TO 3D COMPUTER GRAPHICS

Adityachoubey
The ACM Manipal Blog
10 min read · Jul 16, 2021

3D computer graphics (CGI) are graphics that use a three-dimensional representation of geometric data, stored in the computer, to produce 2D images — a process known as "rendering." In other words, a 3D model is displayed visually as a two-dimensional image through 3D rendering.

FIGURE 1 : A RAYTRACED SCENE WITH AREA LIGHT.
FIGURE 2 : A UTAH TEAPOT RENDERED USING RASTERIZATION AND PERSPECTIVE PROJECTION.

RENDERING AND RENDERERS

Essentially, a 3D virtual world can be converted into a 2D image (or a series of 2D images) in two ways: light transport simulation and rasterization. Light transport simulation can itself be divided into several categories, but here we will focus on raytracing.

1. Raytracing — Some readers might already be familiar with this term if they have been following video games recently. Raytracing (or path-tracing) is a technique for generating physically accurate 2D representations of a 3D scene. The idea is very simple. Physicists have known the laws of reflection and refraction (among others) for a long time. They also know how humans perceive colour: when light scattered by an object hits the eye, the object's colour is perceived. Thus, all we have to do to produce a 2D image is throw a bunch of rays into the scene and see what hits the eye (or, in this scenario, the camera)! The only twist is that we trace the light rays backwards: rays shoot off from the eye (or camera) through each pixel into the scene and are tested for collision with every 3D object. If a ray hits something, it is scattered according to the properties of that object, and the procedure repeats until the desired precision is reached. The accuracy of the render depends on how many rays are traced into the world. The mathematics behind this technique is explained in detail in the section "Monte-Carlo Raytracing" below. Raytracing has been the backbone of almost every 3D animated film; Turner Whitted's seminal paper on recursive raytracing was published in June 1980. Video games, however, have adopted it only very recently. The reason is that while the technique gives a very physically accurate image, it is very slow: Figure 1 (original dimensions 800x800) took approximately 12 hours to produce, even though I implemented multithreading! This number can be brought down by using parallel computing on the GPU, but for a long time it was not possible to run raytracing in real time, something that is absolutely required for a practical implementation in video games.

*Raytracing in One Weekend

  • Here is my implementation of the above Raytracer in C++, with multithreaded rendering on the CPU: CPP-Raytracer
FIGURE 3 : HOW LIGHT IS TRACED IN A RAYTRACING APPLICATION.

2. Rasterization — This is the technique that has supported more than five generations of video games. Rasterization is the task of taking a vector image (an image defined as points and primitives in a Cartesian system) and converting it into a raster image (a grid of pixels); a minimal sketch of this step is shown after the resources below. To produce the vector image, though, we must first convert the scene from three-dimensional space into other spaces; to be more precise, the scene is converted to an "image space." This technique is much faster, as we do not have to simulate the behaviour of light accurately. For context, the teapot scene from Figure 2 was running in real time. The lighting of the scene is generated by techniques like Phong shading, Gouraud shading, Toon shading, etc. In the industry, rendering engines are built on top of what are called graphics APIs, which expose the GPU's capabilities; the major APIs are OpenGL, Vulkan, DirectX 11/12, and Apple's Metal. I have explained the mathematics of the transformation from 3D Cartesian space to a 2D image space in the section "Transforms and Projections". Since this topic is too complex to explain in full detail in a blog, I recommend the resources below, which cover it thoroughly with examples. This method is commonly used together with other concepts like the Z-buffer (depth buffer).

*Rasterization : A Practical Implementation

*Other Resources

  • Here is the 3D Graphics Engine and Game Engine I have been working on using Microsoft's Win32 API and DirectX11 API: Simple-Game-Engine
FIGURE 4 : CREATING A RASTER IMAGE FROM VECTOR IMAGES.
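To make the core idea concrete, here is a minimal, self-contained sketch of the rasterization step itself: testing which pixels are covered by a single 2D triangle using edge functions. This is illustrative code only; none of these names come from the engine linked above.

```cpp
#include <vector>
#include <algorithm>

struct Vec2 { float x, y; };

// Signed area of the parallelogram spanned by (b - a) and (p - a);
// positive when p lies to the left of the edge a -> b.
float edge(const Vec2& a, const Vec2& b, const Vec2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Marks every pixel of 'framebuffer' (width * height, row-major) whose centre
// lies inside the counter-clockwise triangle (v0, v1, v2).
void rasterize_triangle(const Vec2& v0, const Vec2& v1, const Vec2& v2,
                        std::vector<int>& framebuffer, int width, int height) {
    // Only scan the triangle's bounding box instead of the whole screen.
    int xmin = std::max(0, (int)std::min({v0.x, v1.x, v2.x}));
    int xmax = std::min(width - 1, (int)std::max({v0.x, v1.x, v2.x}));
    int ymin = std::max(0, (int)std::min({v0.y, v1.y, v2.y}));
    int ymax = std::min(height - 1, (int)std::max({v0.y, v1.y, v2.y}));

    for (int y = ymin; y <= ymax; ++y) {
        for (int x = xmin; x <= xmax; ++x) {
            Vec2 p{x + 0.5f, y + 0.5f};              // sample at the pixel centre
            bool inside = edge(v0, v1, p) >= 0 &&
                          edge(v1, v2, p) >= 0 &&
                          edge(v2, v0, p) >= 0;
            if (inside) framebuffer[y * width + x] = 1;
        }
    }
}
```

A real rasterizer interpolates depth and vertex attributes at the same time, but the inside/outside test above is the heart of the method.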

MONTE-CARLO RAYTRACING

The colours we perceive belong to the visible region of the electromagnetic spectrum, which is continuous. How do we represent a continuous spectrum in a computer? There are many answers; I will take the simplest one, RGB (Red-Green-Blue), which is how the pixels on our screens represent different shades. But how do we calculate these RGB values when all we have is three-dimensional objects? This is where the Monte-Carlo method of integration comes in.

1. Monte-Carlo method — The underlying concept is very intuitive. Say we have to find the average of a large amount of data, such as the average height of a population. It is almost impossible to measure the height of every person, and even harder to compute the average of so many numbers, so we settle for an approximation. The simplest way to do this is to use randomness: from the whole population we randomly select N data points, sum them, and use that sum to approximate the average. Selecting data randomly like this is also known as "sampling." Below is how the approximation is computed.

Approximation(Average(X))=(∑Xn)/N, where n goes from 1 to N.

To explain Monte-Carlo better, here is example code that estimates the area of a unit circle. It is a very impractical example, but it gets the job done. First we define a uniform probability distribution function (PDF) to generate random numbers. These numbers are used to sample random points inside a 1x1 square. If a point's distance `l` from the origin is less than 1, it falls inside the quarter circle, so a hit counter is incremented. The area of the quarter circle is then approximated as the fraction of samples that were hits (and the full circle's area is four times that). Increasing N improves the accuracy, but also hurts performance.
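Here is a minimal, self-contained sketch of that idea (illustrative code, not the original snippet from the post):

```cpp
#include <iostream>
#include <random>
#include <cmath>

int main() {
    const int N = 1'000'000;                                // number of samples
    std::mt19937 rng(std::random_device{}());
    std::uniform_real_distribution<double> pdf(0.0, 1.0);   // uniform PDF on [0, 1)

    int hits = 0;
    for (int i = 0; i < N; ++i) {
        double x = pdf(rng);                                // random point in the 1x1 square
        double y = pdf(rng);
        double l = std::sqrt(x * x + y * y);                // distance from the origin
        if (l < 1.0) ++hits;                                // inside the quarter circle
    }

    double quarter_area = static_cast<double>(hits) / N;    // fraction of hits ~ pi/4
    std::cout << "Area of the unit circle ~ " << 4.0 * quarter_area << '\n';
    return 0;
}
```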

2. Monte-Carlo Raytracing — How is this used to calculate the colour of each pixel? Using the concepts of Monte-Carlo integration and averaging, we compute an approximation of the colour at each pixel. We shoot rays (represented as r = O + td, where O is the origin of the ray, d is its direction, and the parameter t gives the distance of a point on the ray from the origin) from the camera's position through each pixel (the position of every pixel is calculated using simple math explained in my Raytracer). Each ray is tested for collision with all objects; when it hits one, the object's colour is fetched and the ray is processed further according to the object's material. Here is a code snippet from my raytracer: it is the core of the recursive function that is called for each ray (the function is named `ray_color`).
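The full function lives in the linked repository; the following is a simplified sketch in the style of "Ray Tracing in One Weekend," which my raytracer follows (the `ray`, `color`, `hittable`, and material types are assumed to be defined elsewhere in such a raytracer):

```cpp
// Returns the colour gathered along ray r after at most 'depth' scattering events.
color ray_color(const ray& r, const hittable& world, int depth) {
    if (depth <= 0)                                   // scattered too many times: no light gathered
        return color(0, 0, 0);

    hit_record rec;
    if (world.hit(r, 0.001, infinity, rec)) {         // test the ray against every object
        ray scattered;
        color attenuation;
        // Ask the object's material how the ray scatters and how much it is attenuated.
        if (rec.mat_ptr->scatter(r, rec, attenuation, scattered))
            return attenuation * ray_color(scattered, world, depth - 1);
        return color(0, 0, 0);
    }

    // The ray escaped the scene: return a simple sky gradient as background light.
    vec3 unit_direction = unit_vector(r.direction());
    double t = 0.5 * (unit_direction.y() + 1.0);
    return (1.0 - t) * color(1.0, 1.0, 1.0) + t * color(0.5, 0.7, 1.0);
}
```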

The next snippet calls `ray_color(…)` ns times for each pixel, denoted by (i, j); ns is the number of samples per pixel, which I call the sampling frequency. `ray_color` also takes a parameter called `depth`, which specifies the maximum number of times a ray can be scattered.
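A sketch of that per-pixel loop, again following the book's conventions (`random_double` and `write_color` are the book's helpers; the repository may name things differently):

```cpp
// For every pixel (i, j), average ns samples; u and v are the pixel's
// normalised coordinates, jittered slightly for anti-aliasing.
for (int j = image_height - 1; j >= 0; --j) {
    for (int i = 0; i < image_width; ++i) {
        color pixel_color(0, 0, 0);
        for (int s = 0; s < ns; ++s) {
            double u = (i + random_double()) / (image_width - 1);
            double v = (j + random_double()) / (image_height - 1);
            ray r = cam.get_ray(u, v);
            pixel_color += ray_color(r, world, max_depth);   // max_depth = scattering limit
        }
        write_color(std::cout, pixel_color, ns);   // divides by ns and gamma-corrects
    }
}
```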

Here is how the ray is generated in the function `cam.get_ray(float u, float v)`.
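A minimal version of that function looks like this (the member names follow "Ray Tracing in One Weekend"; the repository's camera class may differ):

```cpp
// Builds the ray that starts at the camera origin and passes through the
// point on the viewport addressed by the normalised coordinates (u, v).
ray camera::get_ray(float u, float v) const {
    return ray(origin,
               lower_left_corner + u * horizontal + v * vertical - origin);
}
```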

FIGURE 5 : CASTING RAYS THROUGH EACH PIXEL.

The next step in raytracing is the actual calculation of the scattered ray. To begin with, let's take metallic reflection. This is the easiest one to code, as we already know how vectors are reflected about a surface normal, shown below.

FIGURE 6 : SPECULAR/METALLIC REFLECTION
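As a small sketch (in the same RTiOW style as above, assuming the normal n has unit length), the law of reflection gives r = v - 2(v·n)n:

```cpp
// Mirror reflection of the incident vector v about the unit surface normal n.
vec3 reflect(const vec3& v, const vec3& n) {
    return v - 2 * dot(v, n) * n;
}
```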

The above function is called from the object's material's `scatter` function if the material is metallic.

The next type of scattering is observed in transparent/translucent objects. Here we also have to take reflectance into account, and reflectance varies with the angle of incidence. It can be approximated by Schlick's approximation, shown below.

FIGURE 7 : REFRACTION
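Sketches of both pieces, refraction via Snell's law and Schlick's approximation of reflectance, again in the book's style (helper names are assumptions, not copied from my repository):

```cpp
// Refraction by Snell's law: uv is the unit incident direction, n the unit
// normal, etai_over_etat the ratio of refractive indices.
vec3 refract(const vec3& uv, const vec3& n, double etai_over_etat) {
    double cos_theta = fmin(dot(-uv, n), 1.0);
    vec3 r_out_perp = etai_over_etat * (uv + cos_theta * n);
    vec3 r_out_parallel = -sqrt(fabs(1.0 - r_out_perp.length_squared())) * n;
    return r_out_perp + r_out_parallel;
}

// Schlick's approximation of reflectance as a function of the cosine of the
// angle of incidence and the refractive-index ratio.
double schlick(double cosine, double ref_idx) {
    double r0 = (1 - ref_idx) / (1 + ref_idx);
    r0 = r0 * r0;
    return r0 + (1 - r0) * pow(1 - cosine, 5);
}
```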

Both of these functions are used to decide the scattered ray in the `MAT_Dielectric` material's scatter function, shown below.
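A sketch of that scatter function (the `ir` refractive-index member and the exact signature are assumptions in the RTiOW style; only the class name `MAT_Dielectric` comes from my engine):

```cpp
// Refract when Snell's law allows it, otherwise reflect; Schlick's
// approximation decides how often rays reflect even when refraction is possible.
bool MAT_Dielectric::scatter(const ray& r_in, const hit_record& rec,
                             color& attenuation, ray& scattered) const {
    attenuation = color(1.0, 1.0, 1.0);                        // clear glass absorbs nothing
    double refraction_ratio = rec.front_face ? (1.0 / ir) : ir;

    vec3 unit_direction = unit_vector(r_in.direction());
    double cos_theta = fmin(dot(-unit_direction, rec.normal), 1.0);
    double sin_theta = sqrt(1.0 - cos_theta * cos_theta);

    bool cannot_refract = refraction_ratio * sin_theta > 1.0;  // total internal reflection
    vec3 direction;
    if (cannot_refract || schlick(cos_theta, refraction_ratio) > random_double())
        direction = reflect(unit_direction, rec.normal);
    else
        direction = refract(unit_direction, rec.normal, refraction_ratio);

    scattered = ray(rec.p, direction);
    return true;
}
```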

The last scatter function to be discussed is that of the `MAT_Diffuse` material. Objects with diffuse properties do not reflect perfectly like metals; imperfections on their surface prevent specular reflection, which we simulate by scattering the incident ray in a random direction, as sketched below.

FIGURE 8: DIFFUSE REFLECTION.
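A sketch of the diffuse scatter function (again RTiOW-style helper names such as `random_unit_vector` are assumptions; only the class name `MAT_Diffuse` is from my engine):

```cpp
// The incident ray is scattered in a random direction around the surface normal.
bool MAT_Diffuse::scatter(const ray& r_in, const hit_record& rec,
                          color& attenuation, ray& scattered) const {
    vec3 scatter_direction = rec.normal + random_unit_vector();

    // Guard against a degenerate direction when the random vector nearly
    // cancels the normal.
    if (scatter_direction.near_zero())
        scatter_direction = rec.normal;

    scattered = ray(rec.p, scatter_direction);
    attenuation = albedo;                          // the material's base colour
    return true;
}
```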

As we saw, different scatter functions change how the surface of an object is perceived and how it interacts with the environment. Other scatter functions can be written so that an object behaves like a volumetric body (for example, fog), but they are too complex for an introductory blog.

TRANSFORMS AND PROJECTIONS

I also wanted to give a short introduction to how transforms and projections work, because while the method above works and generates visually pleasing images, it is not the fastest. For practical purposes, we need a way to project arbitrary surfaces and objects from 3D Cartesian space onto a 2D space. How do we take a point, say P(x, y, z), and convert it to some point P'(x', y', 0)? (z is 0 here because we must project all points onto a plane that represents the viewport.) For this we use transforms, and projection transforms in particular.

FIGURE 9: ORTHOGRAPHIC PROJECTION Po
FIGURE 10: PERSPECTIVE PROJECTION Pp, TRANSFORMS A FRUSTUM TO A UNIT CUBE, THUS GIVING A REALISTIC PROJECTION OF OBJECTS

Essentially, transforms are matrices that are multiplied with a point, written in homogeneous coordinates as [x, y, z, w], to modify it. These modifications include translation, rotation, and scaling. Using these transforms, we have to get from what is called "world space" to what is called "image space"; the image space is shown in Figure 10. The matrices used for orthographic and perspective projection are shown below.
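The exact entries depend on the API convention (Direct3D maps depth to [0, 1], OpenGL to [-1, 1]); as a reference, one common OpenGL-style, column-vector form for a symmetric view volume is:

$$
P_o =
\begin{bmatrix}
\frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\
0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\
0 & 0 & \frac{-2}{f-n} & -\frac{f+n}{f-n} \\
0 & 0 & 0 & 1
\end{bmatrix}, \qquad
P_p =
\begin{bmatrix}
\frac{1}{a\,\tan(\theta/2)} & 0 & 0 & 0 \\
0 & \frac{1}{\tan(\theta/2)} & 0 & 0 \\
0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{bmatrix}
$$

where l, r, b, t, n, f are the left, right, bottom, top, near, and far planes of the view volume, θ is the vertical field of view, and a is the aspect ratio.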

By multiplying either of these projection matrices with each vertex, we get the image-space coordinates, which are then fed to the rest of the rendering pipeline. The Utah teapot in Figure 2 has been rendered with a perspective projection. The vertex shader code is shown below.
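The shader in the repository is longer; the following is a minimal HLSL sketch showing the same structure. The buffer, struct, and variable names here are illustrative rather than copied from Simple-Game-Engine, and depending on how the matrices are uploaded the argument order of `mul` may need to be swapped.

```hlsl
cbuffer TransformBuffer : register(b0)
{
    matrix world;       // object space -> world space
    matrix view;        // world space  -> view (camera) space
    matrix projection;  // view space   -> projected clip space
};

struct VS_INPUT
{
    float3 position : POSITION;
    float3 normal   : NORMAL;
};

struct VS_OUTPUT
{
    float4 position : SV_POSITION;
    float3 normal   : NORMAL;
};

VS_OUTPUT vsmain(VS_INPUT input)
{
    VS_OUTPUT output;
    float4 pos = float4(input.position, 1.0f);
    pos = mul(pos, world);        // object space -> world space
    pos = mul(pos, view);         // world space  -> view space
    pos = mul(pos, projection);   // view space   -> clip space
    output.position = pos;
    output.normal   = mul(float4(input.normal, 0.0f), world).xyz;
    return output;
}
```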

I won’t go into too much detail in this code, as it would require me to explain the rendering pipeline in it’s entirety. If we see `vsmain` we can get an idea how each vertex is processed. First the vertex is taken from its own local space{also called frame} known as object space, to the global world space. Then, the vertex is taken from the world space to an auxiliary space called “View Space.” View space or Camera Space, is the frame relative to the camera. This means the world is transformed such that the camera lies at the origin of the new space. This way, all objects are rendered from this point of view. Finally, the vertex is projected onto the near plane of the frustum, defined by the camera in its properties. This, obviously, is only an overview of how a scene can be rendered in real-time. GPUs have specialized systems in place for quick matrix multiplication, so this method works flawlessly. However, one has to also take in consideration that it is only half the story. This gives quick images but are unlit. To produce lit scenes, one has to implement shading techniques that act as quick hacks to approximate real lighting. One good model is Phong shading, which is shown below. A practical implementation of Phong model can be seen in the GitHub repository “Simple-Game-Engine”, linked below[FIGURE 12].

FIGURE 11 : PHONG SHADING
FIGURE 12 : A PRACTICAL IMPLEMENTATION OF PHONG SHADING.

For a good introduction to all of these algorithms and methods, one can refer to "Real-Time Rendering, 4th Edition" by Tomas Akenine-Möller, Eric Haines, and Naty Hoffman; for a quick introduction to shading models, see its Chapter 5, "Shading Basics," and Chapter 9, "Physically Based Shading."

THE SIGNIFICANCE OF THIS FIELD

The world of computer graphics is an interesting one. Unknown to most, research in computer graphics has been the backbone of many other fields, like computer vision, robotics, and augmented and virtual reality. It has given us many video games and blockbuster movies (like Avatar), and it is heavily used in medical imaging, the automobile industry, and other fields: flight simulators train pilots for extreme conditions, while surgical simulators let novice surgeons practise without endangering patients.

I hope this blog helped people, especially students, get some idea of computer-generated graphics. The concepts explained above are easy to start with, and they also show the importance of mathematics and physics.

BIBLIOGRAPHY

· "Ray Tracing in One Weekend", "Ray Tracing: The Next Week", and "Ray Tracing: The Rest of Your Life" by Peter Shirley.

· Scratchapixel.com

· "Real-Time Rendering, 4th Edition" by Tomas Akenine-Möller, Eric Haines, and Naty Hoffman.

· “Practical Rendering & Computation with Direct3D 11” by Jason Zink, Matt Pettineo, and Jack Hoxley.

· GitHub Repositories :

1. Raytracer in C++

2. Game Engine using DirectX11
