Math for Transforming 3D Geometry
Whether you’re writing your own game engine, using Unity or Unreal, or you’re an artist making 3D graphics, a basic understanding of the kinds of math computers use to get graphics on the screen is a huge asset. This article introduces vectors, complex numbers, Euler angles, quaternions, and matrices, and how they are used to move, rotate, and scale 3D models.
The positions of things in graphics are usually represented by vectors. The first number in a vector is called the x-coordinate and the second number is called the y-coordinate. Each number represents a dimension; a 3-dimensional vector has an additional coordinate called z.
Here are two different vectors, a and b. a’s x coordinate is 2 and its y coordinate is 3. Since these vectors have two numbers each, they represent positions in 2-dimensional space.
Vectors can be added together or subtracted. a’s x coordinate is added to b’s x coordinate, and a’s y coordinate is added to b’s y coordinate. Likewise for subtraction.
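As a minimal sketch in Python, assuming we store a 2D vector as a plain tuple, addition and subtraction look like this. The helper names add and sub are just for illustration, and b's value is made up for the example, since only a = (2, 3) is given above.

```python
def add(a, b):
    """Add two vectors coordinate by coordinate."""
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    """Subtract b from a coordinate by coordinate."""
    return tuple(x - y for x, y in zip(a, b))

a = (2, 3)   # from the text: x = 2, y = 3
b = (4, 1)   # assumed value, for illustration only

print(add(a, b))  # (6, 4)
print(sub(a, b))  # (-2, 2)
```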
We can think of a 2D vector as an offset from the origin (0,0) along the x and y axes. Addition and subtraction move one vector by another vector.
When talking about vectors, we call a regular real number a scalar. In this example, a scalar, s, is equal to 5. Multiplying a vector by a scalar multiplies each coordinate by the scalar. However, there is no straightforward multiplication of one vector with another. We won’t need the dot product or cross product in this tutorial, so we won’t cover them here.
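Here is a small Python sketch of scalar multiplication, again assuming vectors are tuples. The function name scale is hypothetical, and the vector (2, 3) is reused from earlier purely as an example.

```python
def scale(v, s):
    """Multiply every coordinate of vector v by the scalar s."""
    return tuple(s * c for c in v)

s = 5
print(scale((2, 3), s))  # (10, 15)
```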
The length of a vector is its distance from the origin. If c is a vector, the length of c is notated by |c|. Suppose the units of our space are measured in inches. You could find the length of a vector with a ruler by measuring the distance from the origin to the point represented by the vector. Mathematically, the length is found using the Pythagorean theorem.
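The Pythagorean calculation can be sketched in a few lines of Python. The function name length is an assumption for this example. The same formula works for 2D and 3D vectors: square each coordinate, add them up, and take the square root.

```python
import math

def length(v):
    """Length of a vector: square root of the sum of squared coordinates."""
    return math.sqrt(sum(c * c for c in v))

print(length((3, 4)))     # 5.0
print(length((1, 2, 2)))  # 3.0
```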
A unit vector is a vector with a length of 1. A vector may represent a position, but it also represents a direction. In this example, the vectors m, n, o, p, and q are unit vectors and therefore lie on the unit circle, each representing an angle.
Most 3D graphics are made up of triangles. A triangle is defined by three vertices, and a vertex’s position is simply a vector.
This is Suzanne, a simple example model that comes with Blender. By putting together many triangles that share vertices, we can create a 3D model.
In order to transform 3D graphics, we need to transform triangles. We do this by transforming every vertex that makes up the triangle. If we apply the same transform to every vertex, we can move, scale, or rotate the triangle without deforming it. To move around models in a 3D world we need to apply a transformation to every triangle that makes up the model.
When an artist creates a model in a program like Blender, the model is in model space. Character models are usually made with the origin of this space at the character’s feet. This way, if we put the origin point on the ground, the character stands on the ground. The origin here works as a pivot point: everything happens “around” the origin.
The first step of transforming the model is to scale it if necessary. For example, the character may have been modeled too big relative to the scene and we would like to scale it down. Second is rotation: what direction is the player looking in? The last step is moving the model, known as translation. We need to move the model from being near the origin to where in the world we actually want it. A game does these transformations on every object in the scene for every single frame.
So how can we do these transformations? We’ll start by looking at translation. Since the positions of vertices are vectors, we can use vector addition and subtraction to move them. Given an offset vector, such as the player’s position in the world, we can move every vertex on the model by this offset in order to move the model.
Scaling is done by multiplying the vertex’s position by a scalar. Multiplying by a number above 1 makes the triangle bigger, a number between 0 and 1 makes it smaller, and multiplying by 1 clearly has no effect on the triangle. Notice how the triangles scale around the origin.
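To make translation and scaling concrete, here is a rough Python sketch that applies each one to every vertex of a triangle. The triangle, the offset, and the helper names are all invented for the example. Notice that the vertex sitting on the origin does not move when scaled, which is what scaling "around the origin" means.

```python
def scale_vertex(v, s):
    """Scale a vertex by multiplying each coordinate by s."""
    return tuple(s * c for c in v)

def translate_vertex(v, offset):
    """Move a vertex by adding an offset vector."""
    return tuple(c + o for c, o in zip(v, offset))

triangle = [(0, 0), (2, 0), (0, 2)]   # hypothetical triangle

bigger = [scale_vertex(v, 2) for v in triangle]
moved = [translate_vertex(v, (5, 8)) for v in triangle]

print(bigger)  # [(0, 0), (4, 0), (0, 4)]
print(moved)   # [(5, 8), (7, 8), (5, 10)]
```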
So we have translation and scaling, but what about rotation? Unfortunately there’s no simple vector math for rotating a point around the origin. The rest of this tutorial will be on rotation.
When we rotate a vertex, we want to spin it around the origin. It’s important that if the vertex was 6 inches away from the origin before being rotated, it’s still 6 inches away afterwards. If we rotate every vertex on the model, we’ll have rotated the model.
Before we get to 3D rotation, we’ll first discuss complex numbers. Complex numbers have a real part and an imaginary part. The imaginary part is multiplied by i.
We can think of a complex number like a 2D vector. The real part corresponds to the x coordinate and the imaginary part corresponds to the y coordinate. Most of the time you can think of i as just meaning “the vertical axis.”
We can add and subtract complex numbers just like we can vectors by applying straightforward algebra. This has the same effect of “moving” one complex number by another.
So what’s the deal with i? The only time you have to worry about it is when i is multiplied by another i. i times i (i squared) is equal to -1. That’s all there is to it; a single i means nothing to us besides indicating the vertical axis.
Unlike vectors, complex numbers can be multiplied. Here we see the formula for multiplying the complex numbers C and D.
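In case the formula is hard to picture, here is a sketch of it in Python, assuming each complex number is stored as a (real, imaginary) pair. The function name complex_mul and the sample values are made up for illustration. Python also has a built-in complex type (it writes i as j), which we can use to check the result.

```python
def complex_mul(p, q):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i * i = -1."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

C = (1, 2)   # 1 + 2i, assumed example value
D = (3, 4)   # 3 + 4i, assumed example value

print(complex_mul(C, D))              # (-5, 10)
print(complex(1, 2) * complex(3, 4))  # (-5+10j)
```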
There are two interesting properties of complex numbers. The first is that their lengths multiply. If the length of C is 4 and the length of D is 6, the length of C times D is 4*6 = 24. This visual shows multiplication of complex numbers with only a real part for simplicity.
The second property is that their angles add together. If A represents a point 30 degrees from the x axis and B represents a point 45 degrees from the x axis, A times B will be 75 degrees from the x axis with the lengths multiplied.
What we’re looking for here is the rotational property. We would actually like to get rid of the part where the lengths multiply. We can do this by making the complex numbers unit complex numbers. If A and B have a length of 1, A times B will also have a length of 1. This way we will only get the effect of adding their angles.
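We can check both properties with Python's standard cmath module. This is only a sketch: the helper names from_polar and to_polar are hypothetical, and the angles given to C and D are assumed, since the text only states their lengths.

```python
import cmath
import math

def from_polar(length, degrees):
    """Build a complex number from a length and an angle in degrees."""
    return cmath.rect(length, math.radians(degrees))

def to_polar(z):
    """Return (length, angle in degrees) of a complex number."""
    return abs(z), math.degrees(cmath.phase(z))

# Lengths multiply and angles add.
C = from_polar(4, 30)
D = from_polar(6, 45)
length, angle = to_polar(C * D)
print(round(length, 6), round(angle, 6))  # 24.0 75.0

# With unit complex numbers, only the angles change.
A = from_polar(1, 30)
B = from_polar(1, 45)
length, angle = to_polar(A * B)
print(round(length, 6), round(angle, 6))  # 1.0 75.0
```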
This is actually how we can derive the angle addition formula you may have used in trigonometry.
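Here is a sketch of that derivation. It relies on the fact that a unit complex number at angle θ can be written as cos θ + i sin θ. The symbols α and β are just names chosen here for the two angles.

```latex
\begin{aligned}
(\cos\alpha + i\sin\alpha)(\cos\beta + i\sin\beta)
  &= \cos\alpha\cos\beta + i\cos\alpha\sin\beta + i\sin\alpha\cos\beta + i^2\sin\alpha\sin\beta \\
  &= (\cos\alpha\cos\beta - \sin\alpha\sin\beta) + i\,(\sin\alpha\cos\beta + \cos\alpha\sin\beta)
\end{aligned}
```

Since the angles add, the product is also the unit complex number at angle α + β, which is cos(α + β) + i sin(α + β). Matching the real and imaginary parts gives cos(α + β) = cos α cos β − sin α sin β and sin(α + β) = sin α cos β + cos α sin β.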
Let’s say P is a vertex we would like to rotate around the origin and r is a unit complex number that represents a pure rotation. Multiplying P with r will only rotate P without changing its length.
Actually doing the multiplication is mostly straightforward algebra. Put each complex number in parentheses and use the distributive property. Notice how we end up with a term that’s multiplied by i twice. Replace the two i’s by -1, then combine like terms. We end up with another complex number.
So far we have looked at using complex numbers for rotation in 2D space. There are a variety of practical ways to do rotations in 3D space, and we will first look at Euler rotations. In this form we break up a rotation into rotations around the x, y, and z axes. Think of the axis piercing the model. The model spins around this axis like a wheel. By combining all three, we can rotate the model in any possible way.
When a 3D model rotates around an axis, the vertices are rotating on a 2D plane. The coordinate corresponding to the axis it is rotating on is not changed, but the other two are. We can plug the two coordinates we want to change into a complex number and rotate by multiplying by a unit complex number. For example, to rotate around the x axis, we would set the real part of the complex number to y and the imaginary part to z.
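A rotation around the x axis could be sketched in Python like this. The function name rotate_x and the sample point are made up for the example, and the direction a positive angle turns is an assumption here, since it depends on the conventions of your engine.

```python
import cmath
import math

def rotate_x(v, degrees):
    """Rotate a 3D point around the x axis; the x coordinate is unchanged."""
    x, y, z = v
    r = cmath.rect(1, math.radians(degrees))  # unit complex number for the angle
    p = complex(y, z) * r                     # real part = y, imaginary part = z
    return (x, p.real, p.imag)

x, y, z = rotate_x((1.0, 1.0, 0.0), 90)
print(round(x, 6), round(y, 6), round(z, 6))  # 1.0 0.0 1.0
```

Rotating around the y or z axis works the same way, with a different pair of coordinates plugged into the complex number.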
There are some drawbacks to the Euler representation, however. One problem is that we can only rotate around the x, y, or z axis, making it difficult to smoothly rotate around any other axis.
Quaternions are another way of doing rotations in 3D. A quaternion allows us to rotate around any arbitrary axis, not just x, y, or z. It’s based on complex numbers and has two additional imaginary numbers, j and k. w is the real part, and xi, yj, and zk are called the vector part. Like i squared, j squared and k squared equal -1, but i is not equal to j or k. With 4 terms and some confusing rules, deriving quaternion multiplication gets pretty complicated, so we won’t cover it here. Check out the Wikipedia article for more information if you want to give it a shot yourself.
Building a quaternion is pretty easy. We’ll need to know the axis we’ll want to rotate around and the angle indicating how much we want to rotate. The axis is represented by a unit vector. This way, the axis can be any combination of x, y, and z as long as the length is 1. We multiply all three numbers in the vector by the sine of half the angle. w is just the cosine of half the angle.
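The recipe above translates almost directly into code. This Python sketch stores a quaternion as a (w, x, y, z) tuple, which is just one possible layout. The function name quat_from_axis_angle is hypothetical, and the axis is assumed to already be a unit vector.

```python
import math

def quat_from_axis_angle(axis, degrees):
    """Return (w, x, y, z) for a rotation of `degrees` around the unit vector `axis`."""
    half = math.radians(degrees) / 2
    s = math.sin(half)
    ax, ay, az = axis
    return (math.cos(half), ax * s, ay * s, az * s)

# 90 degrees around the z axis
q = quat_from_axis_angle((0, 0, 1), 90)
print([round(c, 4) for c in q])  # [0.7071, 0.0, 0.0, 0.7071]
```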
On the left half of this example, we have a reminder of complex number multiplication. A point p is multiplied by a unit complex number a or b in order to rotate it in a circle around the origin. It can also be multiplied by both a and b to apply the rotations of both. The right half shows the equivalent idea for quaternions. A 3D point v is multiplied by A or B to rotate it around the origin. (Strictly speaking, the point is treated as a quaternion with w set to 0 and is multiplied on both sides, by the quaternion and by its inverse.) We can think of v moving on the surface of a sphere. If we multiply A by B, we get a new quaternion that represents the rotations of both.
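For the curious, here is a rough Python sketch of the quaternion side, using the standard Hamilton product without deriving it. Quaternions are stored as (w, x, y, z) tuples, and the names qmul and rotate, along with the sample quaternions, are assumptions for this example. The point is rotated with the product q * v * conjugate(q), where v is treated as a quaternion with w = 0. For a unit quaternion the conjugate is the same as the inverse.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)

def rotate(q, v):
    """Rotate the 3D point v by the unit quaternion q using q * v * conjugate(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = qmul(qmul(q, (0.0, *v)), conj)
    return (rx, ry, rz)

h = math.sqrt(0.5)
A = (h, 0.0, 0.0, h)   # 90 degrees around the z axis
B = (h, h, 0.0, 0.0)   # 90 degrees around the x axis
v = (1.0, 0.0, 0.0)

print(rotate(A, v))            # approximately (0, 1, 0)
print(rotate(qmul(B, A), v))   # A first, then B: approximately (0, 0, 1)
```

Note that the order matters: with this convention, qmul(B, A) applies A's rotation first and B's second.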
Matrices are another of the common ways to represent rotations, as well as translation and scaling. We can multiply a matrix by a vector by thinking of the vector as a matrix with one column. We can then apply matrix multiplication.
A 2x2 matrix can represent 2D rotation. We can take a complex number and directly translate it into matrix form. When this matrix is multiplied with a vector, we will do the exact same math operations as the complex number would. Quaternions can be converted to a 3x3 matrix.
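Here is a sketch of the complex-number-to-matrix translation in Python, with matrices stored as lists of rows. The helper names are hypothetical. Multiplying (a + bi) by (x + yi) gives (ax - by) + (bx + ay)i, so the matrix rows are [a, -b] and [b, a].

```python
def complex_to_matrix(r):
    """2x2 matrix that does the same thing as multiplying by the complex number r."""
    return [[r.real, -r.imag],
            [r.imag,  r.real]]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

r = complex(0, 1)   # unit complex number for a 90 degree rotation
p = complex(3, 4)   # assumed example point (3, 4)

print(mat_vec(complex_to_matrix(r), [p.real, p.imag]))  # [-4.0, 3.0]
print(p * r)                                            # (-4+3j)
```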
Matrices can come in many sizes. A 3x3 matrix allows us to rotate a 3D vector. A 4x4 matrix is best for 3D graphics, though 4x3 matrices may be used if a bit more efficiency is needed.
The identity matrix contains 1s along the diagonal and 0s everywhere else. It has no effect on a vector when multiplied. Since the matrix is 3x3, the vector needs three coordinates. In this example we’re using a 2D vector, so we add a z coordinate and set it to 1.
A small modification to the identity matrix allows us to create a matrix that will scale a vector. By changing the 1s in the matrix to any other number, the vector’s coordinates are scaled by that number. The numbers do not all have to be the same; we could apply different scales to the x and y coordinates of vertices to squash and stretch triangles.
The purpose of using a 3x3 matrix for transforming a 2D vector is so that we can use the matrix for translation. In this example, we want to move the vertex by 5 on the x axis and 8 on the y axis. We take advantage of the 1 at the end of the vector, which is multiplied by the 5 and the 8.
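The identity, scale, and translation matrices can be tried out with a short Python sketch. It assumes matrices are lists of rows and that vectors are treated as columns, which is one common convention. The scale factors and the sample point are made up, while the translation of 5 and 8 comes from the example above.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

identity  = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
scale     = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]   # assumed scale factors 2 and 3
translate = [[1, 0, 5], [0, 1, 8], [0, 0, 1]]   # move by 5 on x and 8 on y

v = [2, 3, 1]   # the 2D point (2, 3) with z set to 1

print(mat_vec(identity, v))   # [2, 3, 1]
print(mat_vec(scale, v))      # [4, 9, 1]
print(mat_vec(translate, v))  # [7, 11, 1]
```

In the translation case, the 1 at the end of the vector is what picks up the 5 and the 8: the first row works out to 1*2 + 0*3 + 5*1 = 7.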
We now know how to represent scaling, rotation, and translation as matrices. By multiplying them with each other, we end up with a single matrix that represents all three. This one matrix can then be applied to a vector to do all of the transformations at once.
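Combining the three could look something like the following Python sketch, using 3x3 matrices on a 2D point to keep it short. The helper names and sample values are assumptions. With column vectors, the matrices are applied right to left, so the scale matrix goes on the right and the translation on the left.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

scale     = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]    # double the size
rotate90  = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotate 90 degrees
translate = [[1, 0, 5], [0, 1, 8], [0, 0, 1]]    # move by (5, 8)

# Applied right to left: scale first, then rotate, then translate.
combined = mat_mul(translate, mat_mul(rotate90, scale))

print(combined)                      # [[0, -2, 5], [2, 0, 8], [0, 0, 1]]
print(mat_vec(combined, [1, 0, 1]))  # [5, 10, 1]
```

Following the point (1, 0) by hand gives the same answer: scaling makes it (2, 0), rotating makes it (0, 2), and translating makes it (5, 10).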
The Graphics Card
So we have a model made up of triangles which are made up of vertices, and we have a matrix we want to use to transform the model and put it where it should be in the virtual world. This is the job of a graphics card, which applies the matrix to every vertex of the model.
One last question is how we decide which models actually get displayed on the screen.
Let’s say you want to take a picture of Mount Everest. What most people would do is get on a plane, fly to Nepal, and set up their camera in front of Mount Everest. But there’s another option: you could set your camera on a floating tripod, pick up everything in the universe besides your camera, and rotate the universe so that Mount Everest is in front of your camera. This is the method computer graphics use.
Everything inside the box shown here is what will be rendered to the screen. Our goal, then, is to get the things we want to see inside this box. Applications like the Unity Engine Editor and Blender give you a camera object to intuitively work with. The camera’s position and rotation can be represented by a matrix. But when it comes to implementing the math, rather than moving the camera, the camera stays in place and we move everything else in front of it. This is done by simply applying the inverse of the camera’s matrix to every model in the scene. If you want to take a picture 100 meters to the right of the origin, just move every object 100 meters to the left.
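As a simplified Python sketch of the idea, assume the camera has only been moved and not rotated. The inverse of a translation is just a translation in the opposite direction, so we can build it directly. The helper names and the sample vertex are made up, and a real engine would invert the camera's full matrix, rotation included.

```python
def translation(tx, ty, tz):
    """4x4 matrix that moves a 3D point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

camera_position = (100, 0, 0)   # camera is 100 meters to the right of the origin

# Inverse of the camera's translation: move everything the opposite way.
view = translation(*(-c for c in camera_position))

vertex = [100, 0, -10, 1]       # hypothetical vertex, with a 1 added at the end
print(mat_vec(view, vertex))    # [0, 0, -10, 1]
```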
Lastly, we collapse the z-axis to make the triangles 2D. They are then filled in with pixels (the other main responsibility of the graphics card) and the image is transferred to the screen.
Where to Learn More
Being comfortable with trigonometry is incredibly useful for anyone working with 3D graphics. Dave’s Short Trig Course is a great place to brush up.
For more about vector and matrix math, I recommend this detailed tutorial.
Getting into the math of quaternions is a little trickier. A good way to get an intuitive feel for them is to play with the w, x, y, and z values in the transform panel in Blender, or write some scripts to tweak them in your favorite engine.
I’m currently a student at Boise State University studying computer science. You can find some of my projects and other articles I’ve written on my website, mysterioussoftware.com.