The “Business card raytracer” in python, part 2

Now with animation!

Dmitrii (Dim) Borisevich
7 min read · Dec 29, 2021

Ten years ago, Fabien Sanglard wrote an article about the “business card raytracer” — a very short C++ code that generates a full raytraced image with advanced effects. I remade the code in python, and it worked comparably fast, although it wasn’t short anymore. Now, I could play with the code and tweak it. So I decided to take it one step further and create not just a picture but an entire animation.

In part 1, I explained the original code and my first porting attempt. In this part, I will describe how I made this animation. In the end, it is a single geometric formula, but getting to it is hard. So we will talk about the math used by the code. Namely, we will:

  • First, understand how the original code defined the location of the spheres in 3D space,
  • Then, use trigonometry to derive a formula for rotating the spheres in 3D with each frame,
  • Finally, patch the code to implement the rotation formula, and use OpenCV to show the results.

In the end, I will share a link to the code so you can play with it too!

Why did you do it, anyway?

Because it is fun to take a toy and disassemble it to see how it works. And now that I understand how it works, it is twice as fun to tweak the toy and change what it does.

In the previous episode

Let me tell you what we are dealing with. We have python code that renders an image of a logo made of spheres, with a floor and a sky in the background. The code uses ray casting (and dirty hacks).

See part 1 to read more about how the code works. Briefly, we cast rays from the “camera” through each pixel of a virtual 2D “screen”. Each ray either hits the floor, the sky, or a sphere. The code finds what is hit and renders a pixel of the floor or sky color if one of them is hit. If the ray hits a sphere, the ray is bounced, and the ray casting is continued until the sky or the floor is hit.

How does the code know when the ray hits a sphere?

The position of the spheres is fixed in the original code on a 2D plane. The code defines a variable G, which is a 2D byte array, describing a mask of where spheres must be located on some imaginary 2D plane:

Each number in G (for example, 247570) represents a horizontal line, bottom-up. When converted to binary (111100011100010010), it represents a sequence of ones (shown in green) and zeros (omitted). Spheres will be plotted on the flat 2D surface wherever a “1” is present. Notice how the ones in the code form the letters “aek”, as shown on the screenshot.
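As a quick sanity check, this decoding can be reproduced in a few lines of python. The row value 247570 is taken from the original code; everything else in this sketch is illustrative:

```python
# One row value from the mask G in the original code.
row = 247570

# Its binary form is the sequence of ones and zeros described above.
print(bin(row)[2:])  # 111100011100010010

# A sphere is plotted wherever a bit is 1; bit k (counted from the right)
# corresponds to column k on the 2D plane.
columns = [k for k in range(row.bit_length()) if (row >> k) & 1]
print(columns)  # [1, 4, 8, 9, 10, 14, 15, 16, 17]
```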

But our final image is definitely 3-dimensional. How does this 2D mask define the location of the spheres in our imaginary 3D space? Where is this 2D plane located?

After some investigation, I found the coordinates hardcoded inside the ‘tracer’ function, which casts a ray and checks whether it hits any sphere. Its code implements the ray–sphere intersection formula from here (see page 7; some geometric proficiency is required). I also included a PowerPoint in the GitHub repo explaining how it works, link at the bottom:

Here is how it works. First, the code checks every bit in mask G, iterating over indices k and j (lines 39–41). As mentioned above, it must equal “1” for a sphere to be plotted — or, in python terms, it must be True (line 41). If this bit is True, then there is a sphere on the 2D plane. The calculation in line 42 shows that this sphere will be placed at the 3D coordinates (k, 0, j+4). The sphere radius is always set to 1, and its square, which is also 1, is used later in line 44.
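A minimal sketch of this placement logic, using a hypothetical two-row stand-in for G rather than the real logo mask:

```python
# Sketch: enumerate sphere centers from a mask G, the way the tracer does.
# This G is a tiny stand-in, not the real logo data from the article.
G = [0b10010, 0b01101]

spheres = []
for j, row in enumerate(G):       # j indexes rows (bottom-up in the original)
    for k in range(19):           # k indexes bits, i.e. columns
        if (row >> k) & 1:        # the bit is 1 -> a sphere exists here
            # Sphere center at 3D coordinates (k, 0, j + 4); the radius
            # is always 1, so its square (also 1) is used in the hit test.
            spheres.append((k, 0.0, j + 4))

print(spheres)
# [(1, 0.0, 4), (4, 0.0, 4), (0, 0.0, 5), (2, 0.0, 5), (3, 0.0, 5)]
```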

How does this look in 3D?

As you can see, the Y-coordinate is 0 for all the spheres. This defines the 2D surface we were searching for. All the spheres are rendered on an imaginary 2D plane located at Y=0, while the X and Z axes define the column and the row of each sphere, respectively (starting at 0 for columns and at 4 for rows, increasing by 1 with each row/column):

Orientation of the axes in the original code. All the spheres lie on a flat 2D surface Y=0, while their X and Z positions are defined by bits in the 2D mask G. The line starting at (0,0,4) passes through the centers of the lowest row of spheres. The Z-axis passes through the centers of the first column of spheres.

Now we know which coordinates the spheres have in 3D space. But how do we alter these coordinates on each frame to have an animation?

Making animation

My plan to animate the generated image was to make it rotate about an axis placed roughly at the middle of the logo along the X-axis:

First, I introduced some boilerplate code that would generate not one image but 120 identical images.

Next, to generate each frame, I had to replace the static coordinates (k, 0, j+4) of each sphere (k, j) for each of the 120 frames with a dynamic 3D coordinate defined by the frame number. Let’s derive a formula for these coordinates!

The math behind the new coordinates

Let’s set a new coordinate system (X’, Y’, Z’). All axes are parallel to the original ones, but they are placed in such a way that Z’ is the rotation axis:

If we look top-down at the picture, this is what we would see on different frames:

Frame 0 is the original picture. Frame N is rotated by 3xN degrees.

Let’s look at a single sphere whose center is r units away from the rotation center on frame 0. For the frame N, we would have to change the X’ and Y’ coordinates of this sphere from (r, 0) to (r x cos(alpha), r x sin(alpha)). So, in the original coordinate system (X, Y, Z), the coordinates of this sphere on the frame N would be:

x = rotation center x + r x cos(alpha)
y = rotation center y + r x sin(alpha)

These are the coordinates we were looking for.
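As a sketch, the formula above translates into python like this. The 3-degrees-per-frame step matches the figure above; the function name and its defaults are my own:

```python
import math

DEG_PER_FRAME = 3  # frame N is rotated by 3 x N degrees (see the figure above)

def rotated_xy(r, frame, center_x=0.0, center_y=0.0):
    """X and Y of a sphere that sits r units from the rotation axis on frame 0."""
    alpha = math.radians(DEG_PER_FRAME * frame)
    x = center_x + r * math.cos(alpha)
    y = center_y + r * math.sin(alpha)
    return x, y

print(rotated_xy(2.0, 0))   # frame 0: (2.0, 0.0), the original position
print(rotated_xy(2.0, 30))  # frame 30: rotated by 90 degrees, approximately (0.0, 2.0)
```

The Z’ coordinate is left untouched, since Z’ is the rotation axis itself.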

The code

Now we can calculate the new coordinates in our code:

Adding rotation

This code, as before, iterates over the 2D mask G (lines 47–49). But now, we generate the 3D coordinates of each sphere dynamically. On lines 45–46, we define two of the three coordinates, and thus define a line in 3D space, which will be our rotation axis. Its y coordinate is 0, while its x coordinate points approximately to the middle of all the spheres on the X-axis. Then, for each sphere that is present, we find the aforementioned distance r on line 50. Finally, we calculate its position for the given frame using the sin and cos of the angle corresponding to that frame on lines 51–53.

Sidenote: You might notice a magical constant of 0.55 that appears and adjusts the Z-axis coordinate. It comes up because of other changes I have introduced behind the scenes to change the image’s scale, and you can ignore it.

The result

After waiting for about 6.5 minutes, I finally got my 120 different frames. All that was left was to add some boilerplate code to render them one after another. I used OpenCV for python and its cv2.imshow() method to render the images cyclically:

And it worked!

Finally, to wrap it up, I polished and annotated the code with docstrings, comments, and type hints. The resulting code is more than 18,500 characters. That would not fit on a business card, unfortunately. But hey, now you can actually read and understand the code, and you don’t have to be a professional C++ developer to do so!

What can I do with all of this?

You can find the code on my GitHub, download it, run it, and play with it. The code should be possible to follow even if you are a junior python developer. Though if you would like to understand the entire algorithm, more experience would help.

You can change the logo, generate an animation with your own initials, and have fun because you too understood how to disassemble a toy and change what it does.

You can tweak the code more and improve what it does even further. Like I did by adding animation capability. How about colored spheres, for example?

Or maybe you just enjoyed reading this post and learned something new about the math of 3D rendering.

Finally, there is also a PowerPoint presentation explaining how the ‘tracer’ code works if you are ready to go through a challenging sketch of the proof.

Connect on Twitter (siberianpython) for more tech threads and posts, or connect on LinkedIn.
