Coding a 3D ray-tracing graphics engine using Python and C++ (Part 2)

5 min readMar 17, 2023

Bouncing ball investigation

In Part 1, we built a basic but perfectly functional ray-tracing engine in Python.

In part 2, we have a number of goals:

Correct some inaccuracies
Add more shapes
Increase performance by 300x
Make VIDEO

Correcting exit rays

Previously, rays that hit nothing return a background colour. In reality, we should be calculating how much background light a ray “sees”. If a ray is pointed at the sky, for instance, it might return a bright, light blue. If it is pointed at the horizon it might return a darker blue.

Reflective and transparent objects appear much more realistic when exit rays collect light

New shapes

We prototyped using one primitive object type: spheres. But we left ourselves room to add other shapes by establishing a model in which functions that calculate intersection and containment are encapsulated by the object in question. This means we can easily add new objects and, thanks to polymorphism, we can expect everything else to continue working perfectly!

For instance, discs and pipes have very different intersection functions. Because each is an infinitely thin surface there can be no containment.

class Disc():
    def __init__(self, centre, radius, normal, material=Material(), colour=Colour(128, 128, 128), id=0):
        self.id = id
        self.centre = centre
        self.radius = radius
        self.radius2 = radius**2
        self.normal = normal.normalise()
        self.material = material
        self.colour = colour

    def describe(self):
        print(f"centre:")
        self.centre.describe()
        print(f"radius: {self.radius}")
        print(f"normal:")
        self.normal.describe()

    def intersection(self, ray):

        e = [True, False]

        for i, normal in enumerate([self.normal, self.normal.invert()]):      # got to check both sides of disc!

            denom = ray.D.dotProduct(normal)

            if denom < 0:
                p = self.centre.subtractVector(ray.origin)      # offset on plane
                t = p.dotProduct(normal) / denom           # distance

                if t > 1e-10:
                    p = ray.origin.addVector(ray.D.scaleByLength(t))    # point
                    v = p.subtractVector(self.centre)                   # disc centre to point
                    d2 = v.dotProduct(v)

                    if d2 < self.radius2:
                        return Intersection(
                            intersects = True,
                            is_entering = e[i],
                            distance = t,
                            point = p,
                            normal = normal,
                            object = self
                        )

        return Intersection()

    def isInside(self, point):
        return False


class Pipe():
    def __init__(self, start_centre, end_centre, radius, material=Material(), colour=Colour(128, 128, 128), id=0):
        self.start_centre = start_centre
        self.end_centre = end_centre
        self.radius = radius
        self.material = material
        self.colour = colour
        self.id = id
        self.axis = end_centre.subtractVector(start_centre)

    def describe(self):
        print(f'start_centre:')
        self.start_centre.describe()
        print(f'end_centre:')
        self.end_centre.describe()
        print(f'radius: {self.radius}')
        print(f'axis:')
        self.axis.describe()

    def intersection(self, ray):

        ra = self.radius
        ba = self.end_centre.subtractVector(self.start_centre)

        oc = ray.origin.subtractVector(self.start_centre)
        rd = ray.D

        baba = ba.dotProduct(ba)
        bard = ba.dotProduct(rd)
        baoc = ba.dotProduct(oc)

        k2 = baba - bard**2
        if k2 == 0:
            return Intersection()

        k1 = baba * oc.dotProduct(rd) - baoc * bard
        k0 = baba * oc.dotProduct(oc) - baoc**2 - ra**2 * baba

        h = k1**2 - k2*k0

        if h < 0:
            return Intersection()

        h = math.sqrt(h)
        t0 = (-k1-h)/k2          # distance to first intersection
        t1 = (-k1+h)/k2          # distance to second intersection

        e = [True, False]

        for i, t in enumerate([t0, t1]):
            y = baoc + t*bard
            if (y > 0) & (y < baba) & (t > 1e-10):

                # point of intersection?
                phit = ray.origin.addVector(rd.scaleByLength(t))
                nhit = phit.subtractVector(self.start_centre).crossProduct(self.axis).crossProduct(self.axis).normalise().invert()

                return Intersection(
                    intersects=True,
                    is_entering=e[i],
                    distance=t,
                    point=phit,
                    normal=nhit,
                    object=self
                )

        return Intersection()

    def isInside(self, point):
        return False

Composite shapes

Joyfully, we can build composite shapes using polymorphism. A Cylinder comprises two discs and a pipe, and uses the intersection logic of each component. Here, though, we do include a containment function at the parent level.

class Cylinder():
    def __init__(self, start_centre, end_centre, radius, material=Material(), colour=Colour(128, 128, 128), id=0):
        self.id = id
        self.material = material
        self.pipe = Pipe(id=id, start_centre=start_centre, end_centre=end_centre, radius=radius, material=material, colour=colour)
        self.start_disc = Disc(id=id, centre=start_centre, radius=radius, normal=self.pipe.axis.invert(), material=material, colour=colour)
        self.end_disc = Disc(id=id, centre=end_centre, radius=radius, normal=self.pipe.axis, material=material, colour=colour)

    def describe(self):
        print(f'PIPE:')
        self.pipe.describe()
        print()
        print(f'END 1:')
        self.start_disc.describe()
        print()
        print(f'END 2:')
        self.end_disc.describe()
        print()

    def intersection(self, ray):

        nearest_intersection = ray.nearestIntersect(
            objects=[self.pipe, self.start_disc, self.end_disc]
        )

        return nearest_intersection if nearest_intersection != None else Intersection()

    def isInside(self, point):

        # distance to axis
        d = self.pipe.start_centre.subtractVector(point).crossProduct(self.pipe.axis).magnitude() / self.pipe.axis.magnitude()
        if d > self.pipe.radius:
            return False

        # distance to start end
        if self.pipe.start_centre.subtractVector(point).dotProduct(self.start_disc.normal) < 0:
            return False

        # distance to end end
        if self.pipe.end_centre.subtractVector(point).dotProduct(self.end_disc.normal) < 0:
            return False

        return self.pipe.radius

Streamlining recursive refraction

Accurate refraction relies on correctly resolving recursive interactions. Once working, straws appear to bend in fluid, and bubbles exhibit realistic total internal reflection.

Moving images

Rendering the images above at a reasonably high resolution takes about 20–40mins. A goal of this project is to produce a 3D graphics engine fast enough to support development of a 3D physics simulation — which would be useless without a way of seeing the results. At current performance, it would take around 20hrs to render 1 second of medium-resolution video at 60fps. We therefore require performance increases in the two orders of magnitude range.

Optimisation Strategies

I was confident that there was plenty of headroom to optimise this Python implementation, though 100x seemed unlikely. I tested some potential avenues:

A different architecture? Testing showed that using vectorisation, or reorienting to numpy, were slower than both the original object-oriented approach, and the implementation of linear algebra from first principles. No improvement.
Cython? Rewriting sections of code in Cython did yield performance increases, but only in the region of 2–5x. Cython also has many drawbacks in this kind of development workflow.
Optimised algorithms? Some pruning strategies are obvious. We can, for example, stop checking rays in a given axis if they are no longer hitting an object. Implementation requires precomputing, overhead, and negotiating error tolerance, but is faster in the 1–2x range.

We are therefore left with the unfortunate conclusion that the only way to squeeze more performance out of the engine would be to rewrite it from scratch in another language, such as C++.

After rewriting the entire engine from scratch in C++, (identifying a few minor language-agnostic optimisations on the way), we can compare render times for the exact same image:

With default compile settings (-O1), a render that takes 43s in Python now takes 575ms. This is a 75x increase in performance.
But things don’t stop there. With compiler optimisation turned on (I found -O2 to be optimal), we see a further 3.5–4x increase.
This amounts to a 300x increase in speed for the new C++ version — far greater than I had hoped.

This makes MOVING IMAGES a possibility.

Moving just the camera around the scene

Animating camera and objects in the scene

Testing a sine function for easy fluid effect