Coding a 3D ray-tracing graphics engine in Python (Part 1)

William Seymour
11 min read · Feb 1, 2023


What is ray-tracing?

Ray-tracing is a method for turning information about 3D objects into a 2D image. It’s at the heart of bleeding-edge, photorealistic 3D graphics, and is a key development area for graphics card manufacturers. But it is also, in principle, very simple — meaning it’s not too difficult to get a working prototype up and running.

How does it work?

Ray-tracing simulates the path of imaginary rays of light. In everyday life, light follows very predictable rules: it is emitted from sources, bounces off things, and then either gets absorbed or enters your eye. It would be computationally prohibitive to simulate the path of every possible beam of light in a 3D space — so in ray-tracing we take the shortcut of simulating only the beams that enter the camera.

To do this, we invert each ray’s direction and follow it until it finds a light source.

Ray-tracing

Why ray-tracing?

Compared to other techniques, it is very simple to achieve photorealistic effects. This is because features such as highlights, shadows, shading, reflections, refractions, perspective, and distortion are all natural consequences of following rays — just like in real life! Even figuring out which objects should be rendered in front of others can be a headache with other methods.

The main downside is that ray-tracing is computationally expensive. Without optimisation, a 2000x2000 pixel image requires solving for 4 million rays. But since we’ve already chosen to use Python for this project, we know that performance is not a relevant concern.

I aim to build a ray tracer that can handle a wide range of objects:

  • Spheres

That’s it. Spheres are mathematically by far the simplest object when it comes to interactions with lines in 3D space, and we don’t need any other kinds of objects to prove the concept.

As ever, we start simple and escalate…

Step 0: Object Model

We suspect things will get pretty complicated pretty quickly, so we make good use of object-oriented programming to encapsulate useful functions with the relevant objects.

You can certainly manipulate vectors using numpy alone, but since 3D vector operations are an order of magnitude more head-scratching than their 2D counterparts, I found it highly beneficial to get the basics of linear algebra encapsulated in a Vector class from the outset.

import math

import numpy as np
from numpy import arccos, cos, sin

class Vector:

    @staticmethod
    def fromNpArray(array):
        return Vector(x=array[0], y=array[1], z=array[2])

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def describe(self, caption=""):
        print(f"{caption}x: {self.x}, y: {self.y}, z: {self.z}")

    def getXYZ(self):
        return self.x, self.y, self.z

    def toNpArray(self):
        return np.array([self.x, self.y, self.z])

    def addVector(self, B, inplace=False):
        if inplace:
            self.x += B.x
            self.y += B.y
            self.z += B.z
            return self
        return Vector(self.x + B.x, self.y + B.y, self.z + B.z)

    def subtractVector(self, B, inplace=False):
        if inplace:
            self.x -= B.x
            self.y -= B.y
            self.z -= B.z
            return self
        return Vector(self.x - B.x, self.y - B.y, self.z - B.z)

    def invert(self, inplace=False):
        if inplace:
            self.x = -self.x
            self.y = -self.y
            self.z = -self.z
            return self
        return Vector(-self.x, -self.y, -self.z)

    def scaleByLength(self, l, inplace=False):
        if inplace:
            self.x *= l
            self.y *= l
            self.z *= l
            return self
        return Vector(self.x * l, self.y * l, self.z * l)

    def distanceFrom(self, B):
        return math.sqrt((B.x - self.x)**2 + (B.y - self.y)**2 + (B.z - self.z)**2)

    def angleBetween(self, B):
        return arccos(self.dotProduct(B) / (self.magnitude() * B.magnitude()))

    def reflectInVector(self, B):
        # reflect this vector about the surface whose normal is B
        v = self.normalise()
        normal = B.normalise()
        return v.subtractVector(normal.scaleByLength(2 * v.dotProduct(normal))).normalise()

    def dotProduct(self, B):
        return self.x * B.x + self.y * B.y + self.z * B.z

    def crossProduct(self, B):
        # denoted by A x B
        return Vector(
            x=self.y*B.z - self.z*B.y,
            y=self.z*B.x - self.x*B.z,
            z=self.x*B.y - self.y*B.x
        )

    def magnitude(self):
        # ||v|| denotes the length of a vector
        return math.sqrt(self.dotProduct(self))

    def normalise(self):
        magnitude = self.magnitude()
        return Vector(x=self.x/magnitude, y=self.y/magnitude, z=self.z/magnitude)

    def multiplyByMatrix(self, T):
        return self.fromNpArray(np.matmul(self.toNpArray(), T))

    def rotate(self, angle, inplace=False):
        # angle is a Vector of Euler angles, in radians
        a, b, c = angle.x, angle.y, angle.z
        R = np.array([
            [cos(c)*cos(b)*cos(a) - sin(c)*sin(a), cos(c)*cos(b)*sin(a) + sin(c)*cos(a), -cos(c)*sin(b)],
            [-sin(c)*cos(b)*cos(a) - cos(c)*sin(a), -sin(c)*cos(b)*sin(a) + cos(c)*cos(a), sin(c)*sin(b)],
            [sin(b)*cos(a), sin(b)*sin(a), cos(b)]
        ])
        V = np.matmul(np.array([self.x, self.y, self.z]), R)
        if inplace:
            self.x, self.y, self.z = V[0], V[1], V[2]
            return self
        return Vector(x=V[0], y=V[1], z=V[2])

Not only does this keep things neat and unambiguous; it also means we can chain functions together in a clear, condensed way:

vector = Vector(1, 1, 1)
new_val = vector.scaleByLength(2).invert().normalise().dotProduct(vector)  # dotProduct returns a scalar, so it comes last

Step 1: Detecting when a ray hits something

We define Ray as a class and give it a function for calculating the discriminant (i.e. whether and where the ray intersects) for any given sphere: a ray O + tD intersects a sphere with centre C and radius r wherever |O + tD - C|² = r², and the code below uses the geometric form of this solution. In an additional function we can then loop through all the spheres, get the intersection details of each, and return the intersection the ray hits first. Intersection is itself a class, which allows us to package a load of useful information together, such as what has been hit, where it was hit, and the normal at the intersection.
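
The Intersection class itself isn't listed in this post, but from the way it's used below, a minimal version might look something like this (a sketch; the field names are inferred from the calls that follow):

class Intersection:
    def __init__(self, intersects=False, distance=None, point=None,
                 normal=None, object=None):
        self.intersects = intersects     # False is the default "no hit" case
        self.distance = distance         # distance from the ray origin to the hit
        self.point = point               # where the ray struck the surface
        self.normal = normal             # surface normal at that point
        self.object = object             # the sphere that was hit
        self.bounces = 0
        self.through_count = 0

    @staticmethod
    def nearestIntersection(intersections):
        # keep only real hits, then return the closest one (or None)
        hits = [i for i in intersections if i.intersects]
        if not hits:
            return None
        return min(hits, key=lambda i: i.distance)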

The normal of a surface is perpendicular to it and points outwards

# Ray class

class Ray:
    def __init__(self, origin, D):
        self.origin = origin
        self.D = D.normalise()  # direction as a unit vector

    def sphereDiscriminant(self, sphere, point=0):  # set point to 1 when you want the second intersection
        O = self.origin
        D = self.D
        C = sphere.centre
        r = sphere.radius
        L = C.subtractVector(O)

        tca = L.dotProduct(D)
        if tca < 0:  # intersection is behind origin - this doesn't work when the line is inside the sphere
            return Intersection()

        try:
            d = math.sqrt(L.dotProduct(L) - tca**2)
        except ValueError:
            d = 0  # crude error protection - in case D & L are too similar
        if d > r:  # line misses sphere
            return Intersection()

        thc = math.sqrt(r**2 - d**2)
        t0 = tca - thc  # distance to first intersection
        t1 = tca + thc  # distance to second intersection

        tmin = [t0, t1][point]

        phit = O.addVector(D.scaleByLength(tmin))   # point of intersection
        nhit = phit.subtractVector(C).normalise()   # normal at the intersection

        return Intersection(
            intersects=True,
            distance=tmin,
            point=phit,
            normal=nhit,
            object=sphere
        )

    def nearestSphereIntersect(self, spheres, suppress_ids=[], bounces=0, max_bounces=1, through_count=0):
        intersections = []

        for sphere in spheres:
            if sphere.id not in suppress_ids:
                intersections.append(self.sphereDiscriminant(sphere))

        nearestIntersection = Intersection.nearestIntersection(intersections)

        if nearestIntersection is None:
            return None

        if bounces > max_bounces:
            return None

        nearestIntersection.bounces = bounces
        nearestIntersection.through_count = through_count

        return nearestIntersection

We can now run this for each ray for a given set of spheres, and find out what it hit first. If the ray hit something, we’ll use the colour of the thing it hit. If it didn’t hit anything, we’ll use the background colour.
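
As a concrete sketch of that loop (my own minimal version, not the article's exact render code: the 400x400 resolution and the Sphere container with centre, radius, colour and id fields are assumptions, and Colour is sketched in Step 3):

import numpy as np
import matplotlib.pyplot as plt

WIDTH, HEIGHT = 400, 400
background = Colour(0, 0, 0)
image = np.zeros((HEIGHT, WIDTH, 3))

# one red sphere directly ahead of the camera, which sits at the origin looking down +z
spheres = [Sphere(centre=Vector(0, 0, 5), radius=1, colour=Colour(1, 0, 0), id=0)]

for py in range(HEIGHT):
    for px in range(WIDTH):
        # linear x/y offsets across a flat projection plane at z=1
        direction = Vector(px / WIDTH - 0.5, py / HEIGHT - 0.5, 1)
        ray = Ray(origin=Vector(0, 0, 0), D=direction)
        hit = ray.nearestSphereIntersect(spheres)
        colour = hit.object.colour if hit is not None else background
        image[py, px] = (colour.r, colour.g, colour.b)

plt.imshow(image)
plt.show()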

Already we have a basic ray-tracer that can tell which objects are in front of other objects!

Spheres in 3D!

Step 2: Global light source

To make our image a bit more realistic, we’ll want to return more than just the colour at the ray terminus. What we really want is the light value. To calculate this, we’ll need to define a light source and determine how much of it has reached the point where our ray strikes.

Using the normal of the surface — or the angle of the local tangential plane — we can calculate how much light it will receive from a global light source using a simple ratio. The more perpendicular our surface is to the light, the more light it gets.

For each global light source, we can choose the angle beyond which its light is no longer reflected. To approximate bounce lighting, we can set the angle to 180 degrees. At 90 degrees, we get more of a planetary lighting look, as no light reaches the side facing away from the light source.

class GlobalLight:
    def __init__(self, vector, colour, strength, max_angle, func=0):
        self.vector = vector          # the angle the light is coming from
        self.colour = colour
        self.strength = strength      # 0-1
        self.max_angle = max_angle    # the greatest angle light is reflected from - e.g. 90 degrees
        self.func = func              # 0: linear

    def relativeStrength(self, angle):
        if self.func == 0:
            return self.colour.scaleRGB(incidence(angle, self.max_angle) * self.strength)
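
The incidence() helper called in relativeStrength() isn't listed in the post. A plausible linear version (an assumption on my part) returns 1 when the surface faces the light head-on and falls to 0 at max_angle; note that angleBetween() returns radians, so max_angle should be in radians too:

def incidence(angle, max_angle):
    # linear falloff: full strength at angle 0, nothing at or beyond max_angle
    if angle >= max_angle:
        return 0.0
    return 1.0 - angle / max_angle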

Global lighting: max angle = 180 degrees

Global lighting: max angle = 90 degrees

Step 3: Point light sources

To simulate point light sources, we’ll need to add them as objects to the scene and then do some additional ray-tracing from the original ray terminus to the point light to determine whether it has a clear line of sight. If it does not, it is in shadow.

If we want more than one light source, we’ll also need a way of aggregating illumination and applying it to the colour of the object.

"""
identifies and adds light value on a surface of a sphere based on:
- angle to global light sources and their light function
- angle to point light sources with line of sight (ie not in shadow) and their light function
- colour of self
- emission of self
- background colour
"""
def terminalRGB(self, spheres, background_colour=Colour(0, 0, 0), global_light_sources=[], point_light_sources=[], max_bounces=0):

reflectivity, transparency, emitivity = self.object.material.reflective, self.object.material.transparent, self.object.material.emitive

illumination = self.object.colour.scaleRGB(emitivity) # does not take distance from camera into account

for light in global_light_sources:
angle_to_light = self.normal.angleBetween(light.vector)
illumination = illumination.addColour(light.relativeStrength(angle_to_light))

for light in point_light_sources:
if self.object.id != light.id:
vector_to_light = light.position.subtractVector(self.point)
ray_to_light = Ray(
origin=self.point,
D=vector_to_light
)
ray_to_light_terminus = ray_to_light.nearestSphereIntersect(spheres, suppress_ids=[self.object.id], max_bounces=max_bounces)

if ray_to_light_terminus != None:
# clear line of sight
if ray_to_light_terminus.object.id == light.id:

angle_to_light = self.normal.angleBetween(vector_to_light)
distance_to_light = vector_to_light.magnitude()
illumination = illumination.addColour(light.relativeStrength(angle_to_light, distance_to_light))

# resolve final total of illumination
return background_colour.addColour(self.object.colour.illuminate(illumination))
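
As with Intersection, the Colour class isn't listed in the post. A minimal sketch consistent with the calls above, with scaleRGB(), addColour() and illuminate() inferred from how they're used, might be:

class Colour:
    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b

    def scaleRGB(self, factor):
        # scale all three channels by a single factor
        return Colour(self.r * factor, self.g * factor, self.b * factor)

    def addColour(self, B):
        # channel-wise addition, used to accumulate illumination
        return Colour(self.r + B.r, self.g + B.g, self.b + B.b)

    def illuminate(self, light):
        # modulate this surface colour by the accumulated light, channel by channel
        return Colour(self.r * light.r, self.g * light.g, self.b * light.b)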

In the image below, we’ve changed the global light source to blue and added a point light source behind the camera.

Our point light source logic achieves realistic shadows

Step 4: Reflections

This is where things start to get interesting. If the ray strikes a reflective object, we want to cast a new ray from the point of intersection, in the direction given by the standard reflection formula r = d - 2(d·n)n, where d is the incoming unit direction and n is the unit surface normal:

# method of the Vector class (from Step 0)
def reflectInVector(self, B):
    v = self.normalise()
    normal = B.normalise()
    return v.subtractVector(normal.scaleByLength(2 * v.dotProduct(normal))).normalise()

Inside the Ray object’s nearestSphereIntersect() function we can solve for the new reflection vector and then recursively call the same function with the new ray. As ever with recursion, it’s important to set a max recursion depth.

class Ray:

    [...]

    def nearestSphereIntersect(self, spheres, suppress_ids=[], bounces=0, max_bounces=1, through_count=0):

        [...]

        if nearestIntersection.object.material.reflective:

            reflected_ray_D = self.D.reflectInVector(nearestIntersection.normal)

            reflected_ray = Ray(
                origin=nearestIntersection.point,
                D=reflected_ray_D
            )
            bounces += 1
            suppress_ids = [nearestIntersection.object.id]
            reflected_terminus = reflected_ray.nearestSphereIntersect(
                spheres=spheres,
                suppress_ids=suppress_ids,
                bounces=bounces,
                max_bounces=max_bounces,
                through_count=through_count
            )

            if reflected_terminus is not None:
                return reflected_terminus

        return nearestIntersection

We change one of the big spheres to a reflective material, et voilà!

Reflections

Step 5: Refractions

Mathematically, refraction is not that complicated: Snell's law, n1·sin(θ1) = n2·sin(θ2), tells us how much a ray bends as it passes between two media. From a programming perspective, though, it can be troublesome: the refraction formula is sensitive to vector normalisation and to the direction of the normal, and it has to respond to whether the ray is currently inside or outside an object.

Refraction in 2D

To simplify things, we can assume that when a ray enters a transparent object, the next interaction will be on the inside of the same object. When the ray is inside an object, we remember to swap the two refractive indices, and ensure that the normal is pointing towards the centre of the object.

# method of the Vector class (from Step 0)
def refractInVector(self, B, r_index_a, r_index_b):
    v = self.normalise()
    normal = B.normalise()

    n = r_index_a / r_index_b

    cosI = v.dotProduct(normal)
    # clamp to [-1, 1] to guard against floating-point error
    if cosI < -1:
        cosI = -1
    if cosI > 1:
        cosI = 1

    if cosI < 0:
        cosI = -cosI  # take the positive cosine of the angle of incidence

    k = 1 - n**2 * (1 - cosI**2)

    if k < 0:
        return False  # total internal reflection

    return v.scaleByLength(n).addVector(normal.scaleByLength(n * cosI - math.sqrt(k))).normalise()
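
As a quick sanity check of this method (a hypothetical test, using the Vector class from Step 0): a ray hitting glass at 45 degrees should leave at about 27.7 degrees from the normal, since sin(45°) / 1.52 ≈ sin(27.7°).

import math

v = Vector(1, -1, 0)   # incoming ray, 45 degrees to the surface normal
n = Vector(0, 1, 0)    # surface normal, pointing back towards the ray
refracted = v.refractInVector(n, 1.0, 1.52)  # air (1.0) into glass (1.52)
angle = refracted.invert().angleBetween(n)
print(math.degrees(angle))  # ~27.7, as Snell's law predicts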

After much bug-fixing, we set the red planet’s material to glass (refractive index = 1.52), and marvel at the results. What is now effectively a spherical lens is behaving as expected — objects behind it are made smaller, the image is flipped, and light entering its extremities is severely bent.

Note that the glass ball should be focusing light rather than just casting a shadow — we’ll accept this inaccuracy for now.

And when we change the refractive index of the glass ball to something more like ice, we see that light is bent less.

Refractive index = 1.31

The glass marble in the previous scene is a little underwhelming. Let’s test the accuracy of everything we’ve done so far with some more interesting scenes.

A scene that includes global light sources, point light sources, a reflective sphere, and a refractive sphere

With multiple light sources, we can see that complex shadows are formed, and resolved through the spherical lens

The power of ray recursion

Because we’ve implemented reflection and refraction with recursion, our code can handle rays that get reflected and refracted multiple times. The only limit is acceptable render time. The following scene has multiple light sources, multiple reflective balls, and multiple transparent balls. All of these different interactions are naturally resolved simply by recursively re-tracing each ray through every interaction until it either a) hits something, b) hits nothing, or c) hits the max recursion depth.

Perspective and distortion

Because we’ve used a linear xy offset for each of our rays, we are effectively sending them out through a flat orthogonal grid. Since this grid is flat, it has the neat side-effect of reproducing distortion effects very familiar to the human eye thanks to photography, which uses flat sensing media.

If we spread the rays over a wider distance, we widen the field of view — distortions begin to emerge. Below is the same scene as before, but with the camera moved forward and the field of view widened. Apart from distorting objects near the edges of the frame, this has a dramatic effect on the relative size of objects.

A wide field of view results in distortion, just as with wide-angle rectilinear camera lenses

If you wanted to produce wide-angle images without the stretching at the edges, you'd only have to cast your rays with an angular offset rather than a linear one. This would be akin to projecting onto a spherical surface, just like the human eye.
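
To make the difference concrete, here is a sketch of both ways of generating primary-ray directions (the function names and the z = 1 projection plane are my own assumptions, not code from the post):

import math

def linearOffsetDirection(px, py, width, height, spread=1.0):
    # flat projection plane at z=1: pixels map to equal steps in x and y
    x = (px / width - 0.5) * spread
    y = (py / height - 0.5) * spread
    return Vector(x, y, 1).normalise()

def angularOffsetDirection(px, py, width, height, fov=math.pi / 2):
    # spherical projection: pixels map to equal steps in angle instead,
    # avoiding the stretching that a flat plane produces at wide fields of view
    yaw = (px / width - 0.5) * fov
    pitch = (py / height - 0.5) * fov
    return Vector(math.sin(yaw) * math.cos(pitch),
                  math.sin(pitch),
                  math.cos(yaw) * math.cos(pitch)).normalise()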

You can find all the code on my GitHub. I’ve kept things simple by tweaking the render inputs in Jupyter notebooks and generating images with matplotlib, so it should be nice and easy to have a play.

Enjoy!
