Illustrating Code: The Art of Visual Learning

Josh Weinstein
The Startup
Published in
5 min readJul 23, 2020
Photo by Alice Dietrich on Unsplash

When people thinking of coding, programming or engineering, they think of it has a “hard” science. One that values mathematical abilities over creativity and design. Programming certainly does involve math, but the idea it involves no visual or creative thinking couldn’t be further from the truth. In fact, a major component of software engineering is visualization, the process of constructing images from the data or output of a program. Such an ability is vital to administrators or data scientists that oversee large data systems. It’s also an extraordinary tool to help learn programming or debug software.

Visualization can be accomplished through a number of different means. Despite the widespread use of photography formats such as JPEG or PNG, creating such file formats though a language like Python is difficult. Creating a PNG, for example, requires pixel oriented arrays of RGB data, that are compressed using a compression library like zlib. Constructing an image through pixels is certainly possible, but tricky, since drawing something like a circle or a line requires calculating geometric equations for the respective shapes.

A far more efficient format for visualizing code is scalable vector graphics (SVG). It’s a xml dialect that describes the shapes and elements of a graphic through xml elements, such as <circle cx="30" cy="30" r="10"/> . SVG graphics have a wide array of element types, like rectangles, circles, lines, polygons, bezier curves, text and much more. Since an SVG is really just XML, it’s easy to formulate with any programming language. It’s no different than building other kinds of string objects.

SVG, a clean, concise visual format

Let’s take a look at a basic graphic to get an idea of how a file SVG document would look like.

<svg version="1.1"     xmlns="http://www.w3.org/2000/svg">  
<path d="M100 0 L 100 20 L 20 10 L 10 0 Z" fill="#aaa"/>
<circle cx="10" cy="10" r="2" fill="red"/>
<circle cx="30" cy="10" r="2" fill="red"/>
<circle cx="50" cy="10" r="2" fill="red"/>
</svg>

The first, and highest, enclosing element is the <svg> element. This element stores the version of the SVG specification the document adheres to, as well as the XML scheme for SVG the renderer should follow. These two fields are only needed for SVG files that are open and rendered on locally hosted systems. When SVG is embedded in an html document, these fields are not needed, as the version of the SVG specification is determined by the browser. The <svg> element can also contain fields that specify the width and length of the entire graphic. Although those are necessary and useful in situations where a graphic is made to fit into a webpage, the examples in this article are meant to just show rough visualizations for a programmers benefit. Thus, leaving the width and length unbounded is fine.

The <path/> element describes a polygon shape thats filled with the hex color #aaa . Paths form a shape by a series of commands read in the d field. In this particular example, M means “move to” in terms of moving the path cursor to some xy coordinate. L means “line to”, and Z means “close path”, or to connect the shape and fill it. The path element also has curve instructions, but those are beyond the scope of this tutorial.

The third element you can see in this example is the <circle/> element. Defined by the name, it draws a circle in the graphic with some center point at an xy coordinate and some radius. Finally, let’s have a look at what this SVG example looks like once rendered:

As expected, you can see three red dots, and the polygon drawn with the path element

Creating SVG with Python

An SVG document overall is just an XML string. It’s fairly easy to construct such strings with a specialized class in Python. Similar to how different SVG elements are rendered to different shapes and carry different visual effects, classes in Python represent types with different methods and properties. Therefore, each type of SVG element we use can be encapsulated in class. Here’s one for a <circle> element:

class SVGCircle(object):def __init__(self, x = 0, y = 0, r = 1, f = 'red'):
self.x = x
self.y = y
self.r = r
self.f = f
def __repr__(self):
return '<circle cx="{}" cy="{}" r="{}" fill="{}"/>'.format(
self.x, self.y, self.r, self.f
)

The above class holds SVG specifications for a circle, and interpolates them in string form. Next, we need a class that can encapsulate an entire SVG document, and potentially other types of SVG elements:

class SVG(object):

def __init__(self):
self.objs = []
def __repr__(self):
return "\n".join(["<svg version=\"1.1\" xmlns=\"http://www.w3.org/2000/svg\">",
"\n ".join([str(obj) for obj in self.objs]),
"</svg>"])
def circle(self, x, y, r, f):
self.objs.append(SVGCircle(x, y, r, f))

The SVG class allows dynamic construction of an SVG document string with circle elements. With each call, the position or even the size of the circle, as well as the color, can be changed. To start with, just as a basic example, let’s write some code that will draw a dotted, diagonal line. We do not need the line element from SVG, we can use this with the code for circle written so far.

a = SVG()
for i in range(0, 100, 4):
a.circle(i, i, 2, "blue")
print(a)

This code will create an SVG string that looks like the following graphic:

Such an svg file can be created by running a python script with the code listed above and storing the output in a file, such as with the command $ python svg.py > foo.svg or by writing more python code to write the string to a file. The next examples will go over more applications of SVG visualization of code.

Inspecting input and output, visually

Next, we will look at an example where visualization of data can actually give insights into the behavior of code. To do this, two hash functions will be compared, Python’s builtin hash function, and a custom written implementation of the djb2 hash function. The strings of all integers from 0 to 200 will be hashed. In the resulting SVG, the x-coordinate for each dot will be the input, and the y-coordinate will be the output. The djb2 function looks like this:

def djb2(string):
digest = 5381
for i in range(len(string)):
digest = ((digest << 5) + digest) + int(string[i])
return digest

The color for Python’s builtin hash function will be blue, while for djb2 it will be red. It’s also important to remember the coordinate system for SVG has a descending Y axis as opposed to an ascending one. Running the following code:

a = SVG()
for i in range(200):
a.circle(i, djb2(str(i)) % 200, 1, "red")
a.circle(i, abs(hash(str(i))) % 200, 1, "blue")
print(a)

Produces the following graphic:

The X-axis of this graph goes from left to right, from a lesser input to greater input. The Y-axis goes from top to down, the lower the Y position, the greater in value the output is. Python’s builtin hash function, the blue dots, generate greater entropy and spread among outputs. This means that in a real use case, Python’s hash function would create less collisions if inputs were very similar, such as "4" and "5" . The djb2 hash function, the red dots, tend to cluster in regions, and don’t offer as much variability. This will lead to more collisions if inputs are very similar.

Overall, learning to visualize the behavior of code through constructing graphics is an incredibly useful skill. Fostering visual thinking leads to more problem solving skills.

--

--

Josh Weinstein
The Startup

I’m an engineer on a mission to write the fastest software in the world.