Optimizing Graphing Performance on the Web
For me to stay engaged as an engineer, I need to try new things. Programming is fun when there’s a puzzle to solve or a new way to express an idea. When I’m trying to learn a new tool, I’ll often go searching for a problem that I can use it on. Needless to say, when I try to find a project for an unfamiliar tool (rather than vice-versa), I often end up building something for which the only positive description is “creative”. Other times though, I have a project that I’ve wanted to do for a while, and a technology that I’ve always wanted to learn, and they serendipitously work well together. Hopefully this is one of those stories.
When I decided to build a Mandelbrot set viewer, it was not because I felt the world needed one more implementation on top of the many that already exist. Instead, it was because I had a Rust/WebAssembly/Web Worker-shaped peg that I wanted to work with, and a feeling that the Mandelbrot set was the right shaped hole to hammer it into. Luckily for me, the Mandelbrot set was actually a great fit, and through my project I learned a lot about these new technologies and web performance in general.
What is the Mandelbrot Set?
Disclaimer: I am not a mathematician. For a more in-depth overview of the Mandelbrot set, check out https://www.youtube.com/watch?v=FFftmWSzgmk.
The Mandelbrot set is a mathematically-defined construct that is based on a very simple formula.
It can be plotted to produce an infinitely complex shape that looks like this:
If you were to progressively zoom in on the border of the black region, you would find an infinitely deep world of interesting topological features. With the idea of exploration in mind, I wanted to build a web-based tool, inspired by Google Maps, that lets the user move around the plot, zoom in or out, and play with some of the algorithm parameters to see how they affect the rendering. Like Google Maps, I wanted the user to be able to click and drag to pan, or scroll to zoom. I’d suggest using the viewer above to try it out.
How Does the Mandelbrot Set Work?
For all the complexity in the Mandelbrot plot, it is defined by this simple function:
and the following two details:
- The function is applied iteratively for every pixel in the graph and the pixel color is determined by the result.
- The function is evaluated using complex numbers rather than plain real numbers. Each pixel in the image corresponds to a point in the complex plane.
The function and these two details are the main source of visual complexity in the plot above. If you think that some additional “information” must be needed to calculate a shape with so much visual detail, you would not be alone. The fact that such simple rules can generate such interesting structures is one of the early inspirations for the study of Chaos Theory.
The following section is an overview of complex numbers. If you are already familiar with the topic, feel free to skip ahead to the next section.
- Real numbers are the normal numbers of everyday life. They include everything from negative infinity to positive infinity and all the fractions / decimals / irrational numbers in between.
- Imaginary numbers are a bit different. They allow you to answer the question “What is the square root of a negative number?” If it is unclear why the answer to that question is not a real number, try to think of a number that, when multiplied by itself, is -4. The basic “unit” of imaginary numbers is i, the square root of -1. Other imaginary numbers are expressed as multiples of i.
A is the sum of a real number and an imaginary number.
In a graphical sense, a complex number can be visualized as a point in space. Whereas the real numbers have a “number line,” the complex numbers have a number plane. A complex number can be represented as the point (a,b) in the complex plane. The horizontal axis represents the size of the real part and the vertical axis represents the size of the imaginary part.
Complex numbers can be added, subtracted, and multiplied, but it takes a bit more thought than plain real numbers. To add complex numbers, separately add the real parts and the imaginary parts. To multiply complex numbers, use the FOIL Method to calculate the new real and complex parts.
Finally, complex numbers can have an absolute value. Graphically, this is the length of the line on the complex plane and can be calculated using the standard formula for the length of a 2D vector:
Calculating the Mandelbrot Set
To plot the Mandelbrot set, you iterate through each pixel in your plotting space and calculate its color. The plotting space represents a rectangle in the complex plane and every pixel in the plot represents an individual complex number. To decide the color of a pixel, you can use our function from above, plug in 0 for z, and plug in the coordinates of your pixel for c. Evaluate the result as a complex value. Do this calculation again, replacing z with the previous result and leaving c as your original pixel’s coordinates. The color of a pixel is determined by how fast the absolute value of z grows as you repeat this process infinitely. For instance, if c = 1, you will calculate the following partial sequence:
In this case, z grows quickly and would get a color that represents fast growth. Doing the same calculation with c = -1 gives a sequence where z grows slowly and would be given a contrasting color.
For our viewer, rather than iterate to infinity, we might iterate through the sequence at most 500 times and if the absolute value ever gets larger than 2, we can guess that the input grows quickly and is colored blue. Slow growth (z never gets larger than 2) will be associated with black.
There are a few important takeaways from what this algorithm will look like when converted to code:
- Complex number multiplication takes 4 multiplications and 3 additions. Pixels that correspond to a diverging number will take ~500 iterations of our function.
- A 640x640 plot has ~409k pixels. For some plotting regions, most coordinates will not diverge and will require all 500 iterations to confirm that. Pixels that diverge (and diverge quickly) are much cheaper to calculate.
- Therefore, some renderings of the Mandelbrot set may require hundreds of millions or even billions of arithmetic calculations.
- Most modern processors run a maximum of ~3 billion instructions per second in a single execution thread / core. Therefore, plotting the Mandelbrot set may be a computational challenge.
A Simple (Naive?) Mandelbrot Set Implementation
To get started on this project, I decided to plot a still image of the Mandelbrot set synchronously and without thinking about performance. Like a good programmer, I modeled complex numbers as an object with one field for the real part and one for the imaginary. For complex number arithmetic, I wrote pure functions that take in complex numbers and return new complex numbers. I used the to plot the Mandelbrot set since I wanted the ability to draw pixels directly. I ended up not pulling in any dependencies for complex arithmetic or graphing because my use case was simple to implement and I wanted full control to tune for performance.
Here’s an abridged sample of my initial Mandelbrot set implementation. If you want to see the full plotting logic, check out my page above:
This leaves out some of the plotting logistics but captures the main idea of calculating the Mandelbrot set. The final drawPixel function call takes the iteration count and uses it to decide the final pixel color. Also note that the iterateMandelbrot function will short circuit when the absolute value of the complex number gets too large.
So how well does this initial attempt at plotting the Mandelbrot set work? The code above is functionally correct but takes a decent amount of time to compute. If I use this approach to plot the Mandelbrot set on my 1080x1920 monitor and a MacBook Pro, it takes around 7.5 seconds to render after page load has completed. Timings will vary depending on your specific hardware but it at least shows me that I’ll need to pay attention to performance. Remember that in this simple implementation, all of this calculation is done synchronously. A 7 second synchronous wait is always going to feel slow.
Improving the Naive Approach
So how do we improve the slow naive approach? A good place to start troubleshooting web performance problems is in the developer tools. Here is a screenshot of the flame chart for what the browser is doing while calculating the Mandelbrot set.
If you want to follow along with the performance troubleshooting, try using the live Mandelbrot viewer and select “Naive JS” and then uncheck “Calculate on Worker Threads”. Finally, open the developer tools, start recording a session in the Performance tab, and hit redraw. This should give a good performance metric since it is purely dependent on already-downloaded client-side code.
As a sanity check, it is a good sign that the main bulk of the rendering time is spent in functions that I’ve written (vs browser rendering). The next thing we’ll want to figure out is where the browser is spending the majority of its time. For this, I can look at the “Bottom-Up” view in the same performance tab:
This “Bottom-Up” view is pretty unusual. The browser is spending more than 3 seconds doing garbage collection (Minor GC in the screenshot) for a 7 second calculation. Garbage collection in modern browsers is pretty efficient and doesn’t normally take this long. Thinking a bit longer on the naive implementation above though, this result makes a lot of sense. Calculating the Mandelbrot set involves a huge number of complex number operations. By trying to adhere to functional programming practices and making each complex number a new object, I hit a huge performance trap. Every complex math operation created a new object that eventually needed to be garbage collected. In normal applications, object allocation is cheap and fast but since I was creating (up to) hundreds of millions of objects, it could add up.
So how could I reduce object allocation and therefore garbage collection? One option would be to drop the idea of immutability in my math operations. For instance, my math operations could modify one of the objects that I pass in. That felt sloppy though. Which operand would I modify? Would that choice be safe as I refactor my code? Besides, it just felt wrong to have a math operation called “add” and have it modify one of the numbers. Another operation might have been to use a somewhat exotic JS memory management technique like Object Pools. That felt like overkill for a fairly simple project and would’ve required additional dependencies and/or restructuring my code.
The optimization approach I decided to go with was not the cleanest but I felt like it was the best compromise in simplicity and readability. I opted to break up the complex number object back into primitive parts and pass them around as two separate parameters. If I wanted to reduce object allocation, I could just avoid objects altogether!
Here’s an example of what my code looked like with no objects:
All of the yellow frills at the bottom of the flame chart are gone and the Bottom-Up view shows no time spent in Garbage Collection at all!
More Exotic Solutions: WebAssembly
WebAssembly is a lot like the JVM’s bytecode. It is not something that you would generally write by hand but it can be a compilation target from many different languages. WebAssembly, like bytecode or assembly language, is much closer to the actual machine code that executes on your computer. It has access to a linear block of memory, uses no garbage collection, and has few higher-level programming language features.
Disclaimer: I am pretty new to Rust and this is intentionally a 1–1 port. Please be gentle.
Over the course of this project, I have had the chance to break a few of the normal standard rules of clean coding. Rather than group related data into an object, I represented it as primitives and passed them in separate variables. Rather than writing single-responsibility helper functions to do calculations, I did my math inline. I also brought a new programming language and compilation target into my toolbox to try to squeeze as much single-threaded performance from the browser that I could. For most “normal” web applications, this amount of jumping through hoops for performance is not really necessary. Premature optimization is the root of many problems and apps are normally slowed down way more by I/O than computation. The web is constantly taking on more responsibilities though, and it is becoming possible to run more computationally intensive applications in the browser. 3D games, client-side image processing and other tasks that normally are reserved for native applications seem to be gaining a foothold in the browser. This is where tuning for computational performance may become important again.
DISCLOSURE STATEMENT: © 2021 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.
Originally published at https://www.capitalone.com.