Performance through Elegant JavaScript

Making code faster while also making it easier to read

Jan Pöschko
Wolfram Developers
10 min read · Dec 11, 2018


In the implementation of the Wolfram Cloud notebook interface, performance is super-critical. We’re essentially writing our own layout engine in JavaScript, to support the rich typesetting constructs that notebooks offer.

Typesetting, graphics, and interactive elements all come together in a notebook, which needs a custom layout implementation. There’s no time to be wasted when rendering stuff — especially when you have no idea what that stuff is going to be, because it’s entirely defined by the user.

There are optimizations on multiple levels, some of the most important being things like

  • avoiding unnecessary work altogether (which includes only loading stuff when necessary),
  • algorithmic improvements, and
  • using the browser’s DOM efficiently (e.g. batching writes and reads).

When you run into performance problems, there’s no way around looking at the timeline and profiling tools (I really like the Chrome DevTools) to identify parts of your application that need optimization.

However, in this post I’d like to focus on a lower level and show how to write performant code from the beginning, without sacrificing “elegance”. Quite the contrary, actually: Sometimes we can make code faster by making it more elegant at the same time. Here are some of the lessons I learnt.

Be aware of the hidden class behind every object

A naïvely implemented JS engine would probably represent each JS object like obj = {x: 1, y: 2} as a hash map with property names as keys. But that would be slow. So what modern engines do is keep track of the shape of an object (i.e. what properties it has), and for each possible shape they introduce a hidden class (think of an actual class as in C++, Python, or perhaps a simple struct in C). The point is that for such a class they know exactly where (in memory) each property value is going to be, without going through some complex hashing algorithm etc. So the next time they see a property access like obj.x, it’s a simple lookup at a certain memory offset. That’s a huge saving, when you think about how often this happens in typical JS code.

An important detail is that hidden classes are constructed iteratively as an object gets more properties. E.g. in
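```js
const obj = {};  // hidden class C0 (no properties yet)
obj.x = 1;       // transitions to C1 (property x)
obj.y = 2;       // transitions to C2 (properties x and y)
```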

the hidden class of obj would be an empty class C0 first, then it would be a class C1 with a single property x, and finally it would be a class C2 with properties x and y. When we create another object later and assign properties in the same order (first empty, then x, then y), it will share the same hidden class(es) and end up with C2. However, if we constructed
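```js
const obj2 = {};  // starts at C0 again
obj2.y = 2;       // but transitions to a class with only y...
obj2.x = 1;       // ...and then to one with y and x — not C2
```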

it would end up with an entirely different hidden class. You might think the “shape” is the same — but order matters.

Just as adding properties changes the hidden class, so does removing them. To avoid that, it’s usually best not to remove properties from an object at all.

Why do we want to avoid extra hidden classes? Well, less is more, right? Less stuff to keep track of for the engine means faster execution. But the real answer is in the next section about functions.

Also note that the values of properties don’t really matter here. The hidden class is only about what properties an object has.

Another detail: Constructor functions like
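```js
// (an illustrative constructor; the name and properties are made up)
function Point(x, y) {
  this.x = x;
  this.y = y;
}

const p = new Point(1, 2);
```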

get their own hidden class, i.e. even though the overall shape might be the same, that p will have a different hidden class from the obj from before (which was constructed using an object literal, not a constructor function).

So here are the key take-aways:

  • Initialize all properties in a constructor function (e.g. set them to null if there's no other sensible default value), to avoid shape changes later.
  • Make sure properties are always assigned in the same order (no matter whether it’s in a constructor function acting on this, or whether it's changing some other variable like a previously created object literal).
  • Avoid deleting properties from an object (with the delete operator). Rather assign a value of null or undefined to them.
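Putting these together, a constructor that follows all three rules might look like this (a minimal sketch with made-up names):

```js
function TreeNode(value) {
  // Initialize every property up front, always in the same order.
  this.value = value;
  this.parent = null;   // no sensible default yet, so null
  this.children = [];
}

const node = new TreeNode(42);
// Later, to "remove" the parent, reset it instead of deleting it:
node.parent = null;     // rather than: delete node.parent
```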

And here’s the kicker: This is not only good for the JS engine, it’s also good for you and other readers of your code! Getting an immediate understanding of what properties to expect in an object by reading its constructor function is a good thing. Consistent order is good. Not having properties suddenly removed underneath you is good. It makes your code easier to reason about.

AFAIK, these principles apply to all modern JS engines. The term “hidden class” comes from V8 (Chrome’s JS engine), whereas SpiderMonkey (Firefox) calls them “shapes”.

Keep your functions monomorphic

JS engines don’t stop with their optimizations at property access. Next up: functions, the fundamental building (scoping) blocks of all JS programs.

What’s there to optimize? To execute a function, the engine just has to “run through it”, right? Yeah, that’s what an interpreter would do. But JS engines are smarter than that. They’re really compilers; just-in-time (JIT) compilers, to be precise. They use type information collected at runtime to generate optimized machine code. E.g. if they “know” two variables are numbers, they can generate specialized code to make adding these variables really fast; if they don’t know anything about the two, they have to go through some generic code instead (e.g. they have to check whether strings are involved, in which case they would concatenate strings instead of adding numbers).

Type information? What does that really mean? Remember the hidden classes from before? That’s pretty much exactly what this means! Think of hidden classes as type information attached to each object. Plus, values like numbers and strings also have a flag that indicates what type they are.

But now back to functions. Each time (more or less) a function is called, the JS engine checks:

  1. What are the types of the input parameters?
  2. Have I already compiled a specialized version for these types? If yes, execute that specialized version.
  3. Otherwise, generate a specialized version, compiling it to machine code that makes corresponding assumptions about the parameters — unless I have already done this too many times (let’s say four) for this function, in which case I’m just going to give up and keep using a generic version for this function.

Some examples for operations that can be turned into specialized machine code:

  • Adding variables can take a more direct code path under the assumption that they are numbers.
  • Accessing a property can take a more direct code path, assuming a certain hidden class of the object.
  • Calling other functions might jump into one of their specialized versions directly if there are already assumptions about the parameter values being passed into the function. Heck, it might even avoid a “jump” altogether and inline the function right at the caller’s site (if the called function is simple enough to allow for that).

What does this mean for our code? It’s quite simple:

  • Prefer functions that are always called with the same parameter types. The function is then called monomorphic, i.e. there’s effectively only one (mono) form (morph) of it.

What you want to avoid is a function like
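```js
function add(a, b) {
  return a + b;
}
```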

being called in different ways like
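```js
add(1, 2);      // numbers → one specialized version
add('a', 'b');  // strings → another specialized version
```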

which would result in two separate specialized versions of add. Extrapolate that, and the engine would soon give up and not use an optimized version of the function at all anymore — it would get de-optimized.

But hey, you want to avoid code like that anyway, right? Reading a function where it’s unclear what the parameters really are and how they’re going to be used (e.g. whether + means addition or concatenation) not only annoys the JS engine, it can also be confusing to the reader of your code.

We don’t want to exploit the full dynamic power of JavaScript where everything can mean anything, because that not only potentially slows down the code, but it also makes it harder to read.

Incidentally, a very helpful “reminder” of this is to actually add explicit type annotations to your code, such as Flow or TypeScript. There are different opinions on this, but I personally like the explicitness of code like
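```ts
function add(a: number, b: number): number {
  return a + b;
}
```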

and my point in this context is: It also makes you fall into the pit of success as it naturally forces the function to be monomorphic and hence easier to optimize.

Choose the right data structures

This is probably neither a surprise nor specific to JavaScript: Choosing the right data structure matters.

In old-school JavaScript, we essentially had two “container” types: Object and Array. (OK, arrays are really objects as well, just that they maintain a special length property.)

In ES2015 and beyond, we have a lot of new cool containers to choose from: Map, Set, WeakMap, and WeakSet. There are also typed arrays but they’re not generic containers so I won’t talk about them here (also, while they can be super-fast, I wouldn’t necessarily call them “elegant”).

In the early days when some of these containers were just finding their way into browser implementations, I still shied away from using Map and friends because I thought they weren't as performant as plain objects yet. (Maybe because of some evil micro-benchmark?) Anyway, those days are over.

So the choice of container comes quite naturally now. For key-value mapping-like data:

  • If you can control and enumerate all the keys of the key-value mapping and they’re typically the same for all objects of a kind, use a plain Object (i.e. an {} object literal, a constructor function, or an actual class with properties), pretty much like you would use a struct in C or a class in C++.
  • If the keys are not under your control (e.g. they’re provided by the user) or they’re not even simple strings, use a Map.

From the perspective of other languages: When you would use a struct or a class, use an object in JS; when you would use a hash map, use a Map in JS.
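A small sketch of both cases:

```js
// Fixed, known keys, the same for all objects of a kind: a plain object.
const point = {x: 1, y: 2};

// Keys outside your control (here standing in for user-provided data): a Map.
const counts = new Map();
for (const word of ['cat', 'dog', 'cat']) {
  counts.set(word, (counts.get(word) || 0) + 1);
}
counts.get('cat');  // 2
```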

For list-like data:

  • If you want fast random access by an index, use an Array.
  • If you want fast inclusion tests (or maybe I should just say: when you deal with a set), use a Set.
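For example, an inclusion test on a Set avoids scanning a whole array:

```js
const seenIds = new Set([3, 5, 8]);
seenIds.has(5);  // true — effectively O(1), no linear scan
seenIds.add(13);
```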

Some more notes on arrays:

  • JS arrays can be sparse, e.g. you can do something like:
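```js
const arr = [];
arr[0] = 1;
arr[9999] = 2;  // everything in between stays a "hole"
```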

Rather don’t do that. JS engines are somewhat smart and will usually switch from “compact mode” to “dictionary mode” when they encounter an array like this (so they don’t actually have to reserve an absurd amount of memory); but that switch comes at a cost, too. If your data is sparse like that, a more natural choice would probably be a Map to begin with.

  • Some people like to iterate arrays backwards as in
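```js
// (arr and doSomething are placeholders)
for (let i = arr.length; i--; ) {
  doSomething(arr[i]);
}
```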

instead of
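```js
for (let i = 0, n = arr.length; i < n; i++) {
  doSomething(arr[i]);
}
```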

because it saves an assignment and a separate increment statement. Rather don’t do that. It’s not only a little odd to read, but CPUs and their caches are usually optimized for accessing memory in forward direction, so there could even be a penalty for iterating backward. (Even though there’s nothing in the JS language spec that guarantees how things are laid out in memory, I’m pretty sure all engines allocate arrays the natural way.)

  • Inserting and removing elements from the middle of an array is a costly operation, because usually a lot of other data has to be moved around. In that case (and if random access by index doesn’t matter), a linked list might be a better choice. There’s no built-in type for linked lists, but it’s easy enough to roll your own. Except: Most of the time, linked lists are a premature optimization. So maybe forget about this point altogether.

This blog post has some more (lower-level) performance tips regarding arrays.

When to use the “weak” variants WeakMap and WeakSet? Probably not all too often, but essentially whenever you want to associate data with objects without mutating the objects themselves, either because you can't (might be some immutable object passed to you) or because it would be some property that doesn't really belong on the object in general (see the first section about hidden classes on why it's a bad idea to add a property some time after the object has already been constructed). This answer on StackOverflow describes it pretty well.
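For instance, a WeakMap can cache data derived from objects without touching them (a minimal sketch; computeLayout stands in for some hypothetical expensive computation):

```js
const layoutCache = new WeakMap();

function getLayout(box) {
  let layout = layoutCache.get(box);
  if (layout === undefined) {
    layout = computeLayout(box);  // hypothetical expensive computation
    layoutCache.set(box, layout); // associates data without mutating box
  }
  return layout;
}
// Once a box becomes unreachable, its cache entry can be garbage-collected too.
```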

Conclusion

I guess everyone would agree that

  • declaring and using object properties consistently,
  • having predictable function parameters, and
  • using the natural data structure for a job

makes code easier to read and understand. What’s great is that not only humans like code written this way, but JavaScript engines do as well, making your code more performant and more elegant at the same time.

To some degree, using a static type system such as Flow or TypeScript can encourage these good practices, e.g. by making it more natural to write monomorphic functions (or, to put it differently: by making it more cumbersome to write polymorphic ones).

Further reading

The following resources were super useful for learning about the things covered here. I recommend them if you want to dive deeper into JS performance.

PS: This text was originally written in 2016, but it should still be accurate (and hopefully relevant), hence I decided to finally publish it, after some minor editing.
