HTML, JS, And State: A Challenging Way To Look At Web Performance

HTML byte size, streaming, GZip and caching

The picture of a rock concert. The camera is positioned on the back aiming at the crowd. Everybody has put their hands up facing the stage. Well… not quite the type of performance we’re talking about here anyway.

Some people complain that HTML rendering (a.k.a. "server-side" rendering) requires the browser to download more bytes from the server. That is sometimes posed as a big problem and used to justify returning JSON with domain-specific definitions and writing code to re-create the HTML in the browser.

Example 1: A “cart” endpoint that returns a JSON from the server representing a list of books which are then parsed by the Front-End into an Object Literal. The Object Literal contains a property called “items on cart” which holds an Array of Object Literals containing a key called "name" with a String as the value. The "name" key represents the book name. The Front-End has some JavaScript code that iterates over the “items on cart” and creates the HTML for an “unordered list”.
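The approach Example 1 describes can be sketched roughly as follows. This is an illustration only: the property name `itemsOnCart`, the book titles, and the helper functions are assumptions, not the original example's code.

```javascript
// Hypothetical JSON payload from a "/cart" endpoint (all names are assumptions).
const responseText = JSON.stringify({
  itemsOnCart: [
    { name: "Refactoring" },
    { name: "Domain-Driven Design" },
  ],
});

// Escape text before interpolating it into markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Re-create the unordered list in the Front-End from the parsed Object Literal.
function renderCart(cart) {
  const items = cart.itemsOnCart
    .map((item) => `<li>${escapeHtml(item.name)}</li>`)
    .join("");
  return `<ul>${items}</ul>`;
}

const cartHtml = renderCart(JSON.parse(responseText));
console.log(cartHtml);
```

Note that the escaping and the list markup are work the server would otherwise have done once, before sending the HTML.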

HTML uses repeated tags and attributes instead of square brackets for lists. Of course, that generates more raw bytes for the browser to download when the result is a huge list:

Example 2: The HTML representing the “unordered list” of books containing 21 items.

However, in most web pages the performance bottleneck is not the number of bytes transferred through the network. It's the user-perceived latency before the page can render the First Meaningful Paint. This is normally caused by a large number of requests and by the parsing of a lot of client-side JavaScript code.

[…] Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. […]
— Donald Knuth on "Structured Programming with go to Statements", page 268.

Writing Front-End code to optimize HTML byte size is probably not worth the effort.

In most cases, reducing the number of requests and the amount of JavaScript the browser has to parse is the better way to improve performance.

Also, there's GZip.

The GZip format uses the DEFLATE algorithm, which combines LZ77 with Huffman coding. LZ77 replaces repeated substrings with short back-references, and Huffman coding gives the most frequent symbols the shortest codes, so both work better when there’s duplication in the string. That means GZip is more likely to yield a better result for duplicated HTML tags than for a custom domain-specific data structure written in JSON.

If the "deflate" option is applied, using this site, to lines 12–20 of Example 1, the output is 112 bytes out of 150, which is a reduction of 25.3%. There are too many tokens and characters that don't repeat, so GZip can't compress the data efficiently.

If the "deflate" option is applied, using this site, to Example 2, the output is 118 bytes out of 766, which is a reduction of roughly 84.6%. The compression is far more efficient because there are a lot of characters that do repeat.
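A similar measurement can be reproduced locally. The sketch below assumes a Node.js environment and uses its built-in `zlib` module; the list contents are made up, so the byte counts will not match the figures quoted above.

```javascript
const { deflateRawSync } = require("node:zlib");

// Build an unordered list with 21 hypothetical book titles.
const listItems = Array.from(
  { length: 21 },
  (_, index) => `  <li>Book ${index + 1}</li>`
);
const html = `<ul>\n${listItems.join("\n")}\n</ul>`;

// Compare the raw size with the size after raw DEFLATE compression.
const originalBytes = Buffer.byteLength(html, "utf8");
const compressedBytes = deflateRawSync(html).length;

console.log(`${originalBytes} bytes before, ${compressedBytes} bytes after DEFLATE`);
```

Swapping `html` for a JSON payload of your own gives a like-for-like comparison on real data.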

The code used to calculate the percentage of byte size reduction.

Due to how GZip works, it will yield a better compression result for HTML than for a custom domain-specific data structure written in JSON.
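The percentages can be derived from the byte counts with a small helper like the one below. The function name is hypothetical, and rounding to one decimal place is an assumption; truncating instead would shift the last digit of the second figure.

```javascript
// Percentage of bytes saved, rounded to one decimal place.
function reductionPercent(originalBytes, compressedBytes) {
  const savedRatio = (originalBytes - compressedBytes) / originalBytes;
  return Math.round(savedRatio * 1000) / 10;
}

// Byte counts quoted in the text for the JSON and HTML examples.
const jsonReduction = reductionPercent(150, 112);
const htmlReduction = reductionPercent(766, 118);

console.log(jsonReduction, htmlReduction);
```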

HTML is streamable.

According to the specification, HTML parsing works like a state machine. The parser consumes the markup one character at a time, from left to right, and each character it receives can move it to a different state.

Let’s say the parser is processing a button tag. The first step is to process the Less-than sign (<). After that, it changes its internal state to process the next characters b, u, t, t, o, n, which represent the tag name, until it finds a Greater-than sign (>). Then the parser changes its internal state to process the text "Do something". Finally, it finds another Less-than sign (<), a slash, the tag name b, u, t, t, o, n, and a final Greater-than sign that marks the end of the tag.

The code for an HTML button tag containing the internal text as "Do something".

Even before the closing tag arrives, the browser already knows enough to render the button, even if that means rendering a basic UI.
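The state-machine idea can be illustrated with a toy tokenizer. It is a heavily simplified sketch, nowhere near the tokenizer the specification defines: it ignores attributes, comments, and entities, and the state names are assumptions. It does show that tokens become available while the rest of the markup is still in transit.

```javascript
// A toy tokenizer that emits tokens as soon as they are complete.
function createTokenizer(onToken) {
  let state = "data";
  let buffer = "";
  let isEndTag = false;

  const flushText = () => {
    if (buffer) onToken({ type: "text", value: buffer });
    buffer = "";
  };

  return function feed(chunk) {
    for (const char of chunk) {
      if (state === "data") {
        if (char === "<") {
          flushText();
          state = "tagOpen";
        } else {
          buffer += char;
        }
      } else if (state === "tagOpen") {
        if (char === "/") {
          isEndTag = true;
        } else {
          buffer = char;
        }
        state = "tagName";
      } else if (state === "tagName") {
        if (char === ">") {
          onToken({ type: isEndTag ? "endTag" : "startTag", name: buffer });
          buffer = "";
          isEndTag = false;
          state = "data";
        } else {
          buffer += char;
        }
      }
    }
  };
}

const tokens = [];
const feed = createTokenizer((token) => tokens.push(token));

// Feed the markup in two network-like chunks.
feed("<button>Do some");
const tokensAfterFirstChunk = tokens.length; // the start tag is already known here
feed("thing</button>");

console.log(tokensAfterFirstChunk, tokens);
```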

JSON, by default, is not streamable. If browsers supported JSON as a rendering mechanism, they would have to receive and parse the whole string before rendering anything on the page:

The code for a JSON string with the structure containing the property "buttons" that contain an array of objects. The array has a single object with the property "text" and the value as "Do something".
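A quick way to see this is to hand `JSON.parse` an incomplete document. The sketch below uses the structure described above; the `tryParse` helper is a hypothetical name introduced for illustration.

```javascript
const fullJson = '{"buttons":[{"text":"Do something"}]}';

// Returns the parsed value, or null when the text is not a complete JSON document.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (error) {
    if (error instanceof SyntaxError) return null;
    throw error;
  }
}

// Simulate a response that has only partially arrived.
const partial = tryParse(fullJson.slice(0, 20));
const complete = tryParse(fullJson);

console.log(partial, complete.buttons[0].text);
```

Nothing usable comes out of the partial text, so rendering has to wait for the last byte.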

There are a lot of hacks to make JSON streamable. However, you can get it for free using HTML.

HTML is streamable by default. JSON is not.

It’s very common to use client-side JavaScript to make AJAX calls to endpoints that return JSON. The Front-End code then parses the JSON and re-creates the HTML in the browser.

In that case, if the server adds new properties to the response, it can break the website for browsers that still have the old response cached.

When the browser does a GET request to the server, by default it caches the response based on the URL and query string. Subsequent requests that use the same URL and query string might cause the browser to reuse the cached response instead of downloading the new content from the server.

There are also many misconfigured corporate proxies and open Wi-Fi networks outside your control. They can cache the Content-Types text/html and application/json differently.

Also, browsers tend to invalidate the cache of text/html pages more often than the cache of resources fetched through client-side JavaScript code. In some cases, a normal refresh is enough to update the HTML of the page, with no need for a Hard Refresh to clear the cache.

Just because everything works in your local environment, that doesn’t mean it will work when somebody else tries to access your website in production.

The code for the Example 1 with the server returning a new property called “price” added to the Object Literals representing the books from the “cart” response. The old response was cached without the “price” property. Now the UI may render parts of the HTML with the visual price broken until the cache is refreshed for every client. Instead, it should render the price for the book or at least the previous version of the page in one piece.
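The failure mode can be sketched like this. The rendering function, the book title, and the price are all hypothetical; the point is only that new Front-End code meets an old cached payload.

```javascript
// Front-End code deployed together with the new "price" property.
function renderCartItem(item) {
  return `<li>${item.name}: $${item.price}</li>`;
}

// What the server returns now.
const freshResponse = { itemsOnCart: [{ name: "Refactoring", price: 40 }] };
// What a browser or proxy may still hold from before "price" existed.
const cachedResponse = { itemsOnCart: [{ name: "Refactoring" }] };

const freshHtml = freshResponse.itemsOnCart.map(renderCartItem).join("");
const staleHtml = cachedResponse.itemsOnCart.map(renderCartItem).join("");

console.log(freshHtml);
console.log(staleHtml); // the price renders as the text "undefined"
```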

At this point, you might start to hear excuses like "it works on my machine" or "just update the cache".

How can you avoid these problems?

A very common workaround for this issue is to append a unique identifier to the URL, such as the latest commit hash or a timestamp. Because the URL changes with every release, the browser will never reuse a stale cached response for it.

This is commonly known as "Cache Busting".

An example showing the URLs for two GET requests. The first one contains the “/cart” path with the query string named “cache bust” and a random number as the value. The second one has the first part of the path containing a random mix of letters and numbers and the "/cart" in the end.
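A query-string variant of this can be sketched with the standard `URL` API. The parameter name `cachebust`, the host, and the version value are assumptions for illustration.

```javascript
// Build a cache-busted URL by attaching a version identifier as a query parameter.
function cacheBustedUrl(path, base, version) {
  const url = new URL(path, base);
  url.searchParams.set("cachebust", version); // hypothetical parameter name
  return url.href;
}

// The version could be a commit hash or a build timestamp.
const bustedUrl = cacheBustedUrl("/cart", "https://example.com", "a1b2c3d");

console.log(bustedUrl);
```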

However, the best way to handle caching is to use Entity Tags (the ETag header) together with HTTP response caching headers such as Cache-Control. This way, the server can give the browser hints about which URLs should be cached and which ones should not.

Even so, that doesn't help with intermediary proxies that ignore those hints. You still want the application to remain in one piece even in unexpected environments.
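The revalidation flow can be sketched as a pure function, assuming a Node.js environment. The helper names are hypothetical, and the comparison is simplified: a real server would also handle multiple values and weak validators in `If-None-Match`.

```javascript
const { createHash } = require("node:crypto");

// Derive a validator from the response body (truncated hash, quoted as ETags are).
function computeETag(body) {
  const hash = createHash("sha256").update(body).digest("hex");
  return `"${hash.slice(0, 16)}"`;
}

// Decide between a full response and "304 Not Modified".
// requestHeaders uses lower-cased names, as Node's http module provides them.
function respond(requestHeaders, body) {
  const etag = computeETag(body);
  // "no-cache" allows storing the response but requires revalidation before reuse.
  const headers = { ETag: etag, "Cache-Control": "no-cache" };
  if (requestHeaders["if-none-match"] === etag) {
    return { status: 304, headers, body: "" };
  }
  return { status: 200, headers, body };
}

const page = "<ul><li>Refactoring</li></ul>";
const first = respond({}, page);
const revalidated = respond({ "if-none-match": first.headers.ETag }, page);
const changed = respond({ "if-none-match": first.headers.ETag }, page + "<!-- v2 -->");

console.log(first.status, revalidated.status, changed.status);
```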

What's the alternative?

If you return HTML from the server and a browser receives the cached response, it will just render an old but working state of the website. When the cached response is refreshed in the client, the website will be rendered with the new content.

There will never be a broken state.

If you return HTML and the browser has it cached, it will always render a User Interface that works.

This is extremely powerful.

If you don't write code to re-create the HTML in the browser, you will never have to spend time fixing broken parts of the website. The browser will always render something usable, whether or not it comes from the cache.

Besides, if you're developing the website incrementally, you can deploy it with the browser's default caching behavior in the first "step" of delivery. Later, after the MVP, you can decide to improve the caching behavior using some of the techniques described above.

If you do this, you'll never have to deal with angry customers that can't use the website anymore just because you've added a new property to your server-side JSON.

If you use HTML efficiently, there will be less friction, more things done and better perceived performance.

HTML can be bigger than JSON. That has no meaningful impact when your bottleneck is probably somewhere else.

HTML is streamable, so the browser can start rendering the website without having to wait for the download to finish.

If the server returns HTML instead of JSON, the browser will never render a broken page when the contents are cached. It will always render an older but working version of it.

Performance is what the user perceives, not the technical details of what you believe it is.

What the user perceives is a page that is fast and works.


Don't waste your time bikeshedding over aspects of performance that make no difference. Look at web performance and maintainability problems for what they really are.

Understand the tradeoffs.

Don't try to reinvent the wheel and prematurely optimize things you will have to go back and fix later.

You might end up creating a lot of Technical Debt and Accidental Complexity on the way.

These are decisions that can have huge consequences.

See also How To Use Technical Debt In Your Favor

Thanks for reading. If you have some feedback, reach out to me on Twitter, Facebook or GitHub.