Imagine you’ve got a craving for sweet rolls, because that’s how you (cinnamon) roll. Your local bakery, where you satisfy your daily pastry cravings, is a little different. Here, you have to give the baker a description of the pastry you want so they know what to bake. Any ambiguous detail is left to the baker’s whims.
“I want one sweet roll, heavy on the cinnamon, a delicious white glazing on top and no raisins. Because screw raisins”.
The baker receives your order and some time later your sweet, sweet cinnamon roll is done. However, it’s got purple glaze around the rim, too much cinnamon and bits of raisin in it. God dammit.
“I told you no raisins!”, you tell the baker. He responds with “What is raisins?”.
Get where I’m headed with this? The baker interprets your order the way he wants and makes the pastry while you wait for the result at the other end. You leave with a sweet cinnamon roll and mixed feelings, because f*cking raisins and a clueless baker.
What if we could take matters into our own hands, instead of that baker making up his own fantasy of what our pastry should look like?
Keep this in mind, as we’ll get back to this (weird) bakery later.
Interpretation. A basic human skill required when you need to do something others tell you, either verbally or in writing. For as long as the modern, commonly known Internet has existed, there’s been a central commission that, in agreement with industry members, tells us how all things web should work. One of these things is of course the HTML standard.
Now, while the W3C isn’t completely without faults or controversies (looking at you, EME), for the most part its specifications are sane, approachable recipes for how an element or a feature should work. Well, in theory at least. As we all know, and quickly see, they’re more of a moral guideline than anything else…
Styling the web isn’t easy, never has been and probably never will be. What it can be better at, though, is meeting our expectations when we’ve done what should, per the standards, be correct.
“I’ve set the appropriate box sizing and I hereby expect my containers to apply their paddings and borders inside that space. All things appear to look like they should. Wait what, you need support for IE7?”
You don’t have to travel all the way back to 2006 either. We’ve all been there: layouts living their own lives, text appearing off-screen when it shouldn’t or… Internet Explorer. Through vendor prefixes, hacks and exploits that target browser bugs, you hope to make browsers render the same. Often they don’t.
Sometimes you’ll just have to accept that an element will look like “Comic Sans” in Firefox on Ubuntu, because it’s too much of a hassle to figure out how to make it behave otherwise.
The web hosts a plethora of use cases and cross-browser challenges that come with them. Frameworks like Bootstrap, for instance, don’t mitigate them, but avoid them instead. The more advanced your issue is, the murkier it gets.
Browsers all implement the W3C HTML and CSS standards differently, for reasons unknown. Some even adjust, omit or modify parts in ways that only apply to their own browser. When standards are updated, browsers have to play catch-up in order to support them. More often than not, however, developers want to extend these styles with additional properties, which is a very, very long procedure. A property has to be proposed, written, implemented and adopted, and it’s not uncommon for this to take years.
Keeping up with all of this has caused severe issues when trying to make web pages look and behave the same across user agents. Interactive elements, for instance, are tied to how the user’s OS renders them. It’s not uncommon to conditionally load different stylesheets depending on the browser used.
Outside of browsers doing their own thing when they shouldn’t, you’re always on the receiving end when styling, just like in that bakery we initially visited. As the following illustration shows, the DOM and CSSOM (which we’ll talk about soon) are the only parts of the rendering pipeline we’ve had access to.
Browsers hand your styles back to you as plain strings that follow a fixed structure: a property, then a colon, then a value and so on. Working with them means matching and splitting those strings yourself, a very rigid approach that gets complicated very quickly and is hard to debug.
As we’ll quickly see, traversing the realm of CSSOM is hell on earth — in case you don’t already know that. It’s underspecified, inconsistent and lacking features you’d really want.
Style properties set this way are only applied inline, and only the properties you’ve set inline can be read back. If the body has no other styles set through the CSSOM, reading something like body.style.borderColor gives you an empty string, even if a stylesheet has applied that style to the body.
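Here is a minimal sketch of that difference, assuming a page where a stylesheet (not an inline style) gives the body a border color. The helper name readBorderColor is made up for illustration; element.style and getComputedStyle are the standard DOM calls.

```javascript
// Hypothetical helper: compare what the inline style object reports with
// what the browser actually computed for the element.
function readBorderColor(element, view) {
  return {
    // element.style only reflects inline declarations, so this is an
    // empty string unless border-color was set inline.
    inline: element.style.borderColor,
    // getComputedStyle includes values coming from stylesheets, but it
    // still hands them back as plain strings you have to parse yourself.
    computed: view.getComputedStyle(element).borderColor,
  };
}

// Only touch the DOM when one is available (i.e. in a browser).
if (typeof document !== 'undefined' && document.body) {
  console.log(readBorderColor(document.body, window));
}
```

Either way you end up with strings, which is exactly the limitation Typed OM tries to remove.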
Figuring out which properties are understood by which browser, and how, has been and still is an error-prone development cycle that has cost brands, companies and solo developers countless hours better spent elsewhere. Outside of telling people not to use regex for parsing HTML, it’s one of the most commonly asked types of question on Stack Overflow.
CSS needs a serious feature update that explains the “magic” of styling and layout on the web and makes buggy support of CSS features go away forever, regardless of browser. It should also allow for much faster updates to the language.
This is where Houdini comes in.
Houdini, as the name suggests, tries to break away from old conventions in an attempt to iron out differences and unify how elements on the web render.
Conceptualized by the Houdini Task Force, Houdini’s purpose is to jointly develop and unravel the magic of styling and layout on the web. This means access to parts of the browser’s internal workings (rendering pipeline) previously not accessible, through various APIs. Instead of being at the end of the chain, you’re in it. This allows you to control style and layout before the page is even shown.
So with a little help from you, the developer, the browser can now serve up exactly what you want, and it’ll display the same in every browser implementing that API. No longer does the browser have to rely on its own interpretation of what you’re trying to accomplish. You are the wizard now! Or baker…
Or put simply:
You’re directly telling the browser how to get its act together.
A word of caution first:
Houdini is in active development with many features swirling in the void. Some are only accessible behind special flags in the browser, while others are not available at all. We will cover some of the most popular ones, the ones you can try today.
Wait, what? Modify a browser’s rendering engine?
Imagine using any CSS property and knowing that it would work exactly the same in every browser and as performant as if it was native.
While this may sound like witchcraft, it really isn’t. It allows us to normalize cross-browser differences, as features developed this way will look exactly the same no matter the browser (for those browsers that implement it). Remember, this is different because it’s embracing an ideology instead of implementing specific features. More importantly, it allows us to invent features that can be used today, which in turn moves the web forward.
Houdini is backed by some of the most prominent software companies on the planet like Mozilla, Microsoft, Google, Apple and Opera, which should give an indication that they finally realized something needs to be done.
If you think back to the original illustration of the browser rendering pipeline, Houdini unlocks several more of these blocks.
Access to these is through various APIs that expose layers (or portions) of the browser’s rendering engine previously inaccessible to developers. This low-level access is meant to provide a considerable performance gain and to eliminate inconsistencies.
You’ve probably already used this
Perhaps surprisingly, you’ve already used something very similar to what Houdini aims to achieve. Libraries like jQuery or the data visualization library D3 already do this for you, and through one identifiable (and readable) API.
The inner workings are accessed through various APIs. These are:
- The Layout API which controls how elements are laid out with CSS.
- The Paint API which controls drawing images wherever CSS allows it.
- The Parser API which gives access to the engine’s parser and methods.
- The Properties and Values API which controls how custom CSS properties and values are understood.
- The Animations API which is an extension of the current animation API and allows for using graphics acceleration for smooth framerates. It allows for linking animations to states (like scroll) instead of a fixed timeline.
- TypedOM which organizes and structures CSS values as objects instead of leaving you to parse simple strings. Finally some order.
- and Font Metrics API which allows for some font hijinks in custom or typographically intensive layouts.
Most of these have not been implemented in any browser yet, but some have. We’ll cover the Paint API and TypedOM in this article, along with something called worklets. You can see for yourself who has implemented what over at https://ishoudinireadyyet.com.
Similar to web workers, worklets run on separate threads, independently of the main thread, and are designed to be parallelized. Because of this, worklets must exist in at least two instances. Let’s explain what that actually means.
When worklets are executed from an HTML page they actually run in the background, independently of UI scripts and the main thread. This allows us to utilize multi-core CPUs for efficient performance. The rendering engine calls upon them when needed; we do not need to invoke them directly.
From the illustration above:
Once fetched, a worklet is spawned into at least two of the available worklet processes. The rendering engine can then call upon any of the worklets in any of their parallelized (spawned) processes.
The Paint API
Let’s look at the Paint API. As the list stated earlier, it lets you create an image wherever a CSS property expects an image. Be your very own Bob Ross or da Vinci.
So instead of background: url('myimage.jpg'); you can now use paint(myPaintWorklet) to reference a paint worklet.
At the time of writing you can use it without any special tinkering in Chrome 65, Opera 52 and Samsung Internet 9.2, with other browsers soon following suit. This is probably the one that’s easiest to dive into, as we’ll see in a bit.
Let’s have a go at making a tooltip, or rather, look at an existing example.
You can try the interactive demos for yourself here:
The workflow goes something like this. Add markup like you normally would, then load the paint worklet you’re attaching. This is done from a script tag.
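As a rough sketch, the script that loads a paint worklet could look like the following. The helper name loadPaintWorklet and the file name tooltip-worklet.js are placeholders for illustration; CSS.paintWorklet.addModule is the call the Paint API provides for this.

```javascript
// Hypothetical loader: registers a paint worklet file if the browser
// supports the Paint API, and reports whether it did.
async function loadPaintWorklet(url, css = globalThis.CSS) {
  // Feature-detect: CSS.paintWorklet only exists where the Paint API ships.
  if (!css || !css.paintWorklet) {
    return false;
  }
  // addModule fetches the script and runs it in the worklet scope,
  // away from the main thread. It returns a promise.
  await css.paintWorklet.addModule(url);
  return true;
}

// 'tooltip-worklet.js' is a placeholder file name.
loadPaintWorklet('tooltip-worklet.js')
  .then((loaded) => {
    console.log(loaded ? 'Paint worklet loaded' : 'Paint API not supported here');
  })
  .catch((error) => {
    console.log('Could not load the paint worklet:', error.message);
  });
```

The feature check means browsers without the Paint API simply skip the worklet, so your regular CSS can act as the fallback.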
While you do not need to completely grasp everything that’s going on here, pay attention to the strings starting with “--”. These are CSS variables (custom properties), which the worklet can read directly from the CSS.
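To give an idea of what such a worklet file contains, here is a simplified sketch, not the demo’s actual code. The class name TooltipPainter, the registered name tooltip and the --arrow-size property are assumptions made for illustration; --background-color is the property mentioned in this article.

```javascript
// Sketch of a paint worklet file. In CSS it would be used roughly as:
//   background-image: paint(tooltip); --background-color: teal;
class TooltipPainter {
  // The custom properties this painter wants to read. The browser
  // repaints when one of them changes.
  static get inputProperties() {
    return ['--background-color', '--arrow-size'];
  }

  // ctx is a canvas-like 2D context, size is the box being painted and
  // properties is a read-only map of the requested style values.
  paint(ctx, size, properties) {
    const read = (name) => {
      const value = properties.get(name);
      return value ? value.toString().trim() : '';
    };
    const color = read('--background-color') || 'black';
    const arrow = parseFloat(read('--arrow-size')) || 10;
    const bodyHeight = size.height - arrow;

    // Tooltip body.
    ctx.fillStyle = color;
    ctx.fillRect(0, 0, size.width, bodyHeight);

    // Small arrow centered under the body.
    ctx.beginPath();
    ctx.moveTo(size.width / 2 - arrow, bodyHeight);
    ctx.lineTo(size.width / 2 + arrow, bodyHeight);
    ctx.lineTo(size.width / 2, size.height);
    ctx.closePath();
    ctx.fill();
  }
}

// registerPaint only exists inside the paint worklet scope.
if (typeof registerPaint === 'function') {
  registerPaint('tooltip', TooltipPainter);
}
```

The painter never touches the DOM. Everything it knows comes from the size of the box and the custom properties it asked for.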
Finally, add some styles as a way of modifying the tooltip. The CSS properties prefixed with --, like --background-color are read by the worklet.
Here’s how the tooltip demo looks. Note that we’re just adjusting the CSS variables. The paint worklet does the rest.
There are a lot more examples to satisfy your curiosity over at https://css-houdini.rocks
Have a look at this example that modifies borders for a slight “children’s book” look. Imagine trying to do this as smoothly with today’s CSS while ensuring it looks just the right amount of cartoony in all browsers.
Try the demo yourself over at https://css-houdini.rocks/rough-boxes.
(Use one of the browsers that support the Paint API, like Chrome for instance).
Paint worklets have limited scope and functionality. They can’t access the DOM and many global functions (like setInterval) are not available. This helps keep them efficient and potentially multi-threadable.
Another thing Houdini aims to do is update the CSS property landscape. Instead of parsing strings, like we saw earlier with the CSSOM, we use a Typed Object Model, which represents CSS values as objects.
The new specification, CSS Typed OM, allows for a new way to interact with underlying values by representing them as specialized JS objects that are easier to read and manipulate than strings you have to parse yourself. The end result brings sanity to your CSS work cycle and yields better performance.
Combined with the CSS Properties and Values API, which lets you register a custom property with a syntax and an initial value, reading a value through Typed OM gives you a typed object such as a CSSUnitValue, with a default to fall back on if the value is missing.
While this is a bit more verbose, it adds error handling and is more efficient than today’s parsing.
These objects all derive from CSSStyleValue, the base class of all CSS values accessible via the Typed OM API.
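A small sketch of what working with such objects can look like, assuming a browser that ships Typed OM. The helper name widen is made up for illustration; attributeStyleMap and CSS.px are part of the Typed OM API.

```javascript
// Hypothetical helper: add `extra` pixels to an element's inline width
// using typed values instead of string concatenation.
function widen(element, extra, css = globalThis.CSS) {
  // get() returns a typed value (here a CSSUnitValue with .value and
  // .unit), or nothing if no inline width has been set yet.
  const current = element.attributeStyleMap.get('width');
  const base = current && current.unit === 'px' ? current.value : 0;
  // CSS.px() builds a CSSUnitValue, so there is no '…px' string to assemble.
  element.attributeStyleMap.set('width', css.px(base + extra));
  return element.attributeStyleMap.get('width');
}

// Only run against a real element where Typed OM is available.
if (typeof document !== 'undefined' && globalThis.CSS && typeof CSS.px === 'function') {
  const box = document.createElement('div');
  if (box.attributeStyleMap) {
    const width = widen(box, 20);
    console.log(width.value, width.unit);
  }
}
```

Compare this with the string-based route, where you would read "20px", strip the unit, do the math and glue the unit back on.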
Are we there yet?
The current status of the Houdini project can be seen on https://ishoudinireadyyet.com. Spoiler alert: It’s not ready yet.
Some features are in planning, some are already shipped and some may not be implemented (or are not yet considered). The upside, though, is that some of the most prominent and well-known vendors are at the forefront, leading the way to an easier developer experience with more on the way.
For now, the API that most vendors have implemented is the Paint API. At least you can be Bob Ross while you wait for the rest.
The Future of layout and styling on the web
So, if you’re tired of shouting pastry orders over the counter, Houdini lets you jump over it and take spoons in your own ladles. Finally, no more raisins. Ever.
It’s certainly a big leap from how we’ve styled sites before, but it’s an important check on the sanity list and what everyone’s been yearning for: keeping projects maintainable, predictable and, not least, fun.
During this year’s JSConf EU 2019, the 10th and last I’ll add, Una Kravets helped introduce more developers to the concepts of Houdini. If you’re still on the fence or want to listen to someone talk, have a gander here: