If you take a look around right now, you’ll find yourself surrounded by some very complex objects. Smartphones, computers, printers, cars, televisions, toasters — the list goes on. But no matter how complex these devices are, you can use them to do what you need, even though you’d have a hard time building any one of them on your own.
This small miracle is thanks to a principle we call abstraction. Abstraction is the design philosophy that replaces a tangled mess of complex details with a tidy interface you can use to GSD (Get Stuff Done). Abstraction is also at work deep in the heart of every software program, straining to hold back the floodwaters of confusion. And every so often, those abstractions start to leak.
In this article, you’ll find out what it means to have a leaky abstraction, why leaks happen, and whether you, a serious coder, should worry about them.
Let’s introduce abstractions (again)
Don’t you hate when people explain object-oriented programming with cars? Me too. But it turns out that the humble automobile is actually a pretty good metaphor when you need to talk about abstraction.
Here’s how it works. The average car has a simple driving interface made up of something like this: a start button, a steering wheel to set direction, and two pedals to control speed. You use the start button (or key) to get started, even though you know nothing about the functioning of the battery starter and the gas combustion engine. (Programmer version: you call the Car.Start() method.) You use the steering wheel to point yourself in different directions, even though you don’t understand how the steering column is powered or how the wheels are connected. And so on.
The same thing happens with computers. You can visit a website by typing a web address in a box. You don’t need to understand how DNS lookups are performed to find the right website, or how your device performs a TCP handshake with the far-away web server so they can talk. Which is good because those processes are quite detailed.
Over time, we create new abstractions and we add to existing ones. When people first designed cars, the interface didn’t include a speedometer (programmer talk: there was no Car.CurrentSpeed property). But it turns out that it’s very difficult to get car objects to obey the static Road.MaxSpeed property without one. So they added it in the next version, and life has never been the same since.
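To make the programmer version concrete, here is a minimal Python sketch of that driving interface. The names (Car, Road, start, accelerate, current_speed, MAX_SPEED) are hypothetical, adapted from the Car.Start(), Car.CurrentSpeed, and Road.MaxSpeed labels above, and the speed values are made up for illustration.

```python
class Road:
    # Class-level ("static") speed limit in km/h. The value is illustrative.
    MAX_SPEED = 100


class Car:
    """A hypothetical driver-facing interface; the internals stay private."""

    def __init__(self):
        self._engine_running = False  # hidden detail
        self._speed = 0.0             # hidden detail, in km/h

    def start(self):
        # The driver never sees the battery, starter, or ignition steps.
        self._crank_starter()
        self._engine_running = True

    def accelerate(self, delta):
        # Positive delta speeds up, negative delta slows down.
        if not self._engine_running:
            raise RuntimeError("start the car first")
        self._speed = max(0.0, self._speed + delta)

    @property
    def current_speed(self):
        # The "speedometer": a read-only view of internal state.
        return self._speed

    def _crank_starter(self):
        pass  # stand-in for the real machinery


car = Car()
car.start()
car.accelerate(120)

# Without current_speed, the driver could not check against the limit.
if car.current_speed > Road.MAX_SPEED:
    car.accelerate(Road.MAX_SPEED - car.current_speed)

print(car.current_speed)  # prints 100.0
```

The property has no setter, so callers can read the speed but can only change it through the pedals, which is roughly the balance of control and complexity a speedometer strikes.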
To design something — anything — is to think about creating the right abstraction. A good abstraction is one that exposes all the important details and nothing unnecessary. One that balances control and complexity. One that provides methods and properties that map easily to the tasks you want to perform. If an abstraction is designed well, using it feels straightforward and logical.
With software programming, this process becomes more obvious. If you write code that does something, you need to think about how you hide its inner workings from other bits of code, and how you expose the right functionality. Testing, naming, and design patterns are all part of the gargantuan effort to make good decisions when building abstractions, and keep their effects under control.
Abstractions are why you can write <button> in an HTML document instead of needing to paint individual pixels. It’s why you can write an SQL query to fetch a customer’s order history without knowing the size of each record or where it’s stored. It’s why you can print a file without understanding printer language, play a video file without understanding video codecs, read a text file without manually jumping from one cluster to another on a hard drive, and store collections of data without managing memory addresses (unless you really want to).
It would be an exaggeration to say that everything good about programming comes from abstraction… but only a small one.
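The SQL case is easy to try for yourself. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the orders table, its columns, and the order_history helper are all invented for illustration. Notice what the code never mentions: record sizes, file offsets, or how the rows are laid out in storage.

```python
import sqlite3

# An in-memory database with a made-up schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, item TEXT);
    INSERT INTO orders (customer, item) VALUES
        ('ada', 'keyboard'), ('bob', 'monitor'), ('ada', 'mouse');
""")


def order_history(connection, customer):
    """Return the items a customer ordered, oldest first."""
    rows = connection.execute(
        "SELECT item FROM orders WHERE customer = ? ORDER BY id",
        (customer,),
    )
    return [item for (item,) in rows]


print(order_history(conn, "ada"))  # prints ['keyboard', 'mouse']
```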
How abstractions leak
Here’s a central truth: All abstractions leak.
This is sometimes called the Law of Leaky Abstractions, a name coined by Stack Overflow co-creator Joel Spolsky. But why does this rule exist? Why can’t we build the perfect abstraction?
The problem is that the value of an abstraction is in the details that it hides. Every good abstraction wants to simplify life, and needs to stuff some details out of sight (and out of the reach of your code). Eventually there are edge cases, special scenarios, new features, or unusual problems that can only be addressed using those hidden details. This moment — when some of the details spill out of the abstraction, and you need to mess with the inner workings — is the leak.
Think back to our car example. You can build a nice interface with a changeOil() method. But then the radiator clogs, the engine overheats, and it’s time to wave away the smoke and dig into the car’s inner workings. The car-user interface isn’t adequate for this job. Instead, you need a mechanic who understands the very details this abstraction is hiding.
As with programming, this abstraction failure isn’t a deal-breaker. Most of the time, the car-user interface is perfectly useful and safe. Adding engine-altering methods wouldn’t be a great idea, because it would complicate the operation of the car and give car users more opportunity to accidentally break things. But if you had a car that needed engine repair on a daily basis, you’d probably change your mind and decide to use a different level of abstraction.
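Here is one way the clogged radiator might look in code. This is a hedged Python sketch, not a real design: Engine, EngineOverheated, drive, and change_oil are hypothetical names, and the fault is simulated by flipping a flag.

```python
class EngineOverheated(Exception):
    """A hidden detail escaping through the driver-facing interface."""


class Engine:
    def __init__(self):
        self.radiator_clogged = False

    def run(self):
        if self.radiator_clogged:
            raise EngineOverheated("coolant is not circulating")


class Car:
    def __init__(self):
        self._engine = Engine()  # private by convention

    def drive(self):
        self._engine.run()
        return "cruising"

    def change_oil(self):
        return "oil changed"


car = Car()
car._engine.radiator_clogged = True  # simulate the fault

try:
    car.drive()
except EngineOverheated:
    # The leak: the driver now has to know an engine exists at all.
    car.change_oil()  # the public interface offers nothing that helps
    # The mechanic reaches past the abstraction to fix the hidden part.
    car._engine.radiator_clogged = False

print(car.drive())  # prints cruising
```

The exception type alone is a leak: code that only ever wanted to call drive() now has to name a component it was never supposed to see.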
If this all sounds too much like the Zen of Car Repair and not so much like programming, consider some of these leaky abstractions in software development:
- Class variables. You assign them like normal values, but try to compare or copy them, and all of a sudden you’re comparing or copying the memory reference, not the object itself. Next thing you know, you need to know about the stack and the heap, the very details you were avoiding.
- Floating-point numbers. Ever run console.log(0.1+0.34)? That’s one leaky floating-point abstraction. To fix it, you need to learn about floating-point arithmetic, consider rounding, or maybe use a
- Object-relational mapping in a database. Use an ORM framework, and you can deal with objects instead of tables, queries, and stored procedures (which are themselves already an abstraction over a data store). But databases are used heavily, performance becomes important, and before long you need fine-grained control that forces you to reach around the ORM abstraction.
- ASP.NET Web Forms. It’s just one example of a server-side framework that tries to make you forget that your code is actually running on another computer (the web server). The problem is that most web apps soon require careful considerations about responsiveness, latency, and page size. When these hit, you’ll need to learn all the gory details about the implementation of your web framework’s execution model, and the difference between client-side and server-side code.
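The first two leaks in this list, object references and floating-point numbers, are easy to reproduce. The sketch below uses Python rather than the JavaScript of console.log; Python floats are typically the same 64-bit IEEE 754 doubles as JavaScript numbers, so the arithmetic should behave the same way. The Point class is a hypothetical stand-in, and the decimal module is just one possible workaround.

```python
import copy
import math
from decimal import Decimal


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Leak 1: variables hold references, not the objects themselves.
a = Point(1, 2)
b = Point(1, 2)
alias = a             # copies the reference, not the object
clone = copy.copy(a)  # shallow copy: a new object with the same field values

print(a == b)         # prints False: default equality compares identity
alias.x = 99
print(a.x, clone.x)   # prints 99 1: the alias changed a, the clone did not

# Leak 2: binary floating point cannot represent most decimal fractions.
total = 0.1 + 0.34
print(total)                       # may not print exactly 0.44
print(math.isclose(total, 0.44))   # compare with a tolerance instead
print(Decimal("0.1") + Decimal("0.34") == Decimal("0.44"))  # exact decimals
```

In both cases the fix means learning the detail the abstraction was hiding: identity versus value for objects, and binary versus decimal representation for numbers.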
Programmers usually start complaining about leaky abstractions when they leak so often or so dramatically that you really need to understand all the details that are supposedly hidden. In this case, the abstraction is a so-called convenience that isn’t really saving any effort. The real art of programming is recognizing abstractions, navigating their leaks, and knowing when and how to patch the gaps.
So don’t fear the leaky abstraction. Just remember that every abstraction has its place, some are more useful for certain problems than others, and battle-tested programmers gradually develop an instinct that tells them whether an abstraction is likely to cause more problems than it solves. But that’s a conversation for another day.