Abstraction Is Not Simplification

Jon Rimmer
7 min read · Mar 11, 2018


There was a question on Hacker News today about whether productivity was greater in the past, and whether tools that are supposed to save us “time and headache [sic]” have only made things harder.

This got me thinking about those tools. Or, rather, abstractions. Because I consider a tool to be a form of abstraction, usually over a process, protocol or format. If it's true that, after adopting abstractions, our work is still as complicated as, or more complicated than, it was before, then have those abstractions failed?

You see some version of this idea levelled as a criticism throughout software development, at all manner of tools, libraries, frameworks, protocols and languages. It usually boils down to the following argument:

  1. System Y is an abstraction over system X
  2. Y is as complicated as, or more complicated than, X
  3. Therefore Y is bad / worthless.

I contend that all arguments of this type include an implicit assumption that the purpose of an abstraction is to reduce overall complexity. I would also contend that, at least in software, this is an incorrect assumption. The purpose of abstraction is to hide complexity, not to reduce it. Abstracting systems may introduce as much complexity as they remove, or more, and this is not a bad thing.

The idea that abstraction equals simplification is imparted to software developers early on. We are often told a story about the evolution of programming languages that goes like this: we began by directly writing machine code. This was difficult and error-prone, so we invented assembly to abstract away the complexity of working with machine code. This was still difficult and error-prone, so we invented compiled languages to abstract things further. And we continued to invent ever more high-level languages to further abstract away the messy and difficult details of what the computer is actually doing.
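
To make that story concrete, here is a deliberately trivial sketch (the TypeScript is just for familiarity; the assembly in the comments is approximate and illustrative, not what any particular compiler or JIT would actually emit):

```typescript
// A one-line, high-level operation.
function add(a: number, b: number): number {
  // Somewhere far below, a compiler or JIT eventually lowers this to
  // machine instructions, roughly along these lines (illustrative x86-64):
  //
  //   mov eax, edi   ; copy the first argument into a register
  //   add eax, esi   ; add the second argument to it
  //   ret            ; return, with the result in eax
  //
  // Registers, calling conventions and instruction encodings have not
  // disappeared; the language has merely hidden them from us.
  return a + b;
}
```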

If we accept the logic of this narrative, then we would expect programming in high-level languages to always be simpler and easier than programming in low-level ones. We would expect faster development cycles, fewer bugs and better quality software overall. And eventually we would produce languages that are so high-level, and therefore so simple, that non-programmers can use them to produce software just as easily as programmers. This has long been the promise of those selling very high-level and “domain-specific” languages and tools, with mixed results.

Many coders scoff at such sales claims, but those same coders usually still propagate the intuition that higher-level abstractions should be inherently simpler than the systems they abstract. If C or C++ are complicated, that’s OK, because they’re supposed to be complicated. If CSS or JavaScript are complicated, then that’s a defect, because they’re supposed to be easier. If a compiler is complicated, that’s OK, because compiling is supposed to be complicated. If a build system is complicated, then that’s a defect, because it’s supposed to make things simpler.

But how does complexity arise in a system? Let’s consider those two systems, X and Y, from earlier. Imagine that X is the base system. It’s as low-level as it gets. System X will have a set of operations and concepts of a particular size. Over time, as X evolves, the size of this set will likely grow. Meanwhile, the complexity of the tasks people are using it for will also grow. And the people using it will begin to notice certain patterns of usage that recur within and across tasks.

Eventually, somebody says, “Our usage of system X has gotten too complicated; I can produce a new system that abstracts and simplifies it.” They will identify a new set of concepts and operations, each of which maps on to one or more items from the underlying system. These new concepts and operations will be designed to represent those repeated patterns that were observed by heavy users of system X. The new system will be sold to users of system X on the basis that it is both more powerful (as a single one of its operations and concepts can encapsulate many operations and concepts from the lower-level system) and easier to use (because it has fewer operations and concepts to learn):

System Y abstracts the complexity of system X to a smaller set of higher-level concepts and operations.
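
As a sketch of that sales pitch (the systems and their operations here are invented for illustration), a single operation in Y bundles a pattern that users of X keep writing out by hand:

```typescript
// Hypothetical low-level system X: a few small, explicit operations.
interface SystemX {
  open(path: string): number; // returns a handle
  read(handle: number, maxBytes: number): Uint8Array;
  close(handle: number): void;
}

// Hypothetical higher-level system Y: one operation capturing the
// pattern X's users repeat everywhere (open, read it all, close),
// including the step they keep forgetting (always closing).
function readWholeFile(x: SystemX, path: string): Uint8Array {
  const handle = x.open(path);
  try {
    return x.read(handle, Number.MAX_SAFE_INTEGER);
  } finally {
    x.close(handle); // guaranteed cleanup is part of what Y is selling
  }
}
```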

However, we can see that there is no guarantee that this second selling point holds, or that, if it holds now, it will continue to do so. Depending on how system X’s concepts and operations can be combined, the set of possible combinations can be very large, probably infinite. The initially small size of Y’s own set can therefore be seen as a consequence of its immaturity, rather than anything to do with the fact that it is an abstraction.

As system Y evolves, it will likely grow in complexity, as most systems do. Its problem domain may drift, so that it no longer exactly overlaps with that of the system it abstracted. New users and use-cases will arise, and the set of operations and concepts will grow. Some of these new things will just be new combinations of X’s operations, while some may be entirely new and unique to system Y.

At some point somebody will likely decide that system Y has grown too complicated. They will identify a new set of concepts and operations and produce a new system Z that abstracts over Y, and the process will begin again.

System Y gains additional concepts and operations, until it is as complicated as the system it abstracts.
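
Continuing the invented example from before, Y’s surface area grows over time: some additions are just more bundled X patterns, while others are concepts X never had at all:

```typescript
// The hypothetical system Y, a few years and many feature requests later.
interface SystemY {
  readWholeFile(path: string): Uint8Array; // a bundled X pattern
  readLines(path: string): string[]; // another bundled X pattern
  readJson<T>(path: string): T; // bundles X operations plus parsing
  watch(path: string, onChange: () => void): void; // genuinely new:
  // X has no notion of change notification
  cache: Map<string, Uint8Array>; // genuinely new: a caching policy
}
```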

This might seem like a rather depressing trend, as systems proliferate and system upon system builds up, with ever more total complexity in the global system of systems. However, it is not necessarily so. Each new system does add more total power. And a single operation or concept in a higher-level system may make possible something that would have been unfeasible if working only with one of the lower level systems. And it is often possible to evolve systems over time to reduce their complexity while maintaining their overall power.

However, the key takeaway should be this: the complexity of any system, at any level, has nothing to do with the complexity of the systems above or below it, or with its position in the hierarchy. A system might present a simplified view of a more complicated system below it, but it might also combine the operations of the underlying system into a far larger and richer set. Moreover, this new, larger set might be exactly the right one required to solve the kinds of problems that occur at this level, and which this system is designed for.

For example, high-level programming languages include type systems for managing complex data structures. Many of the concepts and operations that make up a type system are entirely new, not merely a simplification of something that already exists in the lower-level system. And the concepts and operations in a type system can be numerous and complicated. But whether they are too complicated has nothing to do with their status as an abstraction over machine instructions. Rather, it depends on how useful they are for solving the kinds of programming problems we want to solve using that language.
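
As a quick illustration (a TypeScript discriminated union, chosen here as just one example; any language with a rich type system would do): the tags, the narrowing and the exhaustiveness check below exist only at this level of abstraction, and are erased before the code ever runs:

```typescript
// A type-system concept with no single machine-level counterpart.
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "rect"; width: number; height: number };

function area(shape: Shape): number {
  switch (shape.kind) {
    case "circle":
      return Math.PI * shape.radius ** 2;
    case "rect":
      return shape.width * shape.height;
    default: {
      // If a new Shape variant is added and not handled above, this
      // assignment fails to compile. Exhaustiveness checking is an
      // operation that machine code has no notion of.
      const unreachable: never = shape;
      return unreachable;
    }
  }
}
```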

When evaluating any system, we should guard ourselves against a number of errors and reflexive judgements:

That because a system is an abstraction, it should be simpler overall than the thing it abstracts.

As explained above, abstractions often can and should be more complicated than the thing they abstract. Instead we should ask ourselves: is the lower-level complexity it hides, and the new expressive capability it provides, worth the cost of the new concepts and operations it introduces?

That a complicated abstraction must be an unnecessarily complicated abstraction.

It might be. But it might just be that the problems the system is trying to solve are also complicated, and that the complexity it presents is the amount necessary to provide the expressive power needed to solve them.

It is very common in debates over the merits of different systems to see people criticising the complexity of systems they are only marginally familiar with, without giving due consideration to the reasons for that complexity, and to what it provides in terms of additional power. It isn’t always the case that the designers of these systems were incompetent, or that they succumbed to over-engineering. Often, the problems they were trying to solve were difficult, involving subtleties that are not immediately obvious, and we cannot fully judge the success of their solutions without ourselves understanding what those problems were.

That unlearned complexity is worse than learned complexity.

It is very easy to fall into the trap of discounting the complexity of systems we already use, because we are familiar with them, and of over-estimating the complexity of systems we are not familiar with, because they are novel. Familiarity is still an important factor to consider when evaluating the cost of adopting any new system, because learning new things requires time and effort that might be better spent elsewhere. But we should always make the effort to evaluate relative complexity in a fair way. We should try to imagine that we are coming to both systems as a new user. Starting fresh, which system would make us more productive?

None of the above is to say that we should let anybody (especially ourselves) off the hook for creating systems that are bloated, opaque, or unnecessarily complicated. Bad design and over-engineering are very real problems in software, and a willingness to call them out when we see them is part of improving the results of our field, which are not always good.

But we should also try to remain objective and humble in our judgements, and to have empathy for those working in problem domains that we don’t fully understand. The problems we are solving with computers are growing more complicated, and will continue to do so. That this will lead to greater complexity in the systems we build, and necessitate the development of ever more abstractions, some of which will be very complicated, is unavoidable. By adopting a rational and non-prejudiced attitude towards these new systems, we can hopefully evaluate them fairly, on their merits, rather than falling prey to the idea that they represent incompetent or bad-faith efforts to make our lives unnecessarily difficult.
