The art of abstracting software

Docler

Published in

Byborg Engineering

7 min read · Oct 21, 2020

by João Lopes

Introduction

In 2002, Joel Spolsky wrote an article about The Law of Leaky Abstractions, which states:

“All non-trivial abstractions, to some degree, are leaky.”

Eighteen years later, that article still reads as if it had been written yesterday. After all those years it's still common to see leaky abstractions, but Joel also gives us a reason that helps explain why this keeps happening:

“…the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning.”

Are developers learning how the abstractions work and what they are abstracting? And if that's the problem, what is the best way to learn about abstractions?

Joel's article is awesome, but its examples are a bit far from the reality of web applications (of course, it wasn't his goal to talk about leaky abstractions in web applications). So what are leaky abstractions in web applications, and how can developers learn the art of abstracting software?

Interfaces are not abstractions

The Design Patterns book states that developers should “program to an interface, not an implementation”, but what did the authors mean by interface? It's important to consider the time when the book was written and published.

The mid-1990s were basically when Java and PHP were getting started, for example. Companies were investing in desktop applications using C++, Delphi or similar. We can conclude that the word “interface” had another meaning back then.

Interfaces are not abstractions, but interfaces can be used to create good abstractions. But wait, why are interfaces not abstractions?

In one of his articles, Mark Seemann says interfaces are “just language constructs. In essence, it's just a shape”. Well, he's right. Programming languages allow developers to put lots of things into interfaces, which makes them something other than abstractions.

IDEs can help developers create interfaces, but are those abstractions? When a developer has the implementation ready and then clicks “Extract Interface”, what does the IDE generate: an abstraction or an interface (interface as just a construct)?

Before digging a little deeper into the subject, it's worth discussing web development a bit, along with the vocabulary and frameworks the community uses.

Web Development, MVC and DDD

Lots of web developers are familiar with MVC and DDD. Web development has changed a lot over the last twenty years. In the beginning, web development in PHP, Java and .NET was completely different in each. Nowadays the main web frameworks are converging (Ruby on Rails had its influence).

Taking .NET Core, Spring Boot (Java) and Laravel (PHP) as examples, it's easy to see how much they have in common. While PHP developers have Composer, .NET Core developers have NuGet and Java developers have Maven. Their tutorials teach how to create a simple feature using Controllers, Models and Views.

As the years passed, those frameworks started to ship with unit testing support out of the box. More recently, they have added dependency injection tools to the framework itself. It is plausible to conclude that the community thinks managing dependencies is important.

Beyond that, developers have started to discuss Domain-Driven Design (DDD) and how to design, implement or apply it in their current projects.

Despite all the effort and shared knowledge, it is still common to see leaky abstractions everywhere, regardless of the language or the framework.

Why, after all those years and with all the available resources, are developers still struggling with abstractions?

A few years ago, you could see some developers creating an interface for each class. As Martin Fowler has pointed out, using interfaces when you aren't going to have multiple implementations is extra effort to keep everything in sync. Be careful, because his statement is often misunderstood. It doesn't mean that you shouldn't create an abstraction for your repository just because you have only one repository implementation.

Common Leaky Abstractions on Web Applications

What are the most common mistakes when it comes to leaky abstractions on web applications?

The most common place to create abstractions is around external resources in the infrastructure layer, right? Creating abstractions for storage, databases, APIs, queues, caches and similar resources makes sense, then.

Ok, consider the following examples:
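The original snippets are not included in this text. As a stand-in, here is a hypothetical sketch in Python (using `typing.Protocol` where other languages would use an `interface` keyword) of five interfaces that match the descriptions in the list further down. Every name and signature is an assumption made for illustration, not the author's original code.

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class File:
    """Hypothetical framework upload object (path on disk plus metadata)."""
    path: str
    mime_type: str


@dataclass
class PayPalCredentials:
    """Hypothetical vendor-specific input."""
    client_id: str
    secret: str


@dataclass
class PayPalResponse:
    """Hypothetical vendor-specific output."""
    raw: dict[str, Any]


@dataclass
class User:
    id: int
    email: str


# Ex. 1: the name reveals the vendor, and the client must build a File object.
class AwsS3StorageInterface(Protocol):
    def upload(self, bucket: str, file: File) -> None: ...


# Ex. 2: the client has to know endpoints and parameter formats.
class ApiInterface(Protocol):
    def call(self, endpoint: str, params: dict[str, Any]) -> dict[str, Any]: ...


# Ex. 3: a neutral name, but one input only makes sense for PayPal.
class PaymentGatewayInterface(Protocol):
    def charge(self, amount_cents: int, credentials: PayPalCredentials) -> bool: ...


# Ex. 4: a neutral name, but the output is vendor-specific.
class RefundGatewayInterface(Protocol):
    def refund(self, payment_id: str) -> PayPalResponse: ...


# Ex. 5: far more functions than any single client needs.
class UserRepositoryInterface(Protocol):
    def find_by_id(self, user_id: int) -> Optional[User]: ...
    def find_by_email(self, email: str) -> Optional[User]: ...
    def find_all(self) -> list[User]: ...
    def save(self, user: User) -> None: ...
    def delete(self, user: User) -> None: ...
    def count_active(self) -> int: ...
```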

There are hundreds of thousands of examples like these, but what is wrong with these interfaces? What's the problem with them?

· Ex. 1: the interface name lets its client know details of the implementation. The class that uses the interface should never know that the storage is AWS S3. A second problem might be the second parameter, because the interface should not require a File object.

· Ex. 2: this is one of the most problematic interfaces. The first issue is the name. The client doesn't need to know it's an API. The second is the public function Call, which explicitly tells the client that it is dealing with an API and forces it to know the endpoints and how the parameters work. If the API (which is external) is updated, the test doubles that mock or stub this interface will break.

· Ex. 3: a very common mistake. Some developers create the interface after the implementation, which leads to leaky abstractions. In this example the interface name is good, but one of the parameters is specific to a single payment gateway, in this case PayPal. That makes it a leaky abstraction.

· Ex. 4: similar to the previous example, but instead of a leaky input it has a leaky output.

· Ex. 5: probably the most common one. What is the problem with UserRepositoryInterface? An abstraction must be defined by its client, and it is rare for a single client to use all of those repository functions. This is a classic violation of the Interface Segregation Principle and, in this case, a leaky abstraction.

Nothing in the world is perfect, so why would abstractions be? What's the problem with having leaky abstractions? Lots of developers have been ignoring these rules for years, and yet their software is still running.

Besides all the business problems leaky abstractions can cause, technically they make the code rigid and hard to change, test and evolve. Each change in infrastructure can cause a revolution at the company. Something as simple as changing the notification system from Slack to MS Teams can take months instead of hours.

How can you create good abstractions, and how can you know whether an existing abstraction is good or not?

Mastering the art of abstracting software

In his article “Towards better abstractions”, Mark Seemann states:

“Being able to implement a meaningful Composite is a good indication of a sound interface”

The Composite pattern is indeed an excellent indicator that you have a great abstraction. He also states that he doesn't use Decorator as an indicator, because you can decorate basically any interface. I partially agree with him, because the ability to decorate can still be used as an indicator of bad abstractions.

Personally, I believe that eliminating bad abstractions can be a good start, but Mark is right when he says that the ability to decorate can’t be used as a sign for good abstractions. Although creating decorators is easy for any interface, creating meaningful decorators is harder.

I would go beyond that, because structural design patterns can be used to help developers to create good abstractions, but before going deeper into design patterns, let’s start with the basics.

1. Good abstractions usually have few functions: the explanation for this is simple. Since it's the client's responsibility to define the abstraction, it tends to be small. If that's not the case, you probably have a code smell. TDD also helps to keep interfaces small. Let's contextualize:

Imagine a service (application layer) for registering a member, with the rule that there can't be two members with the same email. You'll probably have a repository interface, but look at it from the client's point of view:
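A minimal sketch of that idea in Python, assuming hypothetical names throughout (`Member`, `MemberRepository`, `RegisterMember`, `email_exists`, `add`). The point is only that the service asks for the two operations it needs, and the repository interface ends up with exactly those.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Member:
    name: str
    email: str


class MemberRepository(Protocol):
    """Only what the registration use case needs; nothing else."""

    def email_exists(self, email: str) -> bool: ...
    def add(self, member: Member) -> None: ...


class EmailAlreadyRegistered(Exception):
    pass


class RegisterMember:
    """Application-layer service; it owns the shape of MemberRepository."""

    def __init__(self, members: MemberRepository) -> None:
        self._members = members

    def register(self, name: str, email: str) -> Member:
        if self._members.email_exists(email):
            raise EmailAlreadyRegistered(email)
        member = Member(name=name, email=email)
        self._members.add(member)
        return member


class InMemoryMemberRepository:
    """One possible implementation, used here only to exercise the service."""

    def __init__(self) -> None:
        self._by_email: dict[str, Member] = {}

    def email_exists(self, email: str) -> bool:
        return email in self._by_email

    def add(self, member: Member) -> None:
        self._by_email[member.email] = member
```

Nothing in `MemberRepository` mentions an ORM, a table or a query language, so a database-backed implementation could replace the in-memory one without touching `RegisterMember`.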

2. The fewer inputs the better: interfaces with many inputs tend to break tests more easily and make your objects less composable (I'm not talking only about the Composite pattern here). Clients have good abstractions when they define simple and clear inputs and outputs, as in the following storage example:

Imagine that a process (service, application layer) needs to store an image:
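One way this could look, sketched in Python with hypothetical names (`ImageStorage`, `ProcessAvatar`, `InMemoryImageStorage`) and an assumed naming scheme for the stored file. The interface takes a name and raw bytes, and says nothing about buckets, vendors or upload objects.

```python
from typing import Protocol


class ImageStorage(Protocol):
    """Defined by the client: a name and raw bytes in, nothing out."""

    def store(self, name: str, content: bytes) -> None: ...


class ProcessAvatar:
    """Hypothetical application-layer process that needs to keep an image."""

    def __init__(self, storage: ImageStorage) -> None:
        self._storage = storage

    def run(self, member_id: int, image: bytes) -> str:
        # The path layout is an assumption for this sketch.
        name = f"avatars/{member_id}.png"
        self._storage.store(name, image)
        return name


class InMemoryImageStorage:
    """Trivial implementation; a cloud or disk one would have the same shape."""

    def __init__(self) -> None:
        self.saved: dict[str, bytes] = {}

    def store(self, name: str, content: bytes) -> None:
        self.saved[name] = content
```

With only two primitive inputs, a test double for `ImageStorage` is a few lines long and is unlikely to break when the real storage provider changes.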

3. Practicing TDD with stubs constantly makes abstractions easier to comprehend: testing your service (application layer) before writing its implementation can be a slow and meandering process, especially if you're just starting out as a developer. It's also not always possible, since the business needs to be aligned, the logical processes need to be written down, and you need to know what can go wrong.
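As a rough illustration, here is what a stub-driven test for the member registration case could look like in Python with the standard `unittest` module. All class and method names are hypothetical. In a TDD flow the two tests would be written first, and `RegisterMember` would be filled in afterwards to make them pass.

```python
import unittest
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Member:
    name: str
    email: str


class MemberRepository(Protocol):
    def email_exists(self, email: str) -> bool: ...
    def add(self, member: Member) -> None: ...


class EmailAlreadyRegistered(Exception):
    pass


class RegisterMember:
    def __init__(self, members: MemberRepository) -> None:
        self._members = members

    def register(self, name: str, email: str) -> None:
        if self._members.email_exists(email):
            raise EmailAlreadyRegistered(email)
        self._members.add(Member(name, email))


class StubMemberRepository:
    """Stub: returns a canned answer and records what was added."""

    def __init__(self, email_taken: bool) -> None:
        self._email_taken = email_taken
        self.added: list[Member] = []

    def email_exists(self, email: str) -> bool:
        return self._email_taken

    def add(self, member: Member) -> None:
        self.added.append(member)


class RegisterMemberTest(unittest.TestCase):
    def test_registers_when_email_is_free(self) -> None:
        stub = StubMemberRepository(email_taken=False)
        RegisterMember(stub).register("Ana", "ana@example.com")
        self.assertEqual(stub.added, [Member("Ana", "ana@example.com")])

    def test_rejects_duplicate_email(self) -> None:
        stub = StubMemberRepository(email_taken=True)
        with self.assertRaises(EmailAlreadyRegistered):
            RegisterMember(stub).register("Ana", "ana@example.com")
        self.assertEqual(stub.added, [])
```

Saved as a module, this can be run with `python -m unittest`. Writing the stub by hand is what exposes the abstraction: every method the stub needs is a method the client really depends on.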

4. Great abstractions make return types composable: I personally don't think this is an essential ability for good abstractions, but having that possibility up your sleeve can be mind-blowing. This basically means that your return object is also an abstraction, which allows you to change it, extend it or do anything else with your return type without touching the infrastructure or moving logic there.

Be careful, PHP developer! In old PHP versions (before 7.0) you couldn't declare a return type for your abstraction, which is dangerous. Now that PHP developers can declare return types, there are lots of abstractions whose return type is “array”. These are leaky abstractions, because an array allows the implementation to put in any keys it likes rather than the return type the client expects.
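To make the contrast concrete, here is a small sketch in Python, where a plain `dict` would play the role of the PHP array. Instead of returning a `dict`, the hypothetical `ProfileFinder` returns a `Profile`, which is itself an abstraction defined by the client. All names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class Profile(Protocol):
    """Return type defined by the client, itself an abstraction."""

    def display_name(self) -> str: ...


class ProfileFinder(Protocol):
    def find(self, member_id: int) -> Optional[Profile]: ...


@dataclass(frozen=True)
class StoredProfile:
    first_name: str
    last_name: str

    def display_name(self) -> str:
        return f"{self.first_name} {self.last_name}"


class AnonymizedProfile:
    """Wraps any Profile; added without touching the infrastructure."""

    def __init__(self, inner: Profile) -> None:
        self._inner = inner

    def display_name(self) -> str:
        name = self._inner.display_name()
        return name[0] + "***" if name else "***"


class InMemoryProfileFinder:
    def __init__(self, rows: dict[int, StoredProfile]) -> None:
        self._rows = rows

    def find(self, member_id: int) -> Optional[Profile]:
        return self._rows.get(member_id)
```

Because the client only sees `display_name()`, the implementation cannot smuggle in extra keys, and a wrapper such as `AnonymizedProfile` can be layered on top of whatever the finder returns.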

5. Good abstractions can be implemented by a Composite object: well, this one is tricky, because dynamic abstractions such as CommandServiceInterface are usually the ones where Composite is easiest to implement. Implementing Composite on a Repository interface might be technically easy (depending on the interface), but it might not make sense. For those cases (the first and second examples, repository and storage) I personally prefer to test whether my abstraction could be implemented on top of another library (it can be another ORM, database, storage service…)
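For a command-style abstraction, the Composite check can be as small as the sketch below. It is written in Python, and the `Notifier` interface and its implementations are hypothetical, loosely inspired by the notification example mentioned earlier. `CompositeNotifier` satisfies the same interface it wraps, so clients cannot tell one notifier from many.

```python
from typing import Iterable, Protocol


class Notifier(Protocol):
    def notify(self, message: str) -> None: ...


class CompositeNotifier:
    """Implements Notifier by forwarding to any number of Notifiers."""

    def __init__(self, notifiers: Iterable[Notifier]) -> None:
        self._notifiers = list(notifiers)

    def notify(self, message: str) -> None:
        for notifier in self._notifiers:
            notifier.notify(message)


class RecordingNotifier:
    """Stand-in for a real channel (chat, email, ...)."""

    def __init__(self, channel: str) -> None:
        self.channel = channel
        self.sent: list[str] = []

    def notify(self, message: str) -> None:
        self.sent.append(f"[{self.channel}] {message}")
```

The composite is trivial here because `notify` takes one simple input and returns nothing. If the interface returned a vendor-specific response or required vendor-specific parameters, it would be hard to say what the composite should return or pass along, which is the signal the Composite test is meant to give.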

Conclusion

Mastering the art of abstracting software requires training, competency, dedication and consistency. All non-trivial abstractions, to some degree, are leaky, as Joel put it so well eighteen years ago. Facing trade-offs is normal in every profession, and software development is no different. Creating good abstractions is the beginning, because they allow you, as a software developer, to make your decisions faster, more easily and more efficiently.