The Mutability Tax

David Morgan
Jul 15, 2019

Software engineering is hard, and I’m lazy. I strongly dislike unnecessary work. So, if code is going to be around for a while, I build it to be easy to maintain.

I work in object oriented languages, usually Dart, and a key consideration in such languages is whether to make data mutable or immutable. Many, myself included, recommend immutable data; see for example the Effective Java item “Minimize mutability” (item 15 in the second edition, item 17 in the third).

Here’s a snippet of Dart code that illustrates why mutability can be problematic:

// Biggest first.
var cities = ['Tokyo', 'Delhi', 'Shanghai', 'Mumbai', 'Beijing'];
print('Biggest cities in the world in alphabetical order:');
DisplayAlphabetically().display(cities);
print('And the biggest is ${cities.first}.');

And here’s what it prints:

Biggest cities in the world in alphabetical order:
Beijing, Delhi, Mumbai, Shanghai, Tokyo
And the biggest is Beijing.

— whoops! The biggest city is Tokyo, not Beijing. What went wrong? DisplayAlphabetically is at fault; it mutates the list it’s given, sorting it:

void display(List<String> strings) {
  // Sorts the caller's list in place, then prints it on one line.
  print((strings..sort()).join(', '));
}

But this isn’t where the real problem lies. It is a given in software engineering that wherever it’s possible to make mistakes, mistakes will be made. The problem is that by passing a mutable list to the display method (which might have been written by someone else, or by ourselves on a bad day, or copied and pasted from code with different constraints) we create an opportunity for bugs.

To fix the problem, we can use the immutable BuiltList from the built_collection package:

// Biggest first.
var cities = BuiltList.of(
    ['Tokyo', 'Delhi', 'Shanghai', 'Mumbai', 'Beijing']);
print('Biggest cities in the world in alphabetical order:');
DisplayAlphabetically().display(cities);
print('And the biggest is ${cities.first}.');

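Now we know DisplayAlphabetically can’t modify cities, ruling out this whole class of bugs. BuiltList has no mutating methods such as sort, and because it is not a List, display has to be changed to accept it (for example as an Iterable<String>) and sort its own copy.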
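
For readers who want to try this without adding a package, here is a rough sketch that uses only dart:core. It is not equivalent to BuiltList: List.unmodifiable still has the List type, so the mistake is only caught at runtime, not by the compiler. The function names displayInPlace and sortedCopy are made up for this illustration.

```dart
// Sketch only: displayInPlace and sortedCopy are illustrative names,
// not part of the article's code or of any library.

/// Mirrors the buggy method: sorts the caller's list in place.
void displayInPlace(List<String> strings) {
  print((strings..sort()).join(', '));
}

/// Safer variant: sorts a private copy and leaves the input alone.
List<String> sortedCopy(Iterable<String> strings) => strings.toList()..sort();

void main() {
  // Biggest first.
  final cities = List<String>.unmodifiable(
      ['Tokyo', 'Delhi', 'Shanghai', 'Mumbai', 'Beijing']);

  // The read-only list rejects the in-place sort at runtime.
  try {
    displayInPlace(cities);
  } on UnsupportedError {
    print('displayInPlace tried to mutate the list');
  }

  print(sortedCopy(cities).join(', '));
  print('And the biggest is ${cities.first}.'); // Still Tokyo.
}
```

The runtime error is better than silent corruption, but it only shows up when the offending code path actually runs, which is why a type with no mutating methods at all is the stronger guarantee.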

This is a toy example, and easy to debug and fix. Real world code is far more complex, and the problems far more serious. In particular, I see problems with:

  • UI code, which usually consists of loosely coupled modules that share data. Mutability creates opportunities for severe bugs that are hard to reason about, entailing significant maintenance work.
  • High performance server code, which uses loosely coupled modules that share data to allow parallelism. This is a very different environment, but the problem of mutability is the same: severe bugs that are hard to reason about, and significant maintenance work.

Mutable data in these languages is the default, and works well enough for simple cases; but it becomes more of a problem as the codebase grows. So it’s a very common trap to fall into. In summary:

Systems having loosely coupled modules that pass mutable data between them pay the mutability tax. They rely on convention and luck to avoid severe, hard to reason about bugs due to unwanted side effects of mutation. As these systems get larger and more complex, their luck runs out, and maintenance cost increases.

Obviously, if mutability is the problem then immutability is the answer! But it comes with its own problems.

The Immutability Tax

Most object oriented languages are, unfortunately, not designed with immutability in mind. The obvious route is to build your own conventions and libraries for immutable data, and there are surprisingly many pitfalls.

Let’s dig into those pitfalls. They’re similar across all object oriented languages, but for the sake of this article, we’ll consider Dart. The simplest way to have immutable data in Dart goes like this:

class Customer {
  final String name;
  final int age;
  Customer({this.name, this.age});
}

The fields name and age, everyone can agree on. But we’ve already had to choose between using positional parameters or named parameters. As the parameter list grows you’re likely to want named parameters for readability, so the convention we’ve picked for the example is to use them from the start.

The next problem is how to create new immutable instances based on existing ones — how to “update” them. The default, and so what we often see in practice, is to force people to use the constructor:

var customer = Customer(name: 'John Smith', age: 34);
var updatedCustomer = Customer(
    name: customer.name, age: customer.age + 1);

Unfortunately, code using the constructor this way breaks if we add a field:

var customer = Customer(
    name: 'John Smith', age: 34, visits: 12);
// ... later on ... whoops! This doesn't work, resets 'visits'.
var updatedCustomer = Customer(
    name: customer.name, age: customer.age + 1);

We could get around that by making all the parameters required, either as positional parameters or using the @required annotation, but this is hardly better: now every call to the constructor in the codebase needs updating to add a field.
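
The silent reset is easy to reproduce. The sketch below is written for current null-safe Dart, so it uses `required` and `int?` where the 2019 snippets above use bare types; the helper name birthday is hypothetical. Note that nothing fails to compile: the old call site simply leaves visits out.

```dart
// Null-safe rendition of the article's Customer, with 'visits' added later.
class Customer {
  final String name;
  final int age;
  final int? visits; // The newly added, optional field.

  Customer({required this.name, required this.age, this.visits});
}

/// Hypothetical helper written before 'visits' existed. It still compiles
/// after the field is added, which is exactly the problem.
Customer birthday(Customer c) => Customer(name: c.name, age: c.age + 1);

void main() {
  final customer = Customer(name: 'John Smith', age: 34, visits: 12);
  final updated = birthday(customer);

  print(updated.age); // 35
  print(updated.visits); // null: the visit count was silently dropped.
}
```

Marking visits as required would turn the silent reset into a compile error in birthday, and in every other constructor call in the codebase.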

What do we do? We can’t use the constructor, we need a method:

class Customer {
  final String name;
  final int age;
  Customer({this.name, this.age});
  // Any argument left out keeps the current value.
  Customer copyWith({String name, int age}) =>
      Customer(name: name ?? this.name, age: age ?? this.age);
}

So now we can write:

var updatedCustomer = customer.copyWith(age: customer.age + 1);

— and this code continues to do the right thing if a field is added.
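
As a quick check of that claim, here is a runnable sketch in current null-safe Dart with a visits field already added. Giving visits a default of 0 is an assumption made for this example so that every field is non-null; the call site from the article is unchanged.

```dart
class Customer {
  final String name;
  final int age;
  final int visits; // Added later; defaults to 0 in this sketch.

  Customer({required this.name, required this.age, this.visits = 0});

  /// Copies this customer, replacing only the fields that are passed.
  Customer copyWith({String? name, int? age, int? visits}) => Customer(
      name: name ?? this.name,
      age: age ?? this.age,
      visits: visits ?? this.visits);
}

void main() {
  final customer = Customer(name: 'John Smith', age: 34, visits: 12);

  // The same call site as before the field was added.
  final updated = customer.copyWith(age: customer.age + 1);

  print('${updated.name}, ${updated.age}, ${updated.visits}');
  // prints: John Smith, 35, 12
}
```

Only copyWith itself had to change when the field was added; callers that don't care about visits carry it along automatically.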

Unfortunately, as soon as any field is allowed to be null, this pattern fails in Dart. Optional named parameters without an explicit default just default to null, and there is no way to check if a null was explicitly passed or was received because nothing was specified:

// Doesn't work! The method can't tell that 'null' was explicitly
// passed for 'visits', so nothing is updated.
var customerWithoutVisits = customer.copyWith(visits: null);
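
Here is the same failure as a small runnable sketch, again in null-safe Dart and trimmed to two fields for brevity (an assumption of this example, not the article's full class):

```dart
class Customer {
  final String name;
  final int? visits; // null means "unknown" in this sketch.

  Customer({required this.name, this.visits});

  // 'visits ?? this.visits' cannot tell "not passed" from "passed null".
  Customer copyWith({String? name, int? visits}) =>
      Customer(name: name ?? this.name, visits: visits ?? this.visits);
}

void main() {
  final customer = Customer(name: 'John Smith', visits: 12);
  final cleared = customer.copyWith(visits: null);

  print(cleared.visits); // 12: the explicit null was ignored.
}
```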

Nullable fields can be fixed by using one “with” method per field:

class Customer {
  final String name;
  final int age;
  final int visits;
  Customer({this.name, this.age, this.visits});
  Customer withName(String name) =>
      Customer(name: name, age: age, visits: visits);
  Customer withAge(int age) =>
      Customer(name: name, age: age, visits: visits);
  Customer withVisits(int visits) =>
      Customer(name: name, age: age, visits: visits);
}

— which at least works, but is far from satisfactory. It’s a lot of boilerplate to write; the boilerplate is easy to get wrong; and it’s slow: updating multiple fields requires one full object copy per field.
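
To see the “one full object copy per field” cost, the following sketch adds a static counter to the constructor. The counter is purely an illustration device and the code is null-safe Dart, so visits is typed `int?`.

```dart
class Customer {
  static int constructed = 0; // Illustrative counter, not from the article.

  final String name;
  final int age;
  final int? visits;

  Customer({required this.name, required this.age, this.visits}) {
    constructed++;
  }

  Customer withName(String name) =>
      Customer(name: name, age: age, visits: visits);
  Customer withAge(int age) =>
      Customer(name: name, age: age, visits: visits);
  Customer withVisits(int? visits) =>
      Customer(name: name, age: age, visits: visits);
}

void main() {
  final customer = Customer(name: 'John Smith', age: 34, visits: 12);
  Customer.constructed = 0; // Only count the copies made below.

  final updated =
      customer.withName('Jane Smith').withAge(35).withVisits(null);

  print(updated.visits); // null: an explicit null now works.
  print(Customer.constructed); // 3: one new object per field changed.
}
```

Two of the three objects are thrown away immediately. For a three-field class that hardly matters, but the waste grows with the number of fields and the number of updates.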

Even with primitive types, manually maintaining immutable objects in Dart is no fun at all. This example has three fields; in any real system Customer likely has ten or more. That means at least ten “with” methods, each passing at least ten arguments back to the constructor. Ouch.

But the worst is still to come.

You’ll need to support collections and nested types:

class ShoppingBasket {
  final Customer customer;
  final List<Item> items;
  final List<Offer> offers;
  ShoppingBasket(
      this.customer,
      Iterable<Item> items,
      Iterable<Offer> offers)
      // Copy defensively to ensure immutability.
      : this.items = List.unmodifiable(items),
        this.offers = List.unmodifiable(offers);
  // TODO(davidmorgan): add "with" method per field.
}

In order to accept Dart’s mutable collections into our immutable world, we have to copy them defensively. One of the supposed advantages of immutable types is that they’re fast. We’ve managed to make them slow.
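
A cut-down sketch of what the defensive copy buys and what it costs. To keep it self-contained, the basket here holds plain strings, not Item and Offer objects, and has no customer; that simplification is an assumption of this example.

```dart
class ShoppingBasket {
  final List<String> items;

  // List.unmodifiable walks the input and copies every element, so each
  // construction costs time and memory proportional to the list length.
  ShoppingBasket(Iterable<String> items) : items = List.unmodifiable(items);
}

void main() {
  final source = ['apple', 'pear'];
  final basket = ShoppingBasket(source);

  // Later changes to the caller's list do not leak into the basket.
  source.add('plum');
  print(basket.items); // [apple, pear]

  // The basket's own list rejects mutation at runtime.
  try {
    basket.items.add('fig');
  } on UnsupportedError {
    print('basket.items is read-only');
  }
}
```

The copy is what makes the first guarantee possible, and it is repeated on every construction, including every “update” that builds a new basket from an existing one.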

And, if we go with the simple “with” methods we had before, updating a nested field becomes a chore:

var updatedBasket = basket
    .withCustomer(basket.customer.withName(updatedName))
    .withItems([...basket.items, newItem, newItem2]);

We could fix this with additional convenience methods:

var updatedBasket = basket
    .withCustomerName(updatedName)
    .addItem(newItem)
    .addItem(newItem2);

— but that would mean even more boilerplate. And, we’d again be making every single update do a full copy, making our immutable data slow.

The only way to address all these issues — to provide immutable data in Dart that’s easy and fast to “update”, that supports null fields, and that does not break existing code when fields are added — is the builder pattern. Each data type has an additional associated type, its builder, which has the same data but is mutable. You use a builder to “build” an immutable instance.

We also need a new library of builder-based collection classes. This allows immutable collections to be managed without unnecessary copying, so they’re fast.

By using nested builder classes and collection builders, our example code can be both convenient and efficient, and “updates” can look like this:

var updatedBasket = basket.rebuild((b) => b
  ..customer.name = updatedName
  ..items.addAll([newItem, newItem2]));
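
To give a feel for what sits behind a call like rebuild, here is a heavily simplified hand-written builder for Customer alone, in null-safe Dart. It is one possible shape among many, not how any particular library implements the pattern; the method names toBuilder and update are choices made for this sketch, and nested builders and collection builders are left out.

```dart
class Customer {
  final String name;
  final int age;
  final int? visits;

  Customer._(this.name, this.age, this.visits);

  /// Builds a new instance from scratch.
  factory Customer(void Function(CustomerBuilder) updates) =>
      (CustomerBuilder()..update(updates)).build();

  /// Copies the current values into a fresh, mutable builder.
  CustomerBuilder toBuilder() => CustomerBuilder()
    ..name = name
    ..age = age
    ..visits = visits;

  /// Applies any number of changes, then builds exactly one new instance.
  Customer rebuild(void Function(CustomerBuilder) updates) =>
      (toBuilder()..update(updates)).build();
}

class CustomerBuilder {
  String? name;
  int? age;
  int? visits;

  void update(void Function(CustomerBuilder) updates) => updates(this);

  Customer build() {
    final name = this.name, age = this.age;
    if (name == null || age == null) {
      throw StateError('name and age are required');
    }
    return Customer._(name, age, visits);
  }
}

void main() {
  final customer = Customer((b) => b
    ..name = 'John Smith'
    ..age = 34
    ..visits = 12);

  // Two fields change, including an explicit null, in a single rebuild.
  final updated = customer.rebuild((b) => b
    ..age = 35
    ..visits = null);

  print('${updated.name}, ${updated.age}, ${updated.visits}');
  // prints: John Smith, 35, null
}
```

Because the builder starts from the existing values, an assignment of null is an ordinary assignment, and adding a field only touches Customer and CustomerBuilder, not their callers. Even for three fields, though, that is already some forty lines of hand-written code.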

Unfortunately (again!), the builder pattern needs still more boilerplate than any example so far, and is extremely hard to get right. There are many versions, many possible conventions, and many possible mistakes to make. You might use the builder pattern but still not correctly support nullable fields; or still have poor performance; or write buggy boilerplate. Such bugs could be as severe as accidental mutability — which is of course worse than the known mutability you had before switching to immutable data!

This bears repeating: writing thousands upon thousands of lines of boilerplate by hand in order to avoid paying the mutability tax leaves you open to exactly the same type of bug you were trying to avoid, as soon as you make a mistake in the boilerplate.
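
One way such a mistake can look, as a deliberately buggy sketch: the builder hands its own mutable list to the value it builds. The class and method names are invented for this example, strings stand in for real item types, and the fix shown is a plain defensive copy, not the copy-avoiding collection builders discussed above.

```dart
class Basket {
  final List<String> items;
  Basket._(this.items);
}

class BasketBuilder {
  final List<String> items = [];

  // BUG (deliberate): hands the builder's own list to the "immutable" value.
  Basket buildShared() => Basket._(items);

  // Fix: give the value its own read-only copy.
  Basket build() => Basket._(List.unmodifiable(items));
}

void main() {
  final builder = BasketBuilder()..items.add('apple');

  final leaky = builder.buildShared();
  final safe = builder.build();

  builder.items.add('pear'); // Reusing the builder after building.

  print(leaky.items); // [apple, pear]: the "immutable" basket changed.
  print(safe.items); // [apple]
}
```

Nothing in the type of Basket reveals the difference between the two, which is what makes this kind of bug worse than ordinary, known mutability.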

The right way to do immutability is builders, but writing builders by hand is just not worth it.

Immutability Via Codegen

And so where we end up is that in object oriented languages that are not designed for immutability, the correct way to achieve immutability and avoid paying the mutability tax is to let someone else do the work for you, via a library that generates code. Only codegen can reduce the otherwise overwhelming overhead of immutable data in these languages.

Unfortunately (one last time!), codegen does come with its own downsides: it still needs some boilerplate, to get the codegen working; it might take additional work to get your project to build; and your IDE might be less helpful when it hits generated code. But, it’s the best we can do today. In Dart, there is my own library, built_value, which goes further and can also serialize your data; I have written about it in a separate article. It looks like this; the boilerplate is the implements clause and the two constructor declarations:

abstract class Customer implements Built<Customer, CustomerBuilder> {
  String get name;
  int get age;
  @nullable
  int get visits;
  factory Customer(void Function(CustomerBuilder) updates) =
      _$Customer;
  Customer._();
}

The boilerplate is unpleasant and scary, but it’s easy to maintain: you don’t need to add any boilerplate when you add a new field, and it’s impossible to have bugs in the boilerplate because the codegen also checks the boilerplate for you. We have achieved maintainable immutable data. What we’re now paying is less an ongoing tax and more of an entry fee.

In Java there is AutoValue.Builder, which takes a very similar approach and comes with similar boilerplate. Incidentally, that team wrote a nice slide deck about why this kind of codegen is necessary.

In summary:

In languages not designed for immutability, systems that avoid the mutability tax by implementing immutable data structures by hand, instead pay the immutability tax. Immutable data structures in these languages are hard to get right, and by default inconvenient and slow; this leads to increased maintenance cost, bugs, and performance problems. The correct way to avoid the mutability tax in these languages is to use a library that generates and hides the boilerplate needed for fast, convenient immutable data.

Final Thoughts

You might reasonably ask whether I practice what I preach; whether I actually go to the additional trouble of using codegen for immutable data in my everyday work.

The answer is: if I’m going to have to maintain the code, yes, I use immutable data. If it’s small and “write once, run once, delete”, I probably won’t bother.

There was a recent exception to this rule. I was working on code deep in the Dart build; and you can’t easily use codegen in code that your codegen depends on. So, I fell back on hand maintained immutable data.

I was too lazy to write “with” methods; I just used the constructor. Then I added a field. Then, I spent considerable time debugging because existing code was dropping the value of the new field, exactly as described in this article.

This reminded me of just how much I dislike both the mutability tax and the immutability tax, and the idea for this article was born.

Let’s finish by pulling out the two summaries so they can live side by side.

The Mutability Tax

Systems having loosely coupled modules that pass mutable data between them pay the mutability tax. They rely on convention and luck to avoid severe, hard to reason about bugs due to unwanted side effects of mutation. As these systems get larger and more complex, their luck runs out, and maintenance cost increases.

The Immutability Tax

In languages not designed for immutability, systems that avoid the mutability tax by implementing immutable data structures by hand, instead pay the immutability tax. Immutable data structures in these languages are hard to get right, and by default inconvenient and slow; this leads to increased maintenance cost, bugs, and performance problems. The correct way to avoid the mutability tax in these languages is to use a library that generates and hides the boilerplate needed for fast, convenient immutable data.

Mutable data leads to maintenance problems. In languages not designed for immutability, such as Java and Dart, hand-maintained immutable data brings yet more problems. Codegen is the best answer we have today. It provides the boilerplate needed for bug-free, efficient, convenient immutable data.
