
Option — the null of our times

Adam Gajek
Sep 30 · 8 min read

Things I would have told my younger-self about using Option in Scala

When I coded in Java many years ago, there was a rule to check every method input for null. We knew a null was lurking somewhere, but not where, so the only way was to check everything.

In Scala, there is Option: a data structure that lets us model the absence of data more explicitly and with less boilerplate.

But as Uncle Ben (Watch out! Not Bob!) once said: with great power comes great responsibility.

In this article, I want to share the common issues with Option I’ve run into, mostly over the last two years of working with a roughly 10-year-old system written in Scala. Along the way, I’ll describe how I think we can protect ourselves from those problems in the future and be more responsible developers.

To many of you these problems may seem unrealistic and the examples trivial, but I think such situations are present to some degree, and common, even in younger projects.

Attaching implicit domain meaning to None

One of the benefits of decomposing code into a set of functions is the ability to hide their implementation details. We should be able to know what behaviour is implemented inside a function without taking a look at its body every time we spot its invocation.

When we create a function that accepts an Option as a parameter, even if it handles None properly, we risk ending up in the situation below.

val personWithAccount = person.map(generateEmailAccount)
persist(personWithAccount, id)

def persist(person: Option[Person], id: Id): Option[Person] = {
  val p = person.getOrElse(createPerson(id))
  validate(p).map(insertIntoDb)
}

We have an optional Person and pass it to persist. There, in case of None, a new Person is created; then validation takes place and the insert is executed.

Then, a few months later, someone notices that we have to add some logic. Let’s say we want to create an e-mail account only for adults, so the code now turns a person who is not an adult into None before calling persist.
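A minimal sketch of such a change, assuming a hypothetical age field and an adulthood threshold of 18 (neither is given above); the helper functions are stubs that only make the example runnable:

```scala
// Hypothetical domain types. Only the names Person, Id, persist, validate,
// createPerson, insertIntoDb and generateEmailAccount come from the article.
final case class Id(value: Long)
final case class Person(id: Id, age: Int, email: Option[String] = None)

def generateEmailAccount(p: Person): Person =
  p.copy(email = Some(s"user${p.id.value}@example.com"))

def createPerson(id: Id): Person = Person(id, age = 0)
def validate(p: Person): Option[Person] = Some(p) // stub: accepts everything
def insertIntoDb(p: Person): Person = p           // stub: pretends to insert

def persist(person: Option[Person], id: Id): Option[Person] = {
  val p = person.getOrElse(createPerson(id))
  validate(p).map(insertIntoDb)
}

val id = Id(1)
val person: Option[Person] = Some(Person(id, age = 15))

// The new rule (assumed form): only adults get an e-mail account.
val personWithAccount = person.filter(_.age >= 18).map(generateEmailAccount)

// The 15-year-old was filtered out, so persist sees None and creates
// a second, fresh Person for an id that already has one.
val persisted = persist(personWithAccount, id)
```

Nothing in the signature of persist reveals that None can now also mean "exists, but is not an adult".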

That change opened the door to a bug: persist tries to create a new object representing that particular person, while in fact the object already exists, and we end up with duplicated data, because validate was not prepared for the brand-new kind of None that has just been introduced into our domain. Hopefully, our DB is able to detect and reject that violation.

This example may be dumb or unrealistic, but its only purpose is to show a mechanism that occurs in our code again and again, and that is much more complicated in real-life situations.

The problem here is that we allowed Option to propagate through different abstraction layers, yet we cannot control how None is interpreted in each place.

Initially, None was meant to indicate that some data is missing from our system, but then the same None, in that special context, came to indicate that a person is not an adult yet. So within the whole scope, one object, None, has two meanings. The first says that the Person does not exist; the second says that the object was found but its data does not match some criteria. How can we tell, at a given point, which interpretation applies? To be honest, I don’t know.

The way to overcome that issue is either to transform the Option as soon as possible into some domain-related data structure that explicitly encodes what getting None meant, or to use Either, which carries information about the nature of the problem (usually provided as Left). That way, we avoid situations where it’s not clear what a given None means.
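As a rough illustration of the Either variant, here is a sketch in which each reason gets its own name. The error types, the age field and the threshold of 18 are assumptions made for the example, not part of the original code:

```scala
final case class Id(value: Long)
final case class Person(id: Id, age: Int)

// Hypothetical error type: every reason for "no person" is spelled out.
sealed trait PersonError
final case class PersonNotFound(id: Id) extends PersonError
final case class NotAnAdult(id: Id, age: Int) extends PersonError

// Convert the Option once, at the boundary where its meaning is still known.
def requirePerson(found: Option[Person], id: Id): Either[PersonError, Person] =
  found.toRight(PersonNotFound(id))

def requireAdult(p: Person): Either[PersonError, Person] =
  if (p.age >= 18) Right(p) else Left(NotAnAdult(p.id, p.age))

// Downstream code receives either a Person or a named reason, never a bare None.
def adultPerson(found: Option[Person], id: Id): Either[PersonError, Person] =
  requirePerson(found, id).flatMap(requireAdult)
```

A caller can now pattern match on Left(PersonNotFound(_)) and Left(NotAnAdult(_, _)) separately, so the two situations can no longer be confused.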

If you don’t want to introduce new classes yet, the alternative is to limit that propagation by accepting only concrete objects in our functions.

def persist(person: Person, id: Id): Option[Person] = {
  validate(person)
    .map(insertIntoDb)
}

Option in classes

Initially, the source of Option was the get method of Map. Then it spread to Repository methods. Those two use cases are natural and fine: something is missing, we get feedback and decide how to deal with it. But the real outbreak of Option problems came with the libraries used for processing JSON documents.

All of those libraries in Scala can automatically derive serialization and deserialization when we use a case class as the blueprint for deserialization or as the source of data passed to a serializer. That’s very handy but, unfortunately, comes at a cost.

Option and separation of concerns

Recently at work, I worked on a piece of code responsible for holding the delivery of an order and then releasing it when it is ready to go. This feature was added years after the system went live, so the initial domain model did not contain any information about it. It was added, as you may suspect, as an Option, so that nobody had to bother with migrations or anything else. Easy win.

In the business world, an order can be either on hold or approved, go or no go; even in the UI it was implemented as a toggle. The problem was that in our code we had three states: None, Some(hold), Some(approved).

And again your suspicions are correct. None did not always mean: “lack of value, because someone didn’t bother to separate the technical concern of schema evolution and just put it directly into the domain object”. In one place it meant: “if you have approved what was None, then do not send a notification about that”; in another, it was even forbidden to approve an order that had None as its current approval value.

As you see, you cannot control how None will be interpreted. It will cross your boundaries, just like null or an exception, and condemn you to live in fear.

What would I advise my younger self in that situation? As mentioned before: take care of good old separation of concerns. If you don’t want to write custom deserializers, create a separate class that acts as the blueprint of the data. Then transform every unnecessary Option into a value with domain meaning. In this case, it would really have been enough to use the approved state as the initial value.
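A sketch of that separation, under the assumption that the stored value is a plain string; all class and field names here are hypothetical, since the real model is not shown:

```scala
// Domain model: exactly the two states the business knows about.
sealed trait Approval
case object OnHold extends Approval
case object Approved extends Approval

final case class Order(id: Long, approval: Approval)

// Blueprint for (de)serialization: mirrors the stored JSON, so the field
// that was added later is optional here and only here.
final case class OrderRecord(id: Long, approval: Option[String])

// The Option is resolved once, at the edge of the system.
def toDomain(record: OrderRecord): Either[String, Order] =
  record.approval match {
    case None             => Right(Order(record.id, Approved)) // order predates the feature
    case Some("approved") => Right(Order(record.id, Approved))
    case Some("hold")     => Right(Order(record.id, OnHold))
    case Some(other)      => Left(s"Unknown approval value: $other")
  }
```

The rest of the code only ever sees Order, so there is no third state left to interpret.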

Option and anaemia

Another example of the great responsibility lying on our shoulders when dealing with Option is to never let in classes that model user input as a bag of optional fields, one Option for every piece of data the user may or may not send.
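A hypothetical example of the kind of class meant here, together with the handler that typically accompanies it; every name and field is assumed for illustration:

```scala
final case class Person(name: String, email: String, phone: String)

// Hypothetical update payload: every field is optional because the client
// may send any subset of them.
final case class PersonUpdate(
  name: Option[String],
  email: Option[String],
  phone: Option[String]
)

// The usual handler: copy over whatever arrived, keep the rest untouched.
def applyUpdate(person: Person, update: PersonUpdate): Person =
  person.copy(
    name = update.name.getOrElse(person.name),
    email = update.email.getOrElse(person.email),
    phone = update.phone.getOrElse(person.phone)
  )
```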

What’s wrong with that?

The good part is that users really do not always want to provide all their data, so we have to work with what they send, and turning absent data into None seems to be a good way of capturing input from the clients of our code.

But the problem I see here is similar to the one created by a well-known DDD antipattern, the anaemic domain model. We end up as mere data pipes that transfer user input to the database.

The practical consequence is that you are often not sure what action to take when you receive such an object. The only thing most people do is update the Person object with the fields they received and leave the rest untouched.

The funny thing is that I’ve also seen an approach where None in such a case meant: wipe the data for the fields you received as empty. Uncontrolled, implicit meaning strikes again.

In reality, we have a very useful code smell here, which says: you divided the responsibilities between client and server poorly, go and refine that!

Let’s say our client is a frontend app. The frontend’s responsibility is to capture user actions, along with the data they provide, and then communicate them to the backend. In that scenario, the frontend can work out what was actually updated, and instead of pushing one big (very convenient, though) object in which anything may or may not be present, the update can be broken down into messages such as

case class UserEmailUpdated(id: ID, email: String)
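A minimal sketch of how such messages might be grouped and handled on the backend. Everything except UserEmailUpdated is a hypothetical name, and ID is assumed to be a simple wrapper:

```scala
final case class ID(value: Long)

// Here Option only says "this user has no e-mail / phone on record".
final case class User(id: ID, email: Option[String], phone: Option[String])

// One message per user intention; nothing is left for the backend to guess.
sealed trait UserEvent { def id: ID }
final case class UserEmailUpdated(id: ID, email: String) extends UserEvent
final case class UserEmailDeleted(id: ID) extends UserEvent
final case class UserPhoneUpdated(id: ID, phone: String) extends UserEvent

def handle(user: User, event: UserEvent): User = event match {
  case UserEmailUpdated(_, email) => user.copy(email = Some(email))
  case UserEmailDeleted(_)        => user.copy(email = None)
  case UserPhoneUpdated(_, phone) => user.copy(phone = Some(phone))
}
```

Deleting an e-mail and not touching it are now different messages rather than two readings of the same None.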

Of course, we could distil that information on the backend as well, by working out from the optional fields what has changed.

But that way we are just working around a real problem, one that will stay and rot your system. What’s more, we’re adding another implicit layer of assumptions about what the author had in mind (maybe, in fact, this email was deleted).

Imperative programming

One of the strengths of FP is that an object is correct, initialized and complete at every stage of its lifetime; this is one of the ingredients that add up to what we call local reasoning. What we have is finished, and we don’t need to worry about any additional work that still has to be done.

When we use mutable data structures, we allow ourselves and our colleagues to create partial objects that are initialized in multiple steps and filled with data during code execution.

Fortunately, people in Scala try to do FP, so by default we assume we are safe. Unfortunately, as you might expect, the truth is not always aligned with our beliefs, and again it is because of improper use of Option.

I’ve seen a lot of case classes where some values were Options only because, at initialization time, we didn’t have all the data required to fill them in, so some fields were left as None and initialized with proper data at some later stage (or not). In that scenario, I’d tell my younger self: when you cannot initialize your object in one go, consider splitting it into two separate entities.

Additionally, do you feel the temptation to exploit that fact and attach additional meaning to that place? A good candidate would be, for example, to check deliveryTime and, when it is None, assume in our code that the Order has not been delivered yet.
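Rather than inferring the delivery status from an empty deliveryTime, the split into separate entities suggested above might look like the sketch below. The type names and the placedAt field are hypothetical:

```scala
import java.time.Instant

// Instead of one Order with deliveryTime: Option[Instant], each stage of the
// lifecycle gets its own type, and each instance is complete when created.
sealed trait Order { def id: Long }
final case class PendingOrder(id: Long, placedAt: Instant) extends Order
final case class DeliveredOrder(id: Long, placedAt: Instant, deliveryTime: Instant) extends Order

// The transition is explicit: a delivered order can only be built from a pending one.
def deliver(order: PendingOrder, at: Instant): DeliveredOrder =
  DeliveredOrder(order.id, order.placedAt, at)

def describe(order: Order): String = order match {
  case PendingOrder(id, _)       => s"Order $id is awaiting delivery"
  case DeliveredOrder(id, _, at) => s"Order $id was delivered at $at"
}
```

The delivery status is carried by the type itself, so there is no None to read meaning into.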

Conclusions

All of the above shows that, in general, the problems with Option have one root cause: attaching implicit meaning to whether data is present in it. Making assumptions based on that is the same as assigning meaning to the size of a collection we received from somewhere.

These observations lead to several actions to take when dealing with options, and the most important of them is to always transform an Option you receive from somewhere, as soon as possible, into a data structure that is meaningful in the domain. We should treat Option as just another primitive in our programming language toolkit: use it at some level of abstraction, but never let it creep up to the domain level, where everything can matter.

Sooner or later someone from your team, your successors, or even an older you will start adding domain meaning to the lack of a value, and this meaning will be implicit and forgotten within a few months.

VirtusLab

Virtus Lab company blog
