The dangers of returning void — A look at information loss
If you’ve programmed in virtually any mainstream language, you might never have considered that there was anything wrong with returning void. I know I didn’t see the issue for a long time. In this post I want to show why returning void is almost always the wrong choice, and how we can structure our programs to take advantage of the information we keep when we return something else.
A tip came across my Twitter feed suggesting that if you don’t need to return anything while iterating over an array, you could use
forEach instead of
My first impression was that I loved that the forEach prototype method was used rather than the for looping structure (also, I’m one of the poster’s top fans). But then I started thinking about what it meant to have a function that “didn’t need to return anything”…
What’s in a function that returns void?
Think of every function/method/procedure you’ve written that returns void. What do they all have in common? They could have mutated state, inserted a row into a database, written a file to disk, or a thousand other things, but all of them performed some sort of side effect. How do I know this about code of yours that I’ve never seen? Because there’s no purpose in calling a function that returns nothing unless it performs some side effect.
If a “function” that returns void must perform a side effect, then it isn’t really a function in the mathematical sense to begin with, because a function only maps its inputs to an output, deterministically and without side effects.
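To make that concrete, here is a small sketch. The names log, recordVisit, and visitMessage are made up for illustration. The void version can only matter through the mutation it hides, while the value-returning version hands its result back to the caller:

```typescript
// Hypothetical shared state that the void function mutates.
const log: string[] = [];

// Returns void: the only reason to call it is the hidden mutation of `log`.
function recordVisit(user: string): void {
  log.push(`visited: ${user}`);
}

// Returns a value: the result is visible to the caller and can be passed on.
function visitMessage(user: string): string {
  return `visited: ${user}`;
}

recordVisit("ada");
const messages = ["ada", "grace"].map(visitMessage);
```

Nothing in the signature of recordVisit tells us that log changed, whereas messages is an ordinary value we can keep working with.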
Void doesn’t compose
In the above image from my LambdaConf 2018 presentation on Abstract Algebra, we can use puzzle pieces as a metaphor and say that A, W, X, Y, and Z all return void. Once the A is placed, we can’t add anything else to the puzzle on that branch. None of the Lego bricks on the right suffer from the void issue, and so the ways in which they can be combined are limitless.
Let’s look at our motivating example again and really look at what we’re doing.
What happens to all of that data from fs.writeFile? What if one of the writes fails? What if we wanted to wait for all of those files to be written before we do something else? Now how about if we wanted to write all of those files in parallel, or in sequence? None of those options exist when we drop the information returned from fs.writeFile; we’re at the mercy of the interpreter to decide what happens.
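As a rough sketch of that pattern, assume a hypothetical writeFileStub standing in for fs.writeFile. It calls back synchronously and writes to an in-memory Map, purely so the snippet runs anywhere:

```typescript
type Callback = (err: Error | null) => void;

// Hypothetical stand-in for fs.writeFile: fails on an empty path,
// otherwise stores the data in an in-memory "disk".
const disk = new Map<string, string>();
function writeFileStub(path: string, data: string, cb: Callback): void {
  if (path === "") {
    cb(new Error("empty path"));
  } else {
    disk.set(path, data);
    cb(null);
  }
}

const files = [
  { path: "a.txt", data: "A" },
  { path: "", data: "lost" }, // this write fails
  { path: "c.txt", data: "C" },
];

// forEach returns undefined, and the empty callback discards every outcome.
const result = files.forEach((f) => writeFileStub(f.path, f.data, () => {}));
```

The second write fails, yet nothing in result or in the surrounding code can tell us so.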
Typing that data
The first step to keeping the information is to create a type that will contain all of it. With fs.writeFile we have a couple of different effects happening.
- We have asynchronous processing because the last argument is a callback. We can easily model that with a Promise.
- We have the concept of success and failure because the callback can report either outcome. Either<L, R> is a common type to handle potential failure cases. In this definition, L is the type that we want to use to signify an error and R is the type we’d like to use to signify success.
By combining these two types into Promise<Either<L, R>> we’re able to keep all of the information from fs.writeFile in a single, generic type. In fact, almost all of the Node callback-style functions can be modeled with this type, which we’ll see in just a bit.
Also, while it might be tempting to use Promise<A> alone to handle the failure, we wouldn’t have type safety because Promise<A> doesn’t allow us to define a type for the error; it only allows us to type the “success” case. Nesting an Either<L, R> allows us to provide 2 types to maintain type safety. In this way Promise is similar to a functor (it can generate 1 distinct type) while Either is both a functor and a bifunctor (it can generate 2 distinct types).
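A minimal sketch of these two types, assuming a hand-rolled Either rather than the one a library would give us. The helpers left, right, and bimap and the parsePort example are hypothetical and only here to illustrate the shape:

```typescript
// A hand-rolled Either as a tagged union: Left carries the failure, Right the success.
type Either<L, R> =
  | { readonly _tag: "Left"; readonly left: L }
  | { readonly _tag: "Right"; readonly right: R };

const left = <L, R = never>(l: L): Either<L, R> => ({ _tag: "Left", left: l });
const right = <R, L = never>(r: R): Either<L, R> => ({ _tag: "Right", right: r });

// Bifunctor behaviour: both type parameters can be mapped independently.
const bimap = <L, R, M, S>(
  e: Either<L, R>,
  f: (l: L) => M,
  g: (r: R) => S
): Either<M, S> => (e._tag === "Left" ? left(f(e.left)) : right(g(e.right)));

// Promise<A> only types the success case; a rejection reason is untyped.
// Promise<Either<L, R>> always resolves and gives both cases a type.
const parsePort = (raw: string): Promise<Either<Error, number>> => {
  const n = Number(raw);
  return Promise.resolve(
    Number.isInteger(n) && n > 0 ? right(n) : left(new Error(`bad port: ${raw}`))
  );
};
```

Because the error lives in the resolved value instead of a rejection, the compiler knows it is an Error wherever we inspect the Left.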
Pinpointing the void
So when we look at writeFile and think it returns void, let’s pinpoint where in our new type that void exists. If we think about it, it’s really the type that is returned after the asynchronous effect if the effect was successful, which makes it the R in Promise<Either<L, R>>. Filling in our generics with our actual types (doesn’t this look just like passing values to a function?) we get our concrete type of Promise<Either<NodeJS.ErrnoException, void>>.
It is entirely possible to write our own implementation of Either, but that’s for another post. Also, the amazing fp-ts library already has this exact type created for us as a single type rather than a nested one! Let’s see how we can use it to begin preserving information.
Thanks to the taskify function from fp-ts we can convert a void returning function with a void returning callback into a function that returns a TaskEither<NodeJS.ErrnoException, void>. No data loss there!
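To show the idea behind that conversion, here is a hand-rolled sketch. It is not the fp-ts implementation: taskifySketch only handles one-argument functions, and writeStub, FileSpec, and the in-memory disk are hypothetical stand-ins so the snippet runs without touching the file system.

```typescript
type Either<L, R> =
  | { readonly _tag: "Left"; readonly left: L }
  | { readonly _tag: "Right"; readonly right: R };
const left = <L, R = never>(l: L): Either<L, R> => ({ _tag: "Left", left: l });
const right = <R, L = never>(r: R): Either<L, R> => ({ _tag: "Right", right: r });

// A lazy asynchronous computation that never rejects; failure travels in the Left.
type TaskEither<L, R> = () => Promise<Either<L, R>>;

// Turns a Node-style callback function into a function that returns a TaskEither.
const taskifySketch =
  <A, L, R>(f: (a: A, cb: (err: L | null, value?: R) => void) => void) =>
  (a: A): TaskEither<L, R> =>
  () =>
    new Promise<Either<L, R>>((resolve) =>
      f(a, (err, value) =>
        err != null ? resolve(left(err)) : resolve(right(value as R))
      )
    );

// Hypothetical void-returning write with a void-returning callback.
interface FileSpec {
  path: string;
  data: string;
}
const disk = new Map<string, string>();
const writeStub = (file: FileSpec, cb: (err: Error | null) => void): void => {
  if (file.path === "") {
    cb(new Error("empty path"));
  } else {
    disk.set(file.path, file.data);
    cb(null);
  }
};

const writeTask = taskifySketch<FileSpec, Error, void>(writeStub);
```

Calling writeTask(file) does not write anything yet. It only describes the write, and the effect runs when the returned task is invoked, at which point the outcome comes back as a Left or a Right.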
A journey through type simplification
Now let’s see how we can preserve our information by using some well-known functions to adjust our types to solve the initial need of writing multiple files to disk without losing information. This will take several refactoring steps, but hang in there because the result is far less code than it may seem as you read through.
First we define an interface and a list of sample files, and then we map over the files, turning each into a TaskEither. That leaves us with an array of asynchronous processes, though, and we’d probably prefer to have an asynchronous process of an array (TaskEither<E, Array<A>> rather than Array<TaskEither<E, A>>). Once again, the fp-ts library has us covered.
Changing the sequence of our types
The sequence function swaps the order of our types for us, which is exactly what we wanted to do. Now we have a computation that will run all of the tasks in parallel and wrap them all inside a single TaskEither. If you’ve used Promise.all then you’ve already seen a very specific version of sequence that only works for Promise values in an array.
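To see what that swap amounts to, here is a hand-rolled sketch specialised to arrays of TaskEither. It is an assumption-laden illustration rather than the fp-ts version: sequenceSketch, ok, and fail are hypothetical names, and the choice to report the first Left in array order is mine.

```typescript
type Either<L, R> =
  | { readonly _tag: "Left"; readonly left: L }
  | { readonly _tag: "Right"; readonly right: R };
const left = <L, R = never>(l: L): Either<L, R> => ({ _tag: "Left", left: l });
const right = <R, L = never>(r: R): Either<L, R> => ({ _tag: "Right", right: r });
type TaskEither<L, R> = () => Promise<Either<L, R>>;

// Array<TaskEither<L, R>> -> TaskEither<L, Array<R>>
const sequenceSketch =
  <L, R>(tasks: Array<TaskEither<L, R>>): TaskEither<L, R[]> =>
  () =>
    // Start every task at once, then combine the results.
    Promise.all(tasks.map((run) => run())).then((results): Either<L, R[]> => {
      const values: R[] = [];
      for (const r of results) {
        if (r._tag === "Left") return r; // the first failure in array order wins
        values.push(r.right);
      }
      return right(values);
    });

// Hypothetical tasks to feed it.
const ok = <R>(value: R): TaskEither<string, R> => async () => right(value);
const fail = <R>(reason: string): TaskEither<string, R> => async () => left(reason);

const allGood = sequenceSketch([ok(1), ok(2)]);
const oneBad = sequenceSketch([ok(1), fail<number>("boom"), ok(3)]);
```

Running allGood() resolves to a Right holding both values, while oneBad() resolves to a Left carrying the failure instead of rejecting.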
Map + Sequence = Traverse
It turns out that the combination of calling map and then sequence is very common and is called traverse. Switching to traverse will allow us to simplify our 2 function calls into 1 to produce the same output.
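A small sketch of that equivalence, again with hand-rolled helpers rather than the library's own. The names sequenceSketch, traverseSketch, and nameLength are hypothetical:

```typescript
type Either<L, R> =
  | { readonly _tag: "Left"; readonly left: L }
  | { readonly _tag: "Right"; readonly right: R };
const left = <L, R = never>(l: L): Either<L, R> => ({ _tag: "Left", left: l });
const right = <R, L = never>(r: R): Either<L, R> => ({ _tag: "Right", right: r });
type TaskEither<L, R> = () => Promise<Either<L, R>>;

const sequenceSketch =
  <L, R>(tasks: Array<TaskEither<L, R>>): TaskEither<L, R[]> =>
  () =>
    Promise.all(tasks.map((run) => run())).then((results): Either<L, R[]> => {
      const values: R[] = [];
      for (const r of results) {
        if (r._tag === "Left") return r;
        values.push(r.right);
      }
      return right(values);
    });

// traverse is map followed by sequence, packaged as one call.
const traverseSketch = <A, L, R>(
  items: A[],
  f: (a: A) => TaskEither<L, R>
): TaskEither<L, R[]> => sequenceSketch(items.map(f));

// Hypothetical effectful step: fails on an empty name, otherwise yields its length.
const nameLength =
  (name: string): TaskEither<string, number> =>
  async () =>
    name === "" ? left("empty name") : right(name.length);

const names = ["ada", "grace"];
const viaMapSequence = sequenceSketch(names.map(nameLength));
const viaTraverse = traverseSketch(names, nameLength);
```

Both viaMapSequence and viaTraverse describe the same computation; traverse just saves us from building the intermediate array of tasks by hand.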
And there you have it. We have now maintained all of our information until the very last bit of our processing (lines 9 and 10). Also, look at how we are now forced to deal with the exception case. There’s no way to get the resulting value in line 10 without first providing the failure function in line 9.
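The "forced to deal with the exception case" part can be sketched with a hand-rolled fold over Either. This is an illustration under my own assumptions: foldSketch and describeOutcome are hypothetical names, not the functions used in the original code.

```typescript
type Either<L, R> =
  | { readonly _tag: "Left"; readonly left: L }
  | { readonly _tag: "Right"; readonly right: R };

// The only way out of an Either: the caller must supply a handler for both cases.
const foldSketch = <L, R, B>(
  e: Either<L, R>,
  onLeft: (l: L) => B,
  onRight: (r: R) => B
): B => (e._tag === "Left" ? onLeft(e.left) : onRight(e.right));

// Hypothetical final step after writing a batch of files.
const describeOutcome = (outcome: Either<Error, string[]>): string =>
  foldSketch(
    outcome,
    (err) => `failed: ${err.message}`,
    (paths) => `wrote ${paths.length} files`
  );
```

Leaving out the failure handler is a compile error, so the error path cannot be forgotten the way an ignored callback argument can.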
Because we went through several refactoring steps to get to this point, you may think that this approach has created tons more code. Here’s the code in its entirety to prove that we stayed very lean, even though it includes plenty that wasn’t in the initial screenshot.
Our basic steps were to define a type that retained all of the information and then reorganize our nested types until they were in the order we wanted to process them.
And now for our next trick, we'll look at traversals. We'll watch types soar over one another as if they were trapeze… (mostly-adequate.gitbooks.io)
It's Traversable Monday™, everyone! Granted, tomorrow would have made for a catchier opening, but I wasn't thinking… (www.tomharding.me)