The dangers of returning void — A look at information loss
If you’ve programmed in virtually any mainstream language, you might not have considered that there was anything wrong with returning void. I know I didn’t see what the issue was for a long time. In this post I want to show why returning void is almost always the wrong choice and how we can structure our programs to take advantage of the additional information that returning a value provides.
Some motivation
A tip came across my Twitter feed suggesting that if you don’t need to return anything while iterating over an array, you could use forEach instead of map.
My first impression was that I loved that the forEach prototype method was used rather than the for looping structure (also I’m one of the poster’s top fans). But then I started thinking about what it meant to have a function that “didn’t need to return anything”…
What’s in a function that returns void?
Think of every function/method/procedure you’ve written that returns void. What do they all have in common? They could have mutated state, inserted a row into a database, written a file to disk, or a thousand other things, but all of them performed some sort of side effect. How do I know this about your code that I’ve never seen? Because there’s no purpose in calling a function that returns nothing unless it performs some side effect.
If a “function” that returns void must perform a side effect, then it isn’t really a function in the mathematical sense to begin with, because such a function must be deterministic and do nothing except map its input to an output.
Void doesn’t compose
In the above image from my LambdaConf 2018 presentation on Abstract Algebra, we can use puzzle pieces as a metaphor and say that A, W, X, Y, and Z all return void. Once the A is placed, we can’t add anything else to the puzzle on that branch. None of the Lego bricks on the right suffer from the void issue and so the ways in which they can be combined are limitless.
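The same idea can be sketched in code. This is only an illustration: compose, trim, countChars, isEven, and addToTotal are made-up names, not something from the presentation. Functions that return values keep snapping together, while a function that returns void ends the chain.

```typescript
// All names below are hypothetical, chosen only to illustrate the point.

// Generic composition: the output of f becomes the input of g.
function compose<A, B, C>(f: (a: A) => B, g: (b: B) => C): (a: A) => C {
  return (a) => g(f(a));
}

const trim = (s: string): string => s.trim();
const countChars = (s: string): number => s.length;
const isEven = (n: number): boolean => n % 2 === 0;

// Each piece returns a value, so pieces keep connecting.
const trimmedLength = compose(trim, countChars);
const hasEvenLength = compose(trimmedLength, isEven);

// A void-returning "piece": its only purpose is the side effect on total.
let total = 0;
const addToTotal = (n: number): void => {
  total += n;
};

// We can end a chain with it...
const recordLength = compose(trimmedLength, addToTotal);
// ...but nothing useful can follow, because there is no value to pass on.
// compose(recordLength, isEven) is a type error: isEven needs a number, not void.
```

Like the puzzle piece A, recordLength can only sit at the end of a branch.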
Information Loss
Let’s look at our motivating example again and really look at what we’re doing.
What happens to all of that data from fs.writeFile? What if one of the writes fails? What if we wanted to wait for all of those files to be written before we do something else? Now how about if we wanted to write all of those files in parallel, or in sequence? None of those options exist when we drop the information returned from fs.writeFile; we’re at the mercy of the interpreter to decide what happens.
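The original example was shown as an image, so here is a rough reconstruction of the pattern rather than the exact code. To keep it runnable without touching the disk, it assumes a hypothetical in-memory stand-in called fakeWriteFile in place of fs.writeFile; the file list is made up as well.

```typescript
// Hypothetical stand-in for fs.writeFile: same callback-last shape, no real disk.
type WriteCallback = (err: Error | null) => void;
const disk: { [path: string]: string } = {};

function fakeWriteFile(path: string, contents: string, callback: WriteCallback): void {
  if (path === "") {
    callback(new Error("empty path")); // simulated failure
    return;
  }
  disk[path] = contents;
  callback(null);
}

interface FileToWrite {
  path: string;
  contents: string;
}

const files: FileToWrite[] = [
  { path: "a.txt", contents: "A" },
  { path: "", contents: "this write fails" },
  { path: "c.txt", contents: "C" },
];

// forEach returns void and the callback ignores its error argument,
// so the failed write in the middle disappears without a trace.
const result = files.forEach((file) =>
  fakeWriteFile(file.path, file.contents, () => {})
);
```

After this runs, result is undefined and nothing in the program knows that one of the three writes failed.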
Typing that data
The first step to keeping the information is to create a type that will contain all of it. With fs.writeFile we have a couple of different effects happening.
- We have asynchronous processing because the last argument is a callback. We can easily model that with a Promise<A>.
- We have the concept of success and failure because Node-style callbacks take an Error argument followed by an optional Success argument (for writeFile the callback receives only the Error). Either<L, R> is a common type to handle potential failure cases. In this definition, the L is the type that we want to use to signify an Error and the R is the type we’d like to use to signify Success.
By combining these two types into Promise<Either<L, R>> we’re able to capture all of the information from fs.writeFile in a single, generic type. In fact, almost all of the Node callback style functions can be modeled with this type, which we’ll see in just a bit.
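To make the combined type concrete, here is a minimal sketch. It assumes a hand-rolled Either (not the fp-ts one) and a hypothetical in-memory fakeWriteFile standing in for fs.writeFile; writeFileP is likewise a made-up name.

```typescript
// Hand-rolled Either for illustration only; it is not the fp-ts implementation.
type Either<L, R> =
  | { tag: "Left"; value: L } // failure
  | { tag: "Right"; value: R }; // success

function left<L, R>(value: L): Either<L, R> {
  return { tag: "Left", value: value };
}
function right<L, R>(value: R): Either<L, R> {
  return { tag: "Right", value: value };
}

// Hypothetical in-memory stand-in for fs.writeFile (Node-style callback last).
const disk: { [path: string]: string } = {};
function fakeWriteFile(
  path: string,
  contents: string,
  callback: (err: Error | null) => void
): void {
  if (path === "") {
    callback(new Error("empty path"));
    return;
  }
  disk[path] = contents;
  callback(null);
}

// The Promise models "later"; the Either models "failed or succeeded".
// The Promise never rejects: a failure is an ordinary, typed Left value.
function writeFileP(path: string, contents: string): Promise<Either<Error, void>> {
  return new Promise<Either<Error, void>>((resolve) => {
    fakeWriteFile(path, contents, (err) => {
      resolve(err ? left<Error, void>(err) : right<Error, void>(undefined));
    });
  });
}

const written = writeFileP("a.txt", "A"); // resolves to a Right
const failedWrite = writeFileP("", "oops"); // resolves to a Left, not a rejection
```

Both outcomes arrive through the same resolved value, so the caller has to inspect the tag instead of hoping nothing went wrong.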
Also, while it might be tempting to use the catch from Promise<A> to handle the failure, we wouldn’t have type safety because Promise<A> doesn’t allow us to define a type for the error; it only allows us to type the “success” case. Nesting an Either<L, R> allows us to provide 2 types to maintain type safety. In this way Promise is similar to a functor (it can generate 1 distinct type) while Either is both a functor and a bifunctor (it can generate 2 distinct types).
Pinpointing the void
So when we look at writeFile and think it returns void, let’s pinpoint where in our new type that void exists. If we think about it, it’s really the type that is returned after the asynchronous effect if the effect was successful, which makes it the R in Promise<Either<L, R>>. Filling in our generics with our actual types (doesn’t this look just like passing values to a function?) we get our concrete type of Promise<Either<NodeJS.ErrnoException, void>>.
Leveraging fp-ts
It is entirely possible to write our own implementation of Either but that’s for another post. Also the amazing fp-ts library already has this exact type created for us as a single type rather than a nested one! Let’s see how we can use it to begin preserving information.
Thanks to the taskify function from fp-ts we can convert a void returning function with a void returning callback into a function that returns a TaskEither<NodeJS.ErrnoException, void>. No data loss there!
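To show roughly what such a conversion does, here is a simplified, hand-written cousin of taskify. It is an assumption-laden sketch, not the fp-ts source: taskifyLike, NodeCallback, and fakeWriteFile are hypothetical names, it only handles two-argument functions, and TaskEither is modeled as a plain function returning a Promise of an Either.

```typescript
type Either<L, R> = { tag: "Left"; value: L } | { tag: "Right"; value: R };
// A lazy description of async work that may fail; nothing runs until it is called.
type TaskEither<L, R> = () => Promise<Either<L, R>>;

type NodeCallback<R> = (err: Error | null, result?: R) => void;

// Hypothetical, simplified taskify for functions of the shape (a, b, callback).
function taskifyLike<A, B, R>(
  f: (a: A, b: B, cb: NodeCallback<R>) => void
): (a: A, b: B) => TaskEither<Error, R> {
  return (a, b) => () =>
    new Promise<Either<Error, R>>((resolve) => {
      f(a, b, (err, result) => {
        const outcome: Either<Error, R> = err
          ? { tag: "Left", value: err }
          : { tag: "Right", value: result as R };
        resolve(outcome);
      });
    });
}

// Hypothetical in-memory stand-in for fs.writeFile.
const disk: { [path: string]: string } = {};
function fakeWriteFile(path: string, contents: string, cb: NodeCallback<void>): void {
  if (path === "") {
    cb(new Error("empty path"));
    return;
  }
  disk[path] = contents;
  cb(null);
}

const writeFileTask = taskifyLike(fakeWriteFile);

// Building the task writes nothing; it only describes the work.
const pendingWrite = writeFileTask("a.txt", "A");
```

Calling pendingWrite() is what actually starts the write, and the Promise it returns resolves to a Left or a Right instead of dropping the outcome.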
A journey through type simplification
Now let’s see how we can preserve our information by using some well-known functions to adjust our types and solve the initial need of writing multiple files to disk without losing information. This will take several refactoring steps, but hang in there because the result is far less code than it may seem as you read through.
First we define an interface and a list of sample files and then we map over the files, turning each into a TaskEither. That leaves us with an array of asynchronous processes though, and we’d probably prefer to have an asynchronous process of an array (TaskEither<L, Array<A>> rather than Array<TaskEither<L, A>>). Once again the fp-ts library has us covered.
Changing the sequence of our types
The sequence function swaps the order of our types for us, which is exactly what we wanted to do. Now we have a computation that will run all of the tasks in parallel and wrap them all inside a single TaskEither. If you’ve used Promise.all then you’ve already seen a very specific version of sequence that only works for Promise values in an Array.
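As a rough picture of that swap, here is a hand-written sequence specialized to arrays of tasks. It is a sketch under assumptions, not the fp-ts function: sequenceTasks and fakeTask are hypothetical names, TaskEither is modeled as a function returning a Promise of an Either, and Promise.all is used to start every task together.

```typescript
type Either<L, R> = { tag: "Left"; value: L } | { tag: "Right"; value: R };
type TaskEither<L, R> = () => Promise<Either<L, R>>;

// Array<TaskEither<L, A>>  ->  TaskEither<L, Array<A>>
function sequenceTasks<L, A>(tasks: Array<TaskEither<L, A>>): TaskEither<L, A[]> {
  return () =>
    Promise.all(tasks.map((task) => task())).then((results): Either<L, A[]> => {
      const values: A[] = [];
      for (const r of results) {
        if (r.tag === "Left") {
          return r; // report the first failure in array order
        }
        values.push(r.value);
      }
      return { tag: "Right", value: values };
    });
}

// Hypothetical helper: a task that counts when it starts, then yields a fixed outcome.
let started = 0;
function fakeTask<L, A>(outcome: Either<L, A>): TaskEither<L, A> {
  return () => {
    started += 1;
    return Promise.resolve(outcome);
  };
}

const tasks: Array<TaskEither<Error, number>> = [
  fakeTask<Error, number>({ tag: "Right", value: 1 }),
  fakeTask<Error, number>({ tag: "Right", value: 2 }),
  fakeTask<Error, number>({ tag: "Right", value: 3 }),
];

// One task describing all of the work; still nothing has started.
const allTasks = sequenceTasks(tasks);
```

Note that in this sketch every task is started even when one of them fails; only the combined result reports the first Left.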
Map + Sequence = Traverse
It turns out that the combination of calling map and then sequence is very common and is called traverse. Switching to traverse will allow us to simplify our 2 function calls into 1 to produce the same output.
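The relationship can be spelled out directly. The sketch below is not the post’s original listing and not the fp-ts API; sequenceTasks, traverseTasks, writeFileTask, and the in-memory disk are hypothetical stand-ins used to show that traverse is just map followed by sequence.

```typescript
type Either<L, R> = { tag: "Left"; value: L } | { tag: "Right"; value: R };
type TaskEither<L, R> = () => Promise<Either<L, R>>;

function sequenceTasks<L, A>(tasks: Array<TaskEither<L, A>>): TaskEither<L, A[]> {
  return () =>
    Promise.all(tasks.map((task) => task())).then((results): Either<L, A[]> => {
      const values: A[] = [];
      for (const r of results) {
        if (r.tag === "Left") {
          return r;
        }
        values.push(r.value);
      }
      return { tag: "Right", value: values };
    });
}

// traverse = map each item to a task, then sequence the resulting array.
function traverseTasks<A, L, B>(
  items: A[],
  f: (a: A) => TaskEither<L, B>
): TaskEither<L, B[]> {
  return sequenceTasks(items.map((item) => f(item)));
}

interface FileToWrite {
  path: string;
  contents: string;
}

// Hypothetical in-memory writer that already returns a lazy task.
const disk: { [path: string]: string } = {};
function writeFileTask(file: FileToWrite): TaskEither<Error, void> {
  return () => {
    if (file.path === "") {
      const failure: Either<Error, void> = { tag: "Left", value: new Error("empty path") };
      return Promise.resolve(failure);
    }
    disk[file.path] = file.contents;
    const success: Either<Error, void> = { tag: "Right", value: undefined };
    return Promise.resolve(success);
  };
}

const files: FileToWrite[] = [
  { path: "a.txt", contents: "A" },
  { path: "b.txt", contents: "B" },
];

// A single call replaces files.map(writeFileTask) followed by sequenceTasks(...).
const writeAll = traverseTasks(files, writeFileTask);
```

writeAll has the type TaskEither<Error, void[]>: one asynchronous process that either fails with an Error or succeeds for every file.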
And there you have it. We have now maintained all of our information until the very last bit of our processing (lines 9 and 10 of the original listing). Also, look at how we are now forced to deal with the exception case. There’s no way to get the resulting value in line 10 without first providing the failure function in line 9.
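The “forced to deal with the exception case” part can be illustrated with a small fold function. This is a hand-rolled sketch with hypothetical names (fold, summarize); the fold in fp-ts has a different call shape, but the idea of supplying the failure handler alongside the success handler is the same.

```typescript
type Either<L, R> = { tag: "Left"; value: L } | { tag: "Right"; value: R };

// The only way to get a T out is to say what happens in BOTH cases.
function fold<L, R, T>(
  e: Either<L, R>,
  onLeft: (l: L) => T,
  onRight: (r: R) => T
): T {
  return e.tag === "Left" ? onLeft(e.value) : onRight(e.value);
}

const summarize = (outcome: Either<Error, string[]>): string =>
  fold(
    outcome,
    (err) => "failed: " + err.message, // failure handler comes first
    (paths) => "wrote " + paths.length + " files" // success handler second
  );

const ok: Either<Error, string[]> = { tag: "Right", value: ["a.txt", "b.txt"] };
const bad: Either<Error, string[]> = { tag: "Left", value: new Error("disk full") };

console.log(summarize(ok)); // wrote 2 files
console.log(summarize(bad)); // failed: disk full
```

Leaving out either handler is a compile-time error, which is exactly the information a void return would have let us ignore.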
Wrapping up
Because we went through several refactoring steps to get to this point, you may think that this approach has created tons more code. Here’s the code in its entirety to prove that we stayed very lean, even though it includes a lot of supporting code that wasn’t in the initial screenshot.
Our basic steps were to define a type that retained all of the information and then reorganize our nested types until they were in the order we wanted to process them.