Correct Error Handling is Hard, Part 2

Following my last post “Correct Error Handling is Hard” I had an interesting discussion on Twitter and got interesting comments in the comments section, which I want to use as an opportunity for another post.

In that previous post I introduced a helper type called SafeCloser, which helps to correctly close io.Closer objects in Go exactly once.

In the follow-up discussions it was pointed out to me that os.File can handle multiple closes gracefully and that in fact none of the Go code on GitHub panics when calling Close twice.

For os.File this is true. Right now. In Go version 1.8. We don’t know if that is true for future versions.

And while a lot of code is hosted on Github, what about all those proprietary code repositories hosted within companies? Can you also assume they all handle multiple closes gracefully?

The fact that it is neither guaranteed through io.Closer’s documentation nor in os.File’s, and very likely not through any other I/O object’s documentation, means that you cannot count on it. Yes, it’s unlikely, but in a future version of Go, os.File’s Close() method might panic when called twice.

And just to illustrate that changes like this in new Go versions are in fact not that unlikely, take e.g. net/http’s NewRequest function. This function had a breaking change in Go version 1.8, which was not documented. What happened?

In versions prior to 1.8, when you passed in a bytes.Buffer as body to NewRequest, it would only read that buffer when the request was actually executed. With version 1.8, the buffer was read in the NewRequest call already. It’s a minor change, but it’s enough if your logic depends on the fact, that the buffer is read lazily.

Similar things could happen in os.File Close() or other Close() methods.

“Not a problem”, you could argue, “tests will catch it.” Maybe. Maybe not. The panic could come from a subtle bug which only triggers once in a million. So it easily passes all your tests. Unfortunately, according to Murphy’s Law, “Anything that can go wrong will go wrong.”

Once your production system starts serving 100 requests per second, this problem will arise roughly every 3 hours. Add to this a couple of other issues in your production system, and you will have a lot of work in front of you digging yourself out of that hole. You better have goodies like a log archiver and visualizer and a fast deployment cycle, otherwise this can become a real nightmare.

For io.Closer itself, I wish the documentation would be updated from

The behavior of Close after the first call is undefined. Specific implementations may document their own behavior.

to be more specific. I.e. either say

Callers must not call Close more than once. Otherwise the behavior is undefined, implementations may choose to panic.

or

Implementations of Close must be idempotent, i.e. callers can call it as often as they want. After the first call it must be a no-op.

After all, how does an interface help, when the contract is not the same across different implementations? Unfortunately, making either change would break code, so it’s difficult to introduce.

In my experience it’s better to just avoid any of this trouble and use a construct like SafeCloser or any other that accomplishes the same.

Like what you read? Give Peter Goetz a round of applause.

From a quick cheer to a standing ovation, clap to show how much you enjoyed this story.