How to write useful exceptions

The function of exceptions is to mark an exceptional state of the program. Generally, when a method finds something that it is unable to handle (e.g. wrong abstraction level, not its responsibility, any state outside its weakest precondition), it throws an exception. Development should focus on the ‘happy path’ of the program. Handling edge cases rarely produces value.

When developers find themselves in a situation far from the ‘happy path’, they will write:

throw new Error("WTF");
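For contrast, here is a minimal sketch of a throw that at least states which precondition was violated and with what value. The class `Percentages` and the method `toFraction` are hypothetical names invented for this illustration; they do not come from any library.

```java
// Hypothetical helper, used only to illustrate a precondition check.
final class Percentages {
    private Percentages() {}

    // Precondition (an assumption of this sketch): 0 <= percent <= 100.
    static double toFraction(int percent) {
        if (percent < 0 || percent > 100) {
            // Say what was expected and what actually arrived.
            throw new IllegalArgumentException(
                "percent must be between 0 and 100 inclusive, but was " + percent);
        }
        return percent / 100.0;
    }
}
```

A caller passing 101 gets a message that names the rule and the offending value, which is already more than “WTF” offers.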

The problem with exceptions

When your app dies, whispering a stack trace with its last breath, you’ll try to answer the following questions:
 1. What happened?
 2. Was it my fault?
 3. How can I fix it?

You skim the stack trace, read the exception messages, search for the frames that belong to your app, try to work out how the crash happened, and then, at last, figure out how you can avoid it. Ideally, it goes like this:

> "Hello world".charAt(20);
java.lang.StringIndexOutOfBoundsException: String index out of range: 20
at java.lang.String.charAt(String.java:646)

And you say, “Oh sure, that string can be shorter than 20 characters!” You handle it somehow, and all’s well that ends well. However, in most cases the process is not this straightforward:

  • Meaningless exception message
    Often you get a terse message such as ‘Operation failed!’. A message like this carries no useful information and does not help in any situation.
  • Wrong abstraction level
    This happens when a library does not handle an exception in its own code and lets it escape, leaking the abstraction layer. Interpreting such exceptions forces you to dig deep into the library’s code, derailing your own development.
  • No context
    Imagine a setup where one part of your app is responsible for initializing a component and another part uses it, for example when you create a database connection and later want to run a query.
    Something fails during initialization, and the table you want to query is missing. When your app later tries to run that query, it fails with a message like ‘Table xy is missing!’. That’s not totally useless, but it doesn’t get to the root of the problem. Ideally it should tell you why the table is missing, but that information was available only during initialization, and now, when the query is failing, it’s impossible to tell the cause.
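The ‘No context’ scenario can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `ReportStore`, `StoreUnavailableException`, `initialize`, `countRows` and the simulated schema-script failure are all hypothetical, and no real database is involved. The idea is to keep the initialization failure around and attach it as the cause when a later query fails, so the stack trace carries both the symptom and the root cause.

```java
import java.io.IOException;

// Hypothetical domain-level exception: speaks the caller's language
// and keeps the low-level failure as its cause.
final class StoreUnavailableException extends RuntimeException {
    StoreUnavailableException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical component that is initialized in one place and queried in another.
final class ReportStore {
    private final String tableName;
    // Remembered so that a later query can report why the table is missing.
    private final Exception initFailure;

    private ReportStore(String tableName, Exception initFailure) {
        this.tableName = tableName;
        this.initFailure = initFailure;
    }

    // 'schemaScriptReadable' simulates whether table creation can succeed.
    static ReportStore initialize(String tableName, boolean schemaScriptReadable) {
        try {
            createTable(tableName, schemaScriptReadable);
            return new ReportStore(tableName, null);
        } catch (IOException e) {
            // Do not swallow the failure: store it for later diagnostics.
            return new ReportStore(tableName, e);
        }
    }

    private static void createTable(String tableName, boolean schemaScriptReadable)
            throws IOException {
        if (!schemaScriptReadable) {
            throw new IOException("cannot read schema script for table '" + tableName + "'");
        }
    }

    int countRows() {
        if (initFailure != null) {
            throw new StoreUnavailableException(
                "Cannot query table '" + tableName + "': it was never created because "
                    + "initialization failed earlier (see cause).",
                initFailure);
        }
        return 0; // stand-in for running a real query
    }
}
```

Calling `ReportStore.initialize("reports", false).countRows()` throws a `StoreUnavailableException` whose cause is the original `IOException`. Even so, this only helps when the author anticipated the failure, which is exactly the effort that tends to get skipped.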

Why it is hard to handle exceptions right

There are two main pitfalls that render stack traces a cumbersome resource for bugfixing.

First, development focuses on the ‘happy path’, and proper error handling has low priority. Writing a meaningful exception message, keeping your user in mind, and thinking about the exceptional situation at the right abstraction level all require significant cognitive effort, because they are a heavy context switch from low-level coding.

Second, even if the developer takes the time to refine their error handling, writing the solution in the exception message might not be possible during development. Take the following example that happened to us:

We were playing with a new feature and reused an older virtual machine to mirror the production server. We started our application, but it failed with this error:

java.lang.NullPointerException: null
at org.elasticsearch.transport.netty.MessageChannelHandler.handleException(MessageChannelHandler.java:206) ~[elasticsearch-2.1.0.jar:2.1.0]
at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:201) ~[elasticsearch-2.1.0.jar:2.1.0]
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:136) ~[elasticsearch-2.1.0.jar:2.1.0]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.10.5.Final.jar:na]
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.10.5.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]

As you may have guessed, this stack trace alone will not guide anyone to the solution, because many things could have gone wrong. For example:

  • Elasticsearch is not running,
  • we mistyped an entry in the config,
  • there is a bug in the client, or something else entirely.

How Samebug fixes the situation

What would have been a good exception message in the previous case?

Something like ‘Make sure the Elasticsearch server is of version 2.0.0 or above!’ would have been great, but realistically, no developer is expected to make a library this safe. Moreover, it’s not the responsibility of the library to check this. On the other hand, when you hit an exception, you want the shortest path from the stack trace to the solution.

This is what Samebug aims to do.

The stack trace itself is too verbose, yet it often lacks useful information. Many times the actual meaning of an exception is not realized during development, but only later, when the library is in use. Changing the exception message would require an iteration in the product lifecycle. Before that could even happen, a number of people have already seen that stack trace and most likely have some kind of solution to the problem.
Samebug wants to connect these people and let them write short tips, possible solutions or workarounds related to an exception. In the future, when Samebug users get an exception, they won’t have to waste time digging deep into the stack trace; they can simply read a few short, Twitter-style solutions.

Conclusion

Exception handling is not in focus during development, and exception messages are not always the best way to communicate the solution to the underlying problem. Samebug will separate these two concerns, developing and using library code, making debugging easier and faster.
