This post is a follow-up to one of my Facebook conversations. Exception handling is one of those topics almost everyone understands differently. But any code we write relies on exception handling in some way, so understanding the details of this process is important. That’s why I finally made myself write it :)
The most common questions on exception handling:
- Which exceptions does a particular piece of code (or method call) throw?
- Which of these must be explicitly handled?
- Are there some exceptions that might be thrown on almost any call?
- When should I use “try-catch-rethrow”, and when “try-finally”?
- Which language keywords somehow use exception handling?
- To throw or not to throw?
- When is a custom exception type needed?
- And finally, what’s the reasoning behind all these rules?
I’ll try to answer all these questions, so let’s start.
Which exceptions may a particular method call throw?
The short answer: assume it can throw anything, no matter what you call. What can be thrown is not important; you should care only about what you can handle.
- Even if you know the precise list of exceptions it throws now, its own implementation, or the implementation of one of its dependencies, may change in the future.
- If you call a virtual method (interface methods included), you don’t know which of its implementations is going to be invoked.
- Almost any piece of code can throw OutOfMemoryException, StackOverflowException, ThreadAbortException, etc.
Of course, if you call a very simple method, such as int.Parse(…), you can more or less safely assume that it throws only its documented exceptions plus the set of exceptions that could be thrown by any code. But you shouldn’t make any such assumptions for even moderately complex methods you call.
Also note that interface / virtual method calls are very frequent in apps relying on Inversion of Control / Dependency Injection (in other words, in any apps built by experienced engineers): since the actual implementations of dependencies are injected into the constructor / properties, almost any class you write uses interfaces of these dependencies — so you never know the exact implementation.
All of this sounds crazy, right? I mean, how am I even supposed to write code if any piece of code I write can throw anything?
The good side is: there are other rules that actually make all of this easy and logical. Let’s look at one of them:
Which exceptions must be explicitly handled?
First, let’s define what “to handle an exception” means. Actually, there are just 3 possible cases you should keep in mind:
- You want to catch and suppress an exception
- You want a part of your code to perform a set of actions regardless of whether an exception was thrown or not
- You want to either mask an exception, i.e. catch it and throw another one, or catch and rethrow it.
If your case doesn’t fall into one of these 3 categories, most likely you’re doing something wrong: other possible scenarios (e.g. when you need to take an action only for a certain kind of exception but still rethrow it) are necessary in maybe just 0.1% of all the cases.
Now let’s take a closer look at these 3 patterns.
1. Catch-and-suppress blocks
All these cases are similar to this one:
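The original listing is not reproduced here. As a minimal sketch of such a block (the MyIntParse name is taken from the text further down; the int? return type and returning null on failure are assumptions made for illustration):

```csharp
using System;

public static class ParsingExamples
{
    // Hypothetical helper: returns null instead of throwing when the
    // text is not a valid 32-bit integer.
    public static int? MyIntParse(string source)
    {
        try
        {
            return int.Parse(source);
        }
        catch (FormatException)
        {
            // Not a number at all, e.g. "abc".
            return null;
        }
        catch (OverflowException)
        {
            // A number, but outside the range of int.
            return null;
        }
        // ArgumentNullException is deliberately not caught:
        // passing null is a mistake in the caller's code.
    }
}
```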
As you can see, we look for specific errors here (FormatException, OverflowException).
How do we know which errors we should look for?
- Either from the documentation — in this case, from https://msdn.microsoft.com/en-us/library/b3h1hf19(v=vs.110).aspx
- Or from the source code. If you use Visual Studio with ReSharper or Project Rider, it’s a single Ctrl-Click, even if you don’t have the source. Otherwise you can use dotPeek (free decompiler for .NET)— it’s a bit slower there, but still quite fast (Go To Anywhere and type the name).
Note that there is a 3rd kind of exception that can be thrown by int.Parse:
- ArgumentNullException: we don’t handle it, because if it happens, most likely there is a mistake in the caller’s code (it passes us null instead of a valid string), and thus the caller deserves to see the real error.
So when do you need a “catch” block? The key question you should ask is: can you do anything meaningful to gracefully handle some of possible errors in this specific call?
- If you can, you should check out what kinds of exceptions can be thrown there
- If you can’t, you shouldn’t do anything.
Also note that the example I provided here is actually an example of wrong use of int.Parse: this method has a sibling allowing us to write the same code as:
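Again the original listing is missing, so this is only a sketch under the same assumptions as before (a hypothetical MyIntParse returning int?, with null meaning “not parsed”):

```csharp
using System;

public static class TryParseExamples
{
    // Same contract as the try-catch sketch for non-null input,
    // but no exception is thrown or caught on bad input.
    public static int? MyIntParse(string source)
    {
        return int.TryParse(source, out var value) ? value : (int?)null;
    }
}
```

One difference worth keeping in mind: int.TryParse returns false for a null string instead of throwing ArgumentNullException, so this variant hides that caller mistake.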
TryParse returns true when parse succeeds, and false otherwise. So in fact, MyIntParse is redundant.
And here we come to another interesting question: why are there two “flavors” of int.Parse? Or in other words, if I have to write a method that throws some exceptions, when do I need to provide a similar overload that doesn’t throw?
This naturally leads us to…
The cost of throwing an exception
You should know two points here:
- Throwing an exception is an extremely expensive operation in most statically typed languages.
- On the contrary, try-catch-finally itself is extremely cheap: e.g. in C# its cost is close to zero.
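To get a feel for both points, here is a rough micro-benchmark sketch. It is not the benchmark the numbers in this post come from: the class name, method names, and iteration count are placeholders, and Stopwatch-based timing like this is only indicative.

```csharp
using System;
using System.Diagnostics;

public static class ThrowCostSketch
{
    // Throws and catches FormatException on invalid input.
    public static int ParseWithTryCatch(string s)
    {
        try { return int.Parse(s); }
        catch (FormatException) { return 0; }
    }

    // Never throws on invalid input.
    public static int ParseWithTryParse(string s)
    {
        return int.TryParse(s, out var value) ? value : 0;
    }

    // Times `iterations` calls of `parse` on the same input.
    public static TimeSpan Measure(Func<string, int> parse, string input, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        var sum = 0;
        for (var i = 0; i < iterations; i++)
            sum += parse(input);
        stopwatch.Stop();
        GC.KeepAlive(sum); // keep the result observable so the loop is not optimized away
        return stopwatch.Elapsed;
    }

    public static void Run()
    {
        const int n = 100_000;
        Console.WriteLine($"try-catch, valid input:   {Measure(ParseWithTryCatch, "123", n)}");
        Console.WriteLine($"TryParse,  valid input:   {Measure(ParseWithTryParse, "123", n)}");
        Console.WriteLine($"try-catch, invalid input: {Measure(ParseWithTryCatch, "abc", n)}");
        Console.WriteLine($"TryParse,  invalid input: {Measure(ParseWithTryParse, "abc", n)}");
    }
}
```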
You can try it online here. The output of this code:
As you see, try-catch version is ~ 150 … 300x slower when there is an error, otherwise it’s similarly fast. In other words, you can throw and catch only about 90K exceptions per second in a single thread on CoreCLR, and only about 50K / second — on “vanilla” .NET.
The presence of try-catch doesn’t add anything to the overall time. That’s expected: try-catch-finally only leads to a slightly different layout of the code generated by the JIT compiler (there are more jmp instructions), but it’s still the same code. The addresses of the “catch” and “finally” blocks to process are identified only when an exception is thrown, during the stack walk (which is necessary anyway to capture the stack trace). The return address in each stack frame is enough to identify the address of the corresponding exception handling code, and the .NET runtime has this mapping.
As a result, the version with exception handling is the fastest one — mainly, because it doesn’t allocate as much on call stack as others do, though in this case the difference is tiny: TryParse accepts a single extra pointer. But as you see, this 1ms (~ 2%) difference is still measurable.
And that’s a nice example of why exception handling is good, and how it’s supposed to be used: “try” blocks add essentially zero overhead to the normal execution flow. So if you throw exceptions only in truly exceptional cases, your performance won’t suffer at all. Based on the above measurements (50K … 90K throw-and-catch cycles per second, i.e. roughly 11 … 20 µs each), ~500 … 900 exceptions per second per thread cost about 1% of that thread’s time, and ~5000 per second cost roughly 5 … 10%. In reality, you should keep the frequency of throws to a bare minimum; later we’ll discuss more precise criteria for when an exception has to be thrown.
The cost of throwing an exception in dynamically typed languages
Interestingly, in these languages the picture is typically very different:
- Throwing an exception is nearly as cheap as returning from the function. That’s because under the hood exceptions are implemented as a special result type. Stack walk there is still required, but since stack in many of such languages is stored in heap (each stack frame is ~ a dictionary storing locals + return address), this is relatively cheap as well, because no complex reflection / RTTI is needed to figure out what’s the method associated with each of these stack frames.
- All of this comes at the cost of slower execution: basically, all the costs of exception handling are paid upfront. In particular, each try-catch there incurs a measurable cost, and method calls are much slower too.
All that I wrote here is mainly based on my past experience with CPython. One of pretty unexpected consequences of such a behavior there is that all generators (~ enumerators in C#) have a single next() method expected to return the next item from the sequence, or throw StopIteration exception, if there are no more items. And that’s a huge language design issue:
- Exception is misused here: it signals about the end of sequence, which is absolutely normal case — and moreover, a very frequent one.
- If you use an IDE with a debugger, this single thing makes “break on any exception” feature useless — unless you always tell to explicitly ignore this exception.
- Probably the worst consequence is that this single feature will slow down any JIT-based Python implementation (PyPy, IronPython, etc.): these implementations have to either rely on native exceptions and pay a huge cost for every generator, or implement non-native exception handling closer to the way CPython does it, in which case they pay an extra cost on every method call.
Anyway, this issue seems to be a very strong “showcase” explaining why it makes sense to know the fundamentals of exception handling for every developer.
2. Try-finally blocks
We covered one out of three cases I listed earlier:
- You want to catch and suppress an exception — we just discussed this
- You want a part of your code to perform a set of actions regardless of whether an exception was thrown or not (we’re here now)
So the question for this section is: when do I need a “finally” block?
The answer here is actually the simplest one: you need “finally” only when you need to take certain actions immediately after the completion of the “try” block, regardless of whether there was an error or not.
This sounds like a definition of what “finally” does, so let me clarify this further:
- The only action you should worry about is resource disposal.
- If it’s about language without garbage collection (e.g. C++), deallocation is probably #1 problem you should worry about here. If there is no “finally” and you don’t rely on auto pointers, you’re inevitably getting a memory leak on any exception.
- Languages with GC are very different though: GC will destroy any unreferenced object anyway, so it sounds like you shouldn’t do anything here. But note that GC normally doesn’t give any guarantee on how quickly an object is going to be destroyed after it becomes unreachable, and that’s mostly why finally blocks are useful there.
Resource disposal in .NET
There are two main features in .NET runtime that are dedicated to resource disposal:
- Finalizers: these methods are similar to destructors in C++, the only difference being that they are invoked automatically at some point between the moment the object becomes unreachable (i.e. should be collected by GC) and the moment it’s actually collected. The underlying implementation of this feature is more complex than it seems, and fairly expensive; later I’ll show how expensive it is. But for now it’s enough to know that a finalizer is called very reliably, i.e. if you really need to do some cleanup no matter what, you likely need a finalizer.
- IDisposable — that’s an interface with a single Dispose() method. This method is supposed to be invoked right at the point when you know for certain you won’t need the object that implements IDisposable further. The method itself is supposed to instantly dispose any resources held by the object.
- Both features are closely related: if you implement finalizer, you are always expected to implement IDisposable as well.
A few other rules related to IDisposable and finalizers are less well-known:
- Since finalizer should always go with IDisposable implementation, the only thing finalizer should do is to call Dispose. If you care about speed and inheritance, it’s better to implement even more complex pattern, but for now it’s enough to remember this.
- Ideally, the Dispose() method must render the object completely unusable. I.e. it’s a good idea to nullify all its fields inside Dispose(), or do something else to make sure that almost any method call on this object will lead to an exception. Microsoft recommends throwing ObjectDisposedException in all such cases, but in practice this requirement is not that useful: it requires a boilerplate disposal check in every method and property, which impacts performance a bit, and moreover, no one really catches ObjectDisposedException, so debugging premature disposal is the only thing you simplify by making sure this exception is always thrown. This is why most developers prefer to simply render the object unusable.
- Dispose() should never throw exceptions when called. It doesn’t mean you have to put “try-catch-everything” there; just note that normally all this method does is call other Dispose methods on fields, and they aren’t supposed to throw anything either. But it’s a very bad idea to explicitly throw an exception from Dispose. In fact, the only case where you may want to do this is when you know for sure that the state you have is now corrupted, and it’s important to signal this right now rather than wait for some (possibly hidden) problems in the future. Also note that Dispose is almost always called from “finally” blocks, so throwing an exception from Dispose means you’re going to mask the original exception (which in turn is supposed to mean that the new one you’re throwing is much more important).
- Dispose() should dispose all “owned” objects. Normally these objects are referenced by some fields, so if any object there supports IDisposable, you should dispose it from your own Dispose().
- Dispose() must support multiple invocations, i.e. not only the first call to Dispose should complete without an error, but also the second, third, etc. on the same instance. This is necessary because the order of invocation of finalizers isn’t defined, and finalizers usually call Dispose. As I wrote above, a Dispose implementation is supposed to call Dispose on all “owned” objects, so some of these owned objects may get two Dispose calls: one from the owner (when the owner’s finalizer is invoked), and another one from their own finalizer.
- When to implement a finalizer? The only case where you need it is when your object “owns” some resource that needs to be reliably disposed no matter what (i.e. even if a developer forgets to do that explicitly), and moreover, this resource doesn’t have its own finalizer. Note that the “official” rule on when you need a finalizer is a bit different: it’s necessary when you have some unmanaged resources, and as a result, even smart developers misunderstand it. Let me give you an example of when you need a finalizer even though you don’t have an unmanaged resource: you design a type requiring a temporary file to work, and when an instance of this type is created, it also creates that file. When the object is disposed, this file needs to be removed as well. The class you’ll use to access the file will almost certainly have a finalizer, but that finalizer will only ensure that the underlying OS file handle is closed on disposal, i.e. it won’t delete the file itself. Consequently, you need a finalizer here, and the only reason is this: if a developer using your class somehow forgets to dispose it, the file won’t be removed otherwise. Your finalizer will ensure that the file is eventually removed. It may happen a few minutes later, but it will happen.
- When to implement IDisposable? If you have a finalizer, or “own” other disposables (i.e. there are fields castable to IDisposable), or you need to do some extra cleanup neither of your “owned” disposables does (e.g. a file removal in previous example).
Let’s see how disposables and finalizers are supposed to be implemented:
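The original listing is not available, so here is a sketch of the temporary-file example described above. TempFileOwner is a hypothetical type, the code is not thread-safe, and it follows the simplified “the finalizer just calls Dispose” rule from this post rather than the fuller Dispose(bool) pattern:

```csharp
using System;
using System.IO;

// Hypothetical type that owns a temporary file for its whole lifetime.
public sealed class TempFileOwner : IDisposable
{
    private string _path;

    public TempFileOwner()
    {
        // Creates an empty, uniquely named file in the temp directory.
        _path = Path.GetTempFileName();
    }

    // Returns null once the object is disposed, which renders it unusable.
    public string FilePath => _path;

    // Safety net: runs only if the developer forgot to call Dispose.
    ~TempFileOwner()
    {
        Dispose();
    }

    public void Dispose()
    {
        var path = _path;
        if (path == null)
            return; // already disposed; repeated calls are no-ops
        _path = null;
        try
        {
            File.Delete(path);
        }
        catch (IOException) { }                 // Dispose must not throw
        catch (UnauthorizedAccessException) { } // same here
        // The cleanup is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }
}
```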
Now, let’s get back to “finally”: how is all of this related to finally blocks?
- If you allocate an IDisposable, and its lifetime ends inside the same method, you need try-finally; “finally” there should call Dispose for any of such disposables.
Note that IDisposable itself implements the same pattern, but for “owned” disposables referenced from object’s fields.
Now, are there any language features that simplify writing these try-finally blocks? Yes:
C# keywords relying on try-finally
There are 3 other C# keywords that use try-finally under the hood:
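The listing that belonged here is missing. The sketch below shows the three keywords next to roughly what each expands to. The expansions are approximations (the exact lowering depends on the compiler version), and all type and member names are made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class KeywordExpansions
{
    private static readonly object Gate = new object();
    private static int _counter;

    public static int Counter => _counter;

    // "lock" ...
    public static void IncrementWithLock()
    {
        lock (Gate) { _counter++; }
    }

    // ... is roughly Monitor.Enter + try-finally + Monitor.Exit.
    public static void IncrementExpanded()
    {
        var lockTaken = false;
        try
        {
            Monitor.Enter(Gate, ref lockTaken);
            _counter++;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(Gate);
        }
    }

    // "using" ...
    public static long LengthWithUsing(byte[] data)
    {
        using (var stream = new MemoryStream(data))
            return stream.Length;
    }

    // ... is roughly try-finally + Dispose.
    public static long LengthExpanded(byte[] data)
    {
        var stream = new MemoryStream(data);
        try { return stream.Length; }
        finally { if (stream != null) ((IDisposable)stream).Dispose(); }
    }

    // "foreach" ...
    public static int SumWithForeach(IEnumerable<int> items)
    {
        var sum = 0;
        foreach (var item in items) sum += item;
        return sum;
    }

    // ... is roughly GetEnumerator + try-finally + Dispose on the enumerator.
    public static int SumExpanded(IEnumerable<int> items)
    {
        var sum = 0;
        IEnumerator<int> enumerator = items.GetEnumerator();
        try
        {
            while (enumerator.MoveNext()) sum += enumerator.Current;
        }
        finally
        {
            if (enumerator != null) enumerator.Dispose();
        }
        return sum;
    }
}
```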
As you can see,
- “lock” is a shortcut for a critical section; it needs “finally”, since otherwise the lock may not be released, and you end up with a deadlock. Note that “lock” is the only one of these 3 keywords that doesn’t rely on IDisposable. The apparent reasoning is that you don’t want an extra allocation on every enter-exit, though the same is achievable with “using” + IDisposable if IDisposable is implemented by a value type (struct). So in reality the reasons are only historical: .NET “borrowed” the idea of locking on an arbitrary object from Java (where the keyword is “synchronized”), and as a result you can use Monitor.Enter/Exit with any reference type. Interestingly, no one needs this in practice: locks are typically held on private dedicated fields of Object type, and the “lock (this)” syntax is explicitly recognized as an anti-pattern. So if you’ll be designing your own language some day, please borrow patterns from other languages carefully :)
- “using” is a shortcut for “get a disposable, use it, and dispose it”. This is a really nice and quite useful feature missing in many other languages. Interestingly, Python, for example, has context managers + “with” syntax, but “using” is actually much nicer: it solves a very common problem in an absolutely minimalistic way (the only extra method you need to implement is IDisposable.Dispose). In Python, by contrast, you have to implement both __enter__ and __exit__, you are allowed to return another object from __enter__, and you must accept an exception and lots of other info in __exit__ (which in turn means that “with” requires try-catch under the hood rather than try-finally, plus locals for the exception, etc.). In short, this feature was never designed with performance in mind in Python. “using” in .NET was: it adds essentially zero overhead, and it doesn’t box value types passed as IDisposables, so if you want to use it for something like locking or “scopes”, you may rely on structs implementing IDisposable to have zero heap allocations as well.
- “foreach” is probably something you didn’t expect to see here, but it relies on the IEnumerable<T> / IEnumerator<T> API, and IEnumerator<T> inherits from IDisposable, which is why it also requires the equivalent of “using”. But why are all IEnumerator<T> instances supposed to implement IDisposable? It’s clear that some complex enumerators may need this to dispose resources right when you exit the loop (e.g. with “break”), but what about simple enumerators, e.g. for .Where or other LINQ-to-enumerable methods? In reality, it’s very logical for them as well: just imagine a case where a simple enumerator relies on a complex one under the hood, i.e. you write something like someQueryable.AsEnumerable().Where(…). Since most enumerators own other enumerators (i.e. they own disposables), it’s totally logical to make all of them implement IDisposable.
Note on finalization cost
I promised to show there is an extra cost associated with having a finalizer. A part of this cost is paid when such an instance is created:
The output (this time it’s from my Core-i7 laptop):
In plain English:
- You can allocate ~ 250M … 300M of simplest objects per second — both on CoreCLR and “vanilla” .NET 4.6.2; in fact, a bit more — there are minor extra expenses that aren’t factored out (loop itself + updating closure field).
- But if these objects have a finalizer, the performance drops to ~ 7M … 10M allocations per second, i.e. the cost of having a finalizer is huge, so you don’t want to add it “just to be safe”. If you’re adding a finalizer, you should know precisely why you need it.
- It’s not the whole cost paid for finalizers — the finalization itself (that happens in a separate thread, but still) has a comparable cost.
- CoreCLR seems to be ~ 30% slower on regular allocations, but ~ 30% faster on allocations of objects with finalizers.
Summary on try-finally
If there were no “using” keyword, your “finally” section would be responsible mainly for resource disposal. But since there is “using”, normally you won’t need even this; typically, you’ll have this code instead:
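A minimal sketch of such code (the class and method names are hypothetical; StreamReader is just a convenient standard disposable):

```csharp
using System.IO;

public static class UsingExample
{
    // The reader is disposed when the block is left,
    // whether by return or by an exception.
    public static string ReadFirstLine(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }
}
```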
So normally you’ll use “finally” for other purposes — e.g. to implement lock-unlock pattern or something similar. In reality even this is rarely necessary in .NET — mainly, because most of types requiring this kind of behavior also provide IDisposable implementation to support dispose pattern — e.g. that’s how it works with AsyncLock from AsyncEx.
This is why if your “finally” looks complex (or even exists) in .NET, there is a good chance you’re doing something wrong. Probably the most frequent case is when your code needs it because your own types require some finalization, although they don’t implement IDisposable — i.e. basically, you have to rely on “finally” instead of “using” only because you don’t know .NET well enough.
3. Try-catch-(re)throw and exception masking
To recap, that’s where we are now:
- You want to catch and suppress an exception — we discussed this
- You want a part of your code to perform a set of actions regardless of whether an exception was thrown or not (we discussed this)
- You want to either mask an exception, i.e. catch it and throw another one, or catch and re-throw it (this is where we are now).
Catch-and-rethrow case is fairly simple: it’s mostly used for logging, and for passing the information about exception to the “finally” block. Consider this example:
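The original example is missing here. As a sketch of both uses (logging, and letting “finally” know whether an error happened), assuming hypothetical body / commit / rollback / log delegates supplied by the caller:

```csharp
using System;

public static class RethrowExample
{
    public static void RunInTransaction(
        Action body, Action commit, Action rollback, Action<Exception> log)
    {
        Exception error = null;
        try
        {
            body();
        }
        catch (Exception e)
        {
            error = e; // remembered for the "finally" block
            log(e);
            throw;     // plain "throw" keeps the original stack trace
        }
        finally
        {
            // Runs in both cases; "error" tells us which one it was.
            if (error == null) commit();
            else rollback();
        }
    }
}
```

Note that catching System.Exception is acceptable here only because the exception is always rethrown.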
The last case to cover is catch-and-mask, i.e. throw some other exception instead. In general you should avoid this by all means, though there are a few rare cases where it makes sense:
- Cases similar to AggregateException, i.e. when you run a set of independent jobs, and some of them may fail. Rethrowing the error that occurred first is, of course, one of the options in such cases, but then you won’t know anything about the other exceptions. So if knowing about all the failures is more important, you should aggregate the errors and throw an exception that allows enumerating all of them.
- Cases similar to TargetInvocationException, i.e. when it’s more important to indicate that an error has occurred on the invoked side, such as a method called via reflection or a remote side (this isn’t confusing), rather than to rethrow it on your side as if it were a local error (this is definitely confusing). Note that TargetInvocationException.InnerException typically references the original error here, so in fact you don’t lose anything; moreover, if you want, you can even re-throw the inner exception manually.
I can’t remember any other well-known examples of this, which means that you should consider doing something similar only in very rare cases.
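For the first of these cases, a minimal sketch of “run everything, then report all failures at once”. RunAll is a hypothetical helper; AggregateException and its InnerExceptions property are the standard ones from the System namespace:

```csharp
using System;
using System.Collections.Generic;

public static class AggregateExample
{
    // Runs every job even if some of them fail, then throws a single
    // AggregateException wrapping all the collected errors.
    public static void RunAll(IEnumerable<Action> jobs)
    {
        var errors = new List<Exception>();
        foreach (var job in jobs)
        {
            try
            {
                job();
            }
            catch (Exception e)
            {
                // Not suppressed: it is rethrown below as part of the aggregate.
                errors.Add(e);
            }
        }
        if (errors.Count > 0)
            throw new AggregateException(errors);
    }
}
```

A caller can then inspect InnerExceptions to see every failure rather than just the first one.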
Are there some exceptions that might be thrown on almost any call?
Let’s list all of them:
- StackOverflowException, obviously. It’s interesting that with async-await you’re probably more likely to get it, because continuations there can be started right after the completion of a task, i.e. each continuation in a chain of continuations recursively awaiting each other can extend the depth of the call stack by ~10 stack frames. So I suspect the stack is exhausted a bit faster with async methods. Though I never saw this cause any issues in real-life apps :)
- OutOfMemoryException: it can’t be thrown on calls that don’t perform any allocations, but since it’s hard to tell which calls actually allocate, I recommend assuming it can be thrown on any call unless you know precisely that the call you make doesn’t allocate. An example: can dictionary.Remove(someKey) fail with OOM? In principle it can, for a variety of reasons: a) a dictionary implementation may decide to shrink its hash table (array); to do that, it allocates a smaller one first and copies the data from the old one afterwards, so the operation may require extra memory of O(dictionary.Count) size (as far as I know, the current Dictionary<TKey, TValue> doesn’t shrink on Remove, but nothing in its contract prevents it); b) someKey.GetHashCode or someKey.Equals may be implemented in such a way that they allocate, e.g. due to boxing.
- TypeLoadException: since most .NET code is loaded and compiled on demand, this exception may be thrown when the method you call has some dependencies (i.e. requires other types) that aren’t loaded yet, and .NET either can’t resolve these dependencies, or they can’t be loaded for some other reason (the file is corrupted / no permissions to read it / etc.).
- ThreadAbortException: this exception can be thrown at an almost arbitrary point if your thread is aborted. It is special because it is automatically rethrown at the end of each catch block that catches it, even if you don’t rethrow it yourself (though this behavior can be suppressed with Thread.ResetAbort).
So there are 4 such exceptions, but I suspect there might be something else :)
Should you do anything special about these exceptions? Nope. To deal with them, you should just properly place your try-finally / using / lock blocks.
To throw or not to throw?
Obviously, any throw can be replaced with a special type of result. So it’s good to know what’s preferable in a particular case — a special type of result, or an exception?
I don’t remember if I ever saw a very precise description of how to approach this — probably, because there is no “perfect” answer. The most important criteria are:
Throw an exception, if, when thrown:
- It normally can’t be meaningfully handled with try-catch, i.e. it indicates there is a mistake in the caller’s logic. NullReferenceException, ArgumentException, ArgumentNullException, OverflowException, IndexOutOfRangeException, and KeyNotFoundException are examples of such errors: they almost always mean there is some mistake in the caller’s logic. In fact, such exceptions are thrown by assertions that can be checked only at run time, though if it were possible, you’d prefer them to trigger at compile time.
- It indicates a relatively rare condition of uncontrollable nature, that makes it impossible to perform the requested operation. IsolatedStorageException, HttpRequestException and SmtpException are examples of such errors. Note that such errors may indicate there is a mistake in caller’s logic as well, though they can be thrown even if there is no error. Exceptions of this kind normally require try-catch blocks, and there is a high chance that such blocks will exist only in indirect callers — so if that’s the case, quite likely you should also think about custom exception type (I’ll write more on this further).
Return a special type of result, when:
- You can easily imagine a scenario where this type of result is going to be very frequent. int.Parse is a good example: imagine you’re the developer writing this method, and you’re thinking about how it’s going to be used. Counting all the numbers in a random text file is clearly one of the imaginable scenarios, right? So if you want your method to support this scenario, you should always opt for a special type of result: frequently thrown exceptions make debugging way more complex, and moreover, they hurt performance. When you throw, it really should be an exception, not an ordinary result. You may think that something similar is possible with e.g. HttpRequestException, say when your internet is down. But that’s a very different case: if your internet (or some server) is down, you can simply reduce the rate of requests / attempts to reduce the error rate without any negative consequences. You can’t achieve the same with int.Parse used for counting numbers in a random text file without sacrificing performance.
And finally, when in doubt, think of implementing both. In .NET such methods are normally implemented with TryXxx pattern:
- TResult Xxx(args…) — the one that throws on any errors
- bool TryXxx(args, out TResult result) — the one that throws only on “type 1” errors (i.e. when it’s clear there is a mistake in caller’s logic); otherwise it either returns false + default(TResult) (the operation can’t be performed), or true + actual result.
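A sketch of the pattern on a made-up example: a Percent type parsing strings such as “42%”. The type, its rules (0 … 100, trailing “%”), and the choice of FormatException are all assumptions for illustration:

```csharp
using System;

public static class Percent
{
    // Throws only on a "type 1" error (null argument, a caller bug);
    // otherwise returns false + default, or true + the parsed value.
    public static bool TryParse(string text, out int result)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));

        result = 0;
        if (!text.EndsWith("%", StringComparison.Ordinal))
            return false;

        int value;
        if (!int.TryParse(text.Substring(0, text.Length - 1), out value))
            return false;
        if (value < 0 || value > 100)
            return false;

        result = value;
        return true;
    }

    // Throwing flavor, implemented on top of the non-throwing one.
    public static int Parse(string text)
    {
        int result;
        if (!TryParse(text, out result))
            throw new FormatException($"'{text}' is not a valid percentage.");
        return result;
    }
}
```

Building Parse on top of TryParse (rather than the other way around) keeps the non-throwing path free of any throw-and-catch cost.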
Why can both methods be useful? Let’s think about int.Parse again:
- If you’re parsing an integer from a configuration file and there is no good way to handle a non-integer value there, you should prefer int.Parse: it will automatically cause the code reading the configuration to fail, which, in turn, simply won’t let your app start normally. IMO that’s way better than e.g. implicitly substituting zero for that integer and getting completely unexpected behavior at some later point.
- If you’re counting all the integers in text file, you should prefer int.TryParse.
Note that C# 7 also supports tuples, and I suspect returning a tuple of (bool, TResult) is a slightly faster alternative to bool + out TResult. But since no one uses this alternative just yet, I recommend sticking to the old one for now.
When is a custom exception type needed?
Almost never — you should always try to use the best fitting standard exception type first (i.e. any exception type from System namespace of mscorlib.dll assembly).
You need your own exception type, when:
- The exception you throw typically needs a dedicated try-catch handler, so it’s “type 2” exception (somewhat similar to HttpRequestException), and
- Quite likely it is going to be caught and processed by indirect callers of your code. If that’s the case and you use some shared exception type, you must be certain that try-catch handlers for this type are always identical to what your exception requires, because there is a chance that some other code invoked by an indirect caller of your code will trigger a similar exception, and since the indirect caller won’t be able to distinguish between the two, it will handle both of them the same way.
Note that System namespace doesn’t provide any standard exceptions that resemble “type 2” exceptions — i.e. all the standard exception types are “type 1” exceptions, and they cover almost all imaginable scenarios. So another indication that you need your own exception type is when you can’t find a match among standard ones.
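As a sketch of what such a type might look like: OrderStoreUnavailableException is a made-up name standing for a “type 2” condition that indirect callers may want to catch specifically, without confusing it with similar failures from unrelated code.

```csharp
using System;

// Hypothetical "type 2" exception: a rare, uncontrollable condition that
// an indirect caller may want to handle with a dedicated catch block.
public class OrderStoreUnavailableException : Exception
{
    public OrderStoreUnavailableException(string message)
        : base(message)
    {
    }

    // Keeps the low-level cause available via InnerException.
    public OrderStoreUnavailableException(string message, Exception innerException)
        : base(message, innerException)
    {
    }
}
```

An indirect caller can then write catch (OrderStoreUnavailableException) and be sure the handler fires only for this condition.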
- “Choosing the right type of exception to throw” — short article
- “Using standard exception types” — very short as well
- “Design guidelines for exceptions” — MSDN section, lots of info.
Anything else to know about exceptions?
- Do not catch System.Exception or System.SystemException — unless you’re going to rethrow it. If you do this, you’re going to catch everything, including exceptions you can’t meaningfully handle (e.g. TypeLoadException).
- Do not throw System.Exception or any other non-abstract exception type with a broad hierarchy of descendants — just think what an exception handler catching such exceptions will be catching as well.
- More similar rules can be found here, including such seemingly crazy ones as “Do not throw StackOverflowException explicitly”. I highly recommend reading this: the article is very short (there are no explanations for any of the listed rules), but if you’ve managed to read this post to this point, it should be a piece of cake for you to figure out why all these rules are very logical.