Exceptional Exceptions for Coroutines made easy…? part II: Supervision

Anton Spaans
The Kotlin Chronicle
6 min read · Jul 18, 2019

In my earlier article, we examined how exceptions are handled by Coroutines and Coroutine Scopes that run with a Job, which is the default way of running them, without any so-called supervision.

Coroutines support Structured Concurrency, in which a parent-Coroutine that runs with a Job and its child-Coroutines all live and die together:

  1. A parent-Coroutine finishes only after all its child-Coroutines have finished.
  2. When a parent-Coroutine or scope finishes abnormally, either through cancelation or through an exception, all its child-Coroutines that are still active are canceled, and new child-Coroutines can no longer be launched.
  3. When a child-Coroutine finishes abnormally, its parent-Coroutine or scope finishes abnormally.
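The first rule is easy to observe. The sketch below is only an illustration (the function name parentWaitsForChildren is made up for this example): the parent's own code block ends right away, yet its Job completes only after its child is done.

```kotlin
import kotlinx.coroutines.*

// Records the order in which the parent's body, its child and its Job finish.
fun parentWaitsForChildren(): List<String> = runBlocking {
    // runBlocking runs everything on one thread here, so a plain list is safe.
    val events = mutableListOf<String>()
    val parent = launch {
        launch {
            delay(100)
            events.add("child finished")
        }
        events.add("parent body finished")
    }
    // join() returns only when the parent Job has completed, children included.
    parent.join()
    events.add("parent job completed")
    events
}

fun main() {
    println(parentWaitsForChildren())
    // prints [parent body finished, child finished, parent job completed]
}
```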

To learn more about the details, I suggest reading part I first.

What is the problem of not using Supervision?

Let’s examine two scenarios highlighting problems that supervision can solve.

1. UI handling asynchronous requests

Say we have some UI-screens (Views, Activities — hello Android devs!, etc.) and our app uses Coroutines as the preferred way of dealing with asynchrony.

Multiple requests to a remote server can be issued during the lifetime of a UI-screen, at the initial showing of the screen or on actions initiated by the user.

We model the requests to the remote server by running them within the scope (CoroutineScope) of the UI-screen and launch these requests as (child-)Coroutines from that UI-scope. By default, these Coroutines will be the children of a parent Job. As soon as the first request fails, every later launch fails immediately as well. This happens because the failed request also fails the parent Job of the UI-scope, which can then no longer launch any Coroutines. Remember point (3.) of the list above: When a child-Coroutine finishes abnormally, its parent-Coroutine or scope finishes abnormally.

This is not the behavior we want. The failure of one request should not render future interactions with the UI useless.
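A minimal sketch of this problem, under a few assumptions: Dispatchers.Default stands in for Dispatchers.Main so it runs outside Android, the "requests" are simulated, and an empty CoroutineExceptionHandler is added only to keep the failure out of the thread's uncaught exception handler. The function name is hypothetical.

```kotlin
import kotlinx.coroutines.*

// Returns (is the scope still active?, did the second request run?)
// after the first request has failed in a scope with a plain Job.
fun secondRequestAfterFailure(): Pair<Boolean, Boolean> = runBlocking {
    // Assumption for this sketch: silently consume the exception.
    val ceh = CoroutineExceptionHandler { _, _ -> }
    // Hypothetical UI-scope that runs with a plain Job.
    val uiScope = CoroutineScope(Job() + Dispatchers.Default + ceh)

    // The first request fails, which also fails the scope's parent Job.
    uiScope.launch { throw IllegalStateException("request 1 failed") }.join()

    // The second request is launched into a scope that is no longer active.
    var secondRan = false
    uiScope.launch { secondRan = true }.join()

    Pair(uiScope.isActive, secondRan)
}

fun main() {
    val (scopeActive, secondRan) = secondRequestAfterFailure()
    println("scope active: $scopeActive, second request ran: $secondRan")
    // prints "scope active: false, second request ran: false"
}
```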

2. Failure of one must not fail all

When we launch a bunch of async Coroutines in a regular scope without supervision, the failure of one will cancel all the other active ones as well. The entire parallel operation returns a single result: either a success or a failure, nothing in between.

With supervision, the failure of one will not cause the cancelation of others. The result of the entire parallel operation that uses supervision is a list of results, each of which is either a success or a failure.

What is Supervision in Coroutines?

While still supporting Structured Concurrency, a parent-Coroutine or scope that uses supervision and its child-Coroutines live but don’t necessarily die together.

Points (1.) and (2.) of the list that described Structured Concurrency have not changed. When a parent-Coroutine finishes, all its child-Coroutines finish. However, point (3.) has changed. With supervision, it reads:

3. When a child-Coroutine finishes abnormally, its parent-Coroutine just handles it and keeps going.

In our example UI scenario, when we use supervision, the Coroutine of the failed request still fails, but the UI-scope stays alive and it can launch other Coroutines later.

We can enable supervision in a scope (CoroutineScope), by installing a SupervisorJob into its context instead of a plain Job and launching the top-most Coroutines from it:

val uiScope = CoroutineScope(SupervisorJob() + Dispatchers.Main)
...
val supervisedChild1 = uiScope.launch { ... }
...
val supervisedChild2 = uiScope.async { ... }
...

We can use supervision within an already existing Coroutine by calling supervisorScope, which runs a child-Coroutine that becomes the supervising parent of its own children.

val result = supervisorScope {
...
val supervisedChild1 = this.launch { ... }
...
val supervisedChild2 = this.async { ... }
...
}

In both examples, if one supervised child finishes with an exception, the other supervised child and the supervising parent just keep running.
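Here is a small runnable sketch of the supervisorScope variant. The function name supervisedWork and the empty handler named quiet are assumptions made for this example; the handler only keeps the failure of the first child out of the thread's uncaught exception handler.

```kotlin
import kotlinx.coroutines.*

// One supervised child fails; the other one and the supervising parent keep running.
suspend fun supervisedWork(): String = supervisorScope {
    // Assumption for this sketch: silently consume the exception of child 1.
    val quiet = CoroutineExceptionHandler { _, _ -> }
    val supervisedChild1 = launch(quiet) {
        throw IllegalStateException("child1 failed")
    }
    val supervisedChild2 = async {
        delay(100)
        "child2 result"
    }
    supervisedChild1.join()
    // Child 2 was not canceled by the failure of child 1.
    supervisedChild2.await()
}

fun main() = runBlocking {
    println(supervisedWork())
    // prints "child2 result"
}
```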

How are Exceptions handled?

We now know that a parent Coroutine or a CoroutineScope that runs with a SupervisorJob keeps running when a child-Coroutine finishes with an exception. But, how are exceptions handled by such a parent?

Exceptions are handled the same way, whether the Coroutine’s scope uses supervision or not: only a top-most Coroutine launched from a call to CoroutineScope.launch handles the exception, either by passing it to an installed CoroutineExceptionHandler or, if none is installed, by letting the Thread’s Uncaught Exception Handler process it.

val ceh = CoroutineExceptionHandler { ... }
...
val scope: CoroutineScope = ...
...
val supervisedChild = scope.launch(ceh) { ... }
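
Note that this is about launch. A top-most Coroutine started with async keeps its exception inside the returned Deferred, and the exception only shows up when await() is called. The sketch below contrasts the two; the function name handlerCalls is made up, and Dispatchers.Default is an arbitrary choice.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Returns (number of times the handler was called, message caught from await()).
fun handlerCalls(): Pair<Int, String?> = runBlocking {
    val calls = AtomicInteger(0)
    val ceh = CoroutineExceptionHandler { _, _ -> calls.incrementAndGet() }
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + ceh)

    // launch: the uncaught exception goes to the CoroutineExceptionHandler.
    scope.launch { throw IllegalStateException("from launch") }.join()

    // async: the exception is kept in the Deferred and rethrown by await().
    val deferred = scope.async<Unit> { throw IllegalArgumentException("from async") }
    val awaited = try {
        deferred.await()
        null
    } catch (e: IllegalArgumentException) {
        e.message
    }

    Pair(calls.get(), awaited)
}

fun main() {
    println(handlerCalls())
    // prints (1, from async)
}
```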

The scope variable must be a CoroutineScope that can launch top-most Coroutines. This means that:

  • The variable scope should be a newly created CoroutineScope that may or may not have a CoroutineExceptionHandler installed. Both examples below will let ceh handle any uncaught exception.
val scope = CoroutineScope(SupervisorJob() + ceh)
...
val supervisedChild = scope.launch { ... }

or

val scope = CoroutineScope(SupervisorJob())
...
val supervisedChild = scope.launch(ceh) { ... }

  • Or the variable scope should be the CoroutineScope of a call to supervisorScope. All three examples below will let ceh handle any uncaught exception.

val topScope = CoroutineScope(SupervisorJob() + ceh)
topScope.launch {
...
...
supervisorScope { // 'this' is a supervising CoroutineScope
...
this.launch { ... }
}
...
...
}

or

val topScope = CoroutineScope(SupervisorJob())
topScope.launch {
...
...
supervisorScope {
...
this.launch(ceh) { ... }
}
...
...
}

or even this (not recommended)

val topScope = CoroutineScope(SupervisorJob())
topScope.launch {
...
...
withContext(ceh) {
...
supervisorScope {
...
this.launch { ... }
}
...
}
...
}

Code Examples

Let’s look at some code examples.

The examples are hosted on https://play.kotlinlang.org. They are live, and you can run them there by hitting the green play button.

The example below has a Coroutine for child1 that does not catch the exception thrown by the call to oops().

When providing a Job, scope finishes because child1 finishes with the exception. That means that child2 gets canceled and child3 never launches.

When providing a SupervisorJob, scope does not finish. That means that child2 keeps running until its end and child3 will get launched.
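
A minimal sketch of such an example follows. It is not the original embed: the delays, the event strings, the function name runChildren and the use of Dispatchers.Default are assumptions made for this illustration.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

fun oops(): Nothing = throw IllegalStateException("oops")

// Runs child1, child2 and child3 in a scope with the given parent job
// and returns what happened, sorted alphabetically.
fun runChildren(parentJob: CompletableJob): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()
    val ceh = CoroutineExceptionHandler { _, e -> events.add("handled ${e.message}") }
    val scope = CoroutineScope(parentJob + Dispatchers.Default + ceh)

    val child2 = scope.launch {
        try {
            delay(300)
            events.add("child2 finished")
        } catch (e: CancellationException) {
            events.add("child2 canceled")
            throw e
        }
    }
    // child1 does not catch the exception thrown by oops().
    val child1 = scope.launch {
        delay(100)
        oops()
    }
    child1.join()
    // child3 is launched after child1 has failed.
    val child3 = scope.launch { events.add("child3 ran") }
    joinAll(child2, child3)
    events.sorted()
}

fun main() {
    println(runChildren(Job()))
    // prints [child2 canceled, handled oops]
    println(runChildren(SupervisorJob()))
    // prints [child2 finished, child3 ran, handled oops]
}
```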

Below are two examples showing how CoroutineExceptionHandlers can be installed to handle exceptions thrown by Coroutines.

The Coroutine of supervisorScope uses supervision. Even though one of its children finishes with an exception (grandChild1_1), it keeps on running. The Coroutine that called five() will, therefore, keep running as well and successfully obtain the value 5.

The first example installs the cehParent into the outer scope. This handler is then inherited by the call to supervisorScope. The cehParent will handle the exception that is not caught by grandChild1_1.

The second example installs the cehChild into the top-most scope of supervisorScope. Even though that scope inherits the outer cehParent, cehChild overrides it and handles the exception not caught by grandChild1_1.
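
Both variants can be sketched in one piece of code. This is a hedged reconstruction, not the original embeds: the parameter of five(), the helper whoHandles and the choice of Dispatchers.Default are assumptions of this sketch.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicReference
import kotlin.coroutines.EmptyCoroutineContext

// One grandchild fails under supervision; five() still returns 5.
// If cehChild is given, it is installed into the failing grandchild.
suspend fun five(cehChild: CoroutineExceptionHandler?): Int = supervisorScope {
    val grandChild1_1 = launch(cehChild ?: EmptyCoroutineContext) {
        throw IllegalStateException("oops")
    }
    val grandChild1_2 = async {
        delay(100)
        5
    }
    grandChild1_1.join()
    grandChild1_2.await()
}

// Returns the value of five() and the name of the handler that got the exception.
fun whoHandles(useChildHandler: Boolean): Pair<Int, String> = runBlocking {
    val handledBy = AtomicReference("nobody")
    val cehParent = CoroutineExceptionHandler { _, _ -> handledBy.set("cehParent") }
    val cehChild = CoroutineExceptionHandler { _, _ -> handledBy.set("cehChild") }
    // cehParent is installed into the outer scope and inherited by supervisorScope.
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + cehParent)
    val value = scope.async { five(if (useChildHandler) cehChild else null) }.await()
    Pair(value, handledBy.get())
}

fun main() {
    println(whoHandles(useChildHandler = false))
    // prints (5, cehParent)
    println(whoHandles(useChildHandler = true))
    // prints (5, cehChild)
}
```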

The next example shows how we could install a SupervisorJob into a scope incorrectly. We may expect grandChild1_1 and grandChild1_2 to be the direct child-Coroutines of the supervising child1. That is incorrect: grandChild1_2 does not keep running; it is canceled. What is going on…?

The child1 Coroutine is the grand-parent of grandChild1_1 and grandChild1_2! Their direct (hidden) parent runs with a plain Job. When grandChild1_1 finishes with the exception, that Job finishes and grandChild1_2 is canceled.

The child2 Coroutine keeps running, however. This is because child1 still runs with a SupervisorJob and the failure of its (hidden) child-Coroutine does not cause it to finish with an exception. The scope from which child1 was launched does, therefore, not finish either and child2 keeps running until its end.

The reasoning is a bit complex. Just remember to never install a SupervisorJob the way it is shown in the example below.
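
A sketch of the incorrect installation, assuming it is done by passing SupervisorJob() as the context argument of launch. The delays, event strings, function name and dispatcher are made up for this illustration.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Returns what happened to the grandchildren and to child2, sorted alphabetically.
fun incorrectSupervision(): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()
    val ceh = CoroutineExceptionHandler { _, e -> events.add("handled ${e.message}") }
    val scope = CoroutineScope(Job() + Dispatchers.Default + ceh)

    // Incorrect: the SupervisorJob only becomes the parent of the Coroutine that
    // launch creates. That Coroutine runs with a plain Job of its own, and the
    // two grandchildren are the children of that plain Job.
    val child1 = scope.launch(SupervisorJob()) {
        launch { // grandChild1_1
            delay(100)
            throw IllegalStateException("oops")
        }
        launch { // grandChild1_2
            try {
                delay(300)
                events.add("grandChild1_2 finished")
            } catch (e: CancellationException) {
                events.add("grandChild1_2 canceled")
                throw e
            }
        }
    }
    val child2 = scope.launch {
        delay(300)
        events.add("child2 finished")
    }
    joinAll(child1, child2)
    events.sorted()
}

fun main() {
    println(incorrectSupervision())
    // prints [child2 finished, grandChild1_2 canceled, handled oops]
}
```

Wrapping the two inner launches in a supervisorScope call, as shown earlier, is the way to get real supervision for the grandchildren.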

Using supervision allows us to avoid the failure of an entire parallel operation when only one or a few of the concurrent operations fail.

The example below implements two ways of getting a list of Deferred<String>. One uses a non-supervising coroutineScope call (allOrNothing), the other uses a supervising supervisorScope call (allOrSome). Run the example and explain what happens in the comments of this article 🙂.
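
In case the live example is not available, here is a rough stand-in to experiment with. Only the names allOrNothing and allOrSome come from the article; TaskFailed, fetch, outcomes and the rule that odd-numbered tasks fail are assumptions of this sketch.

```kotlin
import kotlinx.coroutines.*

// Hypothetical failure type and workload: every odd-numbered task fails.
class TaskFailed(message: String) : Exception(message)

suspend fun fetch(id: Int): String {
    delay(50L * id)
    if (id % 2 == 1) throw TaskFailed("task $id failed")
    return "result $id"
}

// Non-supervising: uses a coroutineScope call.
suspend fun allOrNothing(ids: List<Int>): List<Deferred<String>> = coroutineScope {
    ids.map { id -> async { fetch(id) } }
}

// Supervising: uses a supervisorScope call.
suspend fun allOrSome(ids: List<Int>): List<Deferred<String>> = supervisorScope {
    ids.map { id -> async { fetch(id) } }
}

// Turns each Deferred into a readable success or failure.
suspend fun outcomes(deferreds: List<Deferred<String>>): List<String> =
    deferreds.map { d ->
        try {
            d.await()
        } catch (e: TaskFailed) {
            "failure: ${e.message}"
        }
    }

fun main() = runBlocking {
    val ids = listOf(1, 2, 3, 4)
    println(outcomes(allOrSome(ids)))
    val nothing = try {
        outcomes(allOrNothing(ids))
    } catch (e: TaskFailed) {
        listOf("whole operation failed: ${e.message}")
    }
    println(nothing)
}
```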

Recap

If nothing else, remember this list that describes Structured Concurrency for Coroutines:

  1. A parent-Coroutine finishes only after all its child-Coroutines have finished.
  2. When a parent-Coroutine or scope finishes abnormally, either through cancelation or through an exception, all its child-Coroutines that are still active are canceled, and new child-Coroutines can no longer be launched.
  3. When a child-Coroutine finishes abnormally, its parent-Coroutine or scope (a) finishes abnormally if the parent is not a supervisor or (b) keeps running if the parent is a supervisor.

Have an exceptional Kotlin!

Associate Director of Product Engineering at @AccentureSong. You can find me online @streetsofboston or at https://www.linkedin.com/in/antonspaans/