Exceptional Exceptions for Coroutines made easy…? part II: Supervision
In my earlier article, we examined how exceptions are handled by Coroutines and Coroutine Scopes that run with a Job, which is the default way of running them, without any so-called supervision.
Coroutines support Structured Concurrency, in which a parent-Coroutine that runs with a Job and its child-Coroutines all live and die together:
1. A parent-Coroutine finishes only after all its child-Coroutines have finished.
2. When a parent-Coroutine or scope finishes abnormally, either through cancelation or through an exception, all its child-Coroutines that are still active are canceled, and new child-Coroutines can no longer be launched.
3. When a child-Coroutine finishes abnormally, its parent-Coroutine or scope finishes abnormally.
To learn more about the details, I suggest reading part I first.
What is the problem of not using Supervision?
Let’s examine two scenarios highlighting problems that supervision can solve.
1. UI handling asynchronous requests
Say we have some UI-screens (Views, Activities — hello Android devs!, etc.) and our app uses Coroutines as the preferred way of dealing with asynchrony.
Multiple requests to a remote server can be issued during the lifetime of a UI-screen, at the initial showing of the screen or on actions initiated by the user.
We model the requests to the remote server by running them within the scope (CoroutineScope) of the UI-screen and launch these requests as (child-)Coroutines from that UI-scope. By default, these Coroutines will be the children of a parent Job. As soon as the first request fails, every later launch fails immediately as well. This happens because the parent Job of the UI-scope has failed too, and it can no longer launch any Coroutines. Remember point (3.) of the list above: When a child-Coroutine finishes abnormally, its parent-Coroutine or scope finishes abnormally.
This is not the behavior we want. The failure of one request should not render future interactions with the UI useless.
2. Failure of one must not fail all
When we launch a bunch of async Coroutines in a regular scope without supervision, the failure of one will cancel all the other active ones as well. The entire parallel operation returns a single result: either a success or a failure, nothing in the middle.
With supervision, the failure of one will not cause the cancelation of the others. The result of the entire parallel operation that uses supervision is a list of results, each of which is either a success or a failure.
What is Supervision in Coroutines?
While still supporting Structured Concurrency, a parent-Coroutine or scope that uses supervision and its child-Coroutines live but don’t necessarily die together.
Points (1.) and (2.) of the list that described Structured Concurrency have not changed. When a parent-Coroutine finishes, all its child-Coroutines finish. However, point (3.) has changed. With supervision it will read:
3. When a child-Coroutine finishes abnormally, its parent-Coroutine just handles it and keeps going.
In our example UI scenario, when we use supervision, the Coroutine of the failed request still fails, but the UI-scope stays alive and it can launch other Coroutines later.
We can enable supervision in a scope (CoroutineScope) by installing a SupervisorJob into its context instead of a plain Job and launching the top-most Coroutines from it:
val uiScope = CoroutineScope(SupervisorJob() + Dispatchers.Main)
...
val supervisedChild1 = uiScope.launch { ... }
...
val supervisedChild2 = uiScope.async { ... }
...
We can use supervision within an already existing Coroutine by running a child-Coroutine that will become the supervising parent of its respective children.
val result = supervisorScope {
    ...
    val supervisedChild1 = this.launch { ... }
    ...
    val supervisedChild2 = this.async { ... }
    ...
}
In both examples, if one supervised child finishes with an exception, the other supervised child and the supervising parent just keep running.
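To see this in a form we can run, here is a minimal sketch written for this text. It assumes plain JVM execution through runBlocking, and the function name superviseTwo, the delays, and the handler that only records messages are made up for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns the value produced by the surviving child,
// plus the messages of the exceptions that reached the handler.
suspend fun superviseTwo(): Pair<String, List<String>> {
    val handled = mutableListOf<String>()
    val ceh = CoroutineExceptionHandler { _, e -> handled.add(e.message ?: "unknown") }

    val result = supervisorScope {
        // First supervised child: fails after a short delay.
        launch(ceh) {
            delay(50)
            throw IllegalStateException("child 1 failed")
        }
        // Second supervised child: outlives the failure of its sibling.
        val supervisedChild2 = async {
            delay(100)
            "child 2 finished"
        }
        supervisedChild2.await()
    }
    return result to handled
}

fun main() = runBlocking {
    val (result, handled) = superviseTwo()
    println("result = $result, handled = $handled")
}
```

If supervision works as described, the failure of the first child ends up in the handler while the second child still delivers its value. Replacing supervisorScope with coroutineScope in this sketch should make the whole call fail instead.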
How are Exceptions handled?
We now know that a parent Coroutine or a CoroutineScope that runs with a SupervisorJob keeps running when a child-Coroutine finishes with an exception. But how are exceptions handled by such a parent?
Exceptions are handled the same way, whether the Coroutine’s scope uses supervision or not: Only a top-most Coroutine launched from a call to CoroutineScope.launch will handle the exception, either through an installed CoroutineExceptionHandler or, if one is not installed, by letting the Thread’s Uncaught Exception Handler process it.
val ceh = CoroutineExceptionHandler { ... }
...
val scope: CoroutineScope = ...
...
val supervisedChild = scope.launch(ceh) { ... }
The scope variable must be a CoroutineScope that can launch top-most Coroutines. This means that:
- The variable scope should be a newly created CoroutineScope that may or may not have a CoroutineExceptionHandler installed. Both examples below will let ceh handle any uncaught exception.
val scope = CoroutineScope(SupervisorJob() + ceh)
...
val supervisedChild = scope.launch { ... }
or
val scope = CoroutineScope(SupervisorJob())
...
val supervisedChild = scope.launch(ceh) { ... }
- Or the variable scope should be the root of a supervisorScope { ... } that inherits the CoroutineExceptionHandler from the outer scope if one has been installed. All three examples below will let ceh handle any uncaught exception.
val topScope = CoroutineScope(SupervisorJob() + ceh)
topScope.launch {
    ...
    ...
    supervisorScope { // 'this' is a supervising CoroutineScope
        ...
        this.launch { ... }
    }
    ...
    ...
}
or
val topScope = CoroutineScope(SupervisorJob())
topScope.launch {
    ...
    ...
    supervisorScope {
        ...
        this.launch(ceh) { ... }
    }
    ...
    ...
}
or even this (not recommended)
val topScope = CoroutineScope(SupervisorJob())
topScope.launch {
    ...
    ...
    withContext(ceh) {
        ...
        supervisorScope {
            ...
            this.launch { ... }
        }
        ...
    }
    ...
}
Code Examples
Let’s look at some code examples.
The example below has a Coroutine for child1 that does not catch the exception thrown by the call to oops().
When providing a Job, scope finishes because child1 finishes with the exception. That means that child2 gets canceled and child3 never launches.
When providing a SupervisorJob, scope does not finish. That means that child2 keeps running until its end and child3 will get launched.
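A minimal sketch of this scenario, written for this text and not taken from the original sample. The names oops, child1, child2 and child3 follow the description above; runChildren, the log list, the delays, and the use of Dispatchers.Default are assumptions made to keep it runnable on a plain JVM.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

fun oops(): Nothing = throw IllegalStateException("oops")

// Hypothetical helper: runs the three children under the given parent Job
// and returns a log of what happened.
suspend fun runChildren(parentJob: Job): List<String> {
    val log = CopyOnWriteArrayList<String>()
    val ceh = CoroutineExceptionHandler { _, e -> log.add("handler got '${e.message}'") }
    val scope = CoroutineScope(parentJob + Dispatchers.Default + ceh)

    val child1 = scope.launch {
        delay(50)
        oops() // not caught inside child1
    }
    val child2 = scope.launch {
        delay(200)
        log.add("child2 finished")
    }
    child1.join()

    // Launched only after child1 has failed.
    val child3 = scope.launch {
        log.add("child3 ran")
    }
    joinAll(child2, child3)
    scope.cancel() // clean up; a SupervisorJob scope would otherwise stay active
    return log
}

fun main() = runBlocking {
    println("Job:           " + runChildren(Job()))
    println("SupervisorJob: " + runChildren(SupervisorJob()))
}
```

With a Job, the log should contain only the handler entry, because child2 is canceled and the body of child3 never runs. With a SupervisorJob, both child2 and child3 should show up in the log as well.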
Below are two examples showing how CoroutineExceptionHandlers can be installed to handle exceptions thrown by Coroutines.
The Coroutine of supervisorScope uses supervision. Even though one of its children finishes with an exception (grandChild1_1), it keeps on running. The Coroutine that called five() will, therefore, keep running as well and successfully obtain the value 5.
The first example installs the cehParent into the outer scope. This handler is then inherited by the call to supervisorScope. The cehParent will handle the exception that is not caught by grandChild1_1.
The second example installs the cehChild into the top-most scope of supervisorScope. Even though it inherits the outer cehParent, it is overridden by cehChild that will handle the exception not caught by grandChild1_1.
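Both variants can be sketched in one small program. This is a reconstruction under assumptions, not the original sample: five, cehParent, cehChild and grandChild1_1 are named after the description above, while the optional childHandler parameter, the handlerDemo function, and the handledBy list are invented here so that one function can cover both cases.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList
import kotlin.coroutines.EmptyCoroutineContext

// Assumed shape of five(): a supervising scope whose child fails,
// while the scope itself still returns 5.
suspend fun five(childHandler: CoroutineExceptionHandler? = null): Int = supervisorScope {
    // Without a childHandler, the handler inherited from the outer scope applies.
    val grandChild1_1 = launch(childHandler ?: EmptyCoroutineContext) {
        throw IllegalStateException("grandChild1_1 failed")
    }
    grandChild1_1.join()
    5
}

// Hypothetical driver: returns the two values and the names of the handlers that ran.
suspend fun handlerDemo(): Pair<List<Int>, List<String>> {
    val handledBy = CopyOnWriteArrayList<String>()
    val cehParent = CoroutineExceptionHandler { _, _ -> handledBy.add("cehParent") }
    val cehChild = CoroutineExceptionHandler { _, _ -> handledBy.add("cehChild") }
    val scope = CoroutineScope(SupervisorJob() + cehParent)

    // Variant 1: only the outer handler is installed; supervisorScope inherits it.
    val first = scope.async { five() }.await()
    // Variant 2: the top-most Coroutine inside supervisorScope brings its own handler.
    val second = scope.async { five(cehChild) }.await()

    scope.cancel()
    return listOf(first, second) to handledBy
}

fun main() = runBlocking {
    val (values, handledBy) = handlerDemo()
    println("values = $values, handled by = $handledBy")
}
```

Both calls should return 5. The first failure should be reported to cehParent and the second one to cehChild.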
The next example shows how we could install a SupervisorJob into a scope incorrectly. We may expect grandChild1_1 and grandChild1_2 to be the direct child-Coroutines of the supervising child1. That is incorrect: grandChild1_2 does not keep running, it is canceled. What is going on…?
The child1 Coroutine is the grand-parent of grandChild1_1 and grandChild1_2! Their direct (hidden) parent runs with a plain Job. When grandChild1_1 finishes with the exception, that Job finishes and grandChild1_2 is canceled.
The child2 Coroutine keeps running, however. This is because child1 still runs with a SupervisorJob and the failure of its (hidden) child-Coroutine does not cause it to finish with an exception. The scope from which child1 was launched does, therefore, not finish either and child2 keeps running until its end.
The reasoning is a bit complex. Just remember to never install a SupervisorJob the way it is shown in the example below.
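One way to end up in this situation, and the one assumed in the sketch below, is passing SupervisorJob() as the context argument of launch. A Job passed this way becomes the parent of the new Coroutine. The Coroutine itself still gets a regular Job, and that regular Job is the direct parent of the grandchildren. The function name misplacedSupervisor, the log list, and the delays are hypothetical.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical helper: returns a log of what happened to each Coroutine.
suspend fun misplacedSupervisor(): List<String> {
    val log = CopyOnWriteArrayList<String>()
    val ceh = CoroutineExceptionHandler { _, e -> log.add("handler got '${e.message}'") }
    val scope = CoroutineScope(Job() + Dispatchers.Default + ceh)

    // Assumed form of the mistake: the SupervisorJob is handed to launch.
    val child1 = scope.launch(SupervisorJob()) {
        launch { // grandChild1_1
            delay(50)
            throw IllegalStateException("grandChild1_1 failed")
        }
        launch { // grandChild1_2
            try {
                delay(200)
                log.add("grandChild1_2 finished")
            } catch (e: CancellationException) {
                log.add("grandChild1_2 canceled")
                throw e // always rethrow cancelation
            }
        }
    }
    val child2 = scope.launch {
        delay(200)
        log.add("child2 finished")
    }
    joinAll(child1, child2)
    scope.cancel()
    return log
}

fun main() = runBlocking {
    misplacedSupervisor().forEach(::println)
}
```

The log should show grandChild1_2 being canceled and child2 finishing. When the two grandchildren really need supervision, wrapping them in supervisorScope { ... } inside child1 is the form shown earlier in this article.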
Using supervision allows us to avoid the failure of an entire parallel operation when only one or a few of the concurrent operations fail.
The example below implements two ways of getting a list of Deferred<String>. One uses a non-supervising coroutineScope call (allOrNothing), the other uses a supervising supervisorScope call (allOrSome). Run the example and explain what happens in the comments of this article 🙂.
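For readers who want something to experiment with, here is a rough sketch of the two approaches. Only the names allOrNothing and allOrSome come from the text above. The fetch function, its artificial failure for id 2, and the choice to return Result<String> values from allOrSome are assumptions made for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical request: the one with id 2 always fails.
suspend fun fetch(id: Int): String {
    delay(50L * id)
    if (id == 2) throw IllegalStateException("request $id failed")
    return "result $id"
}

// Non-supervising: the first failure cancels the siblings and fails the whole call.
suspend fun allOrNothing(ids: List<Int>): List<String> = coroutineScope {
    ids.map { id -> async { fetch(id) } }.awaitAll()
}

// Supervising: each Deferred succeeds or fails on its own.
suspend fun allOrSome(ids: List<Int>): List<Result<String>> = supervisorScope {
    ids.map { id -> async { fetch(id) } }
        .map { deferred ->
            try {
                Result.success(deferred.await())
            } catch (e: CancellationException) {
                throw e // do not swallow cancelation of the caller
            } catch (e: Exception) {
                Result.failure<String>(e)
            }
        }
}

fun main() = runBlocking {
    val ids = listOf(1, 2, 3)
    try {
        println(allOrNothing(ids))
    } catch (e: IllegalStateException) {
        println("allOrNothing failed as a whole: ${e.message}")
    }
    println(allOrSome(ids))
}
```

In this sketch, allOrNothing should end with a single exception, while allOrSome should return three entries of which only the second is a failure.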
Recap
If nothing else, remember this list that describes Structured Concurrency for Coroutines:
- A parent-Coroutine finishes only after all its child-Coroutines have finished.
- When a parent-Coroutine or scope finishes abnormally, either through cancelation or through an exception, all its child-Coroutines that are still active are canceled, and new child-Coroutines can no longer be launched.
- When a child-Coroutine finishes abnormally, its parent-Coroutine or scope (a) finishes abnormally if the parent is not a supervisor or (b) keeps running if the parent is a supervisor.