Understanding coroutines in Kotlin

Adriano Belfort
Published in PlayKids Tech Blog · 11 min read · Aug 29, 2019
Photo by Zan Ilic on Unsplash

Building complex systems involves several long-running tasks: network requests, database access, or even local computations that may effectively freeze a thread before a response is ready.

For the simplest cases, the synchronous communication style is sufficient. Threads can block and wait for a response from an API to continue processing data, for example. It is guaranteed that the method will not continue until it has a response from the database, and so on.

The synchronous style of communication looks and feels natural but has its drawbacks. When developing mobile applications, the main thread cannot stop and wait for a reply, else the user will not be able to interact with the screen nor will the application process events. In back-end systems, as the number of incoming requests grows, even the largest of thread pools can be fully consumed, and that doesn't necessarily mean that all threads are processing data. Many of them might just be blocked waiting for a response.

Asynchronous programming usually comes to the rescue to work around these issues. It takes the form of callbacks, futures, promises, reactive streams, and others. The core idea is simple: delegate an operation to be executed outside the current thread. That way, the calling thread is free to take on more work while waiting for the result of the operation, if it cares about the result at all.

To start working with the async paradigm, the developer needs to learn to think asynchronously and to structure code depending on the selected approach. The result is code that is usually harder to read, debug, and maintain. As the size and complexity of the system grow, it can become a real pain to keep track of everything that is happening.

Coroutines solve this problem by providing a way of writing asynchronous code in a synchronous fashion, greatly simplifying code readability and development. More than that, it enables a new and safer paradigm to handle concurrency and communication.

Kotlin has supported coroutines since version 1.1, when they were released as an experimental API; they graduated to a stable API in Kotlin 1.3. Unlike C# or Go, which have coroutine support embedded in the language, Kotlin offers coroutines via libraries.

Defining coroutines

From the Kotlin Coroutines Design Proposal:

A coroutine can be thought of as an instance of suspendable computation, i.e. the one that can suspend at some points and later resume execution possibly on another thread.

Suspendable computation is code that, instead of blocking the running thread while waiting for a result, suspends (or pauses) its execution until a response is available. When a suspension happens, the running context can be taken out of the current thread and easily replaced by other workloads. That way, the thread is free to perform more computations until the response comes back. In Kotlin, suspendable functions are marked with the special keyword suspend, like the function below.
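The original post embeds this function as a gist; here is a minimal reconstruction based on the description in the next paragraph (the name simpleDelay and the parameter delayMillis are taken from that description):

```kotlin
import kotlinx.coroutines.delay

// Suspends (does not block) the current coroutine for delayMillis.
suspend fun simpleDelay(delayMillis: Long) {
    println("Delaying execution for $delayMillis ms")
    delay(delayMillis) // suspension point: the thread is released here
    println("Resuming execution")
}
```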

This function prints for how long it will delay execution, waits for the given number of milliseconds, and then resumes. In this example, instead of using Thread.sleep() from the java.lang package, we are using the special function delay() from kotlinx.coroutines. If we had used the former, our method would block the running thread for the defined amount of time, effectively preventing it from doing more work while waiting. With the latter, the moment delay is called, the running thread is free to perform other tasks until delayMillis runs out. In this case, simpleDelay has only one suspension point: the line where delay is called. Since delay is a suspending function, we cannot call it from a regular function without wrapping it in a coroutine context; otherwise, we get a compile-time error. Therefore, we mark simpleDelay with the suspend keyword for now, so that it is the caller's responsibility to build and start the coroutine.

Concepts and elements

Continuation

With a callback approach, a method receives one or more lambdas; when the computation finishes, execution continues inside the provided lambdas, with the result passed as an argument.
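The gist itself is not reproduced here; a minimal sketch of such a function, with hypothetical names:

```kotlin
// Hypothetical callback-style API: the result is delivered to lambdas
// instead of being returned to the caller.
fun fetchGreeting(name: String, onSuccess: (String) -> Unit, onError: (Throwable) -> Unit) {
    try {
        val greeting = "Hello, $name!" // stand-in for a network or database lookup
        onSuccess(greeting)
    } catch (e: Throwable) {
        onError(e)
    }
}
```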

Example of a callback-style function

Callbacks are one of the simplest ways of performing asynchronous computations, but they may lead to excessive nesting (a.k.a. callback hell) and unreadable code. However, passing a continuation lambda to a function — or writing a function in continuation-passing style — is exactly what the Kotlin compiler does under the hood when a function is marked with the suspend keyword. The continuation is inserted at compile time as an extra argument to the function, and the compiler also creates a finite state machine for each suspending function to determine what to do at each suspension point depending on its result. That way, code is asynchronous but is written and read just like synchronous code. There is no need to learn new APIs or to structure code in other (harder) ways.
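As an illustration only (not the compiler's actual output), a hand-written sketch of the idea: a label tracks which suspension point to resume from, and a completion callback stands in for the inserted continuation. Real generated code uses kotlin.coroutines.Continuation and is considerably more involved.

```kotlin
// Simplified sketch of the suspend transformation: a state machine keyed
// by a label, plus an extra completion callback playing the continuation role.
class SimpleDelayMachine(private val delayMillis: Long, private val completion: () -> Unit) {
    var label = 0 // which suspension point we are at

    fun resume() {
        when (label) {
            0 -> {
                println("Delaying execution for $delayMillis ms")
                label = 1
                // The real machinery would schedule resume() after delayMillis
                // instead of returning control immediately.
            }
            1 -> {
                println("Resuming execution")
                completion()
            }
            else -> error("Already completed")
        }
    }
}
```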

Job and Deferred

A Job represents the lifecycle of a coroutine. At any moment, a Job may be in the New, Active, Completing, Cancelling, Cancelled, or Completed state. It can have a parent-child relationship with other jobs, so that if the parent is cancelled, all of its children are also cancelled, and if a child coroutine throws an unhandled exception, the parent is cancelled as well. When used in a coroutine context, the Job represents the coroutine itself.
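A small sketch of that parent-child cancellation (helper name is mine, not from the article):

```kotlin
import kotlinx.coroutines.*

// Cancelling a parent Job cancels its children as well.
fun demoParentChildCancellation(): Boolean = runBlocking {
    var childCancelled = false
    val parent = launch {
        launch { // a child of the parent Job
            try {
                delay(1_000)
            } catch (e: CancellationException) {
                childCancelled = true
                throw e // cancellation must be propagated
            }
        }
    }
    delay(10)       // give the child time to start and suspend
    parent.cancel() // cancels the whole tree
    parent.join()
    childCancelled
}
```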

Jobs do not return results, but some functions do return values. In such scenarios, one would use a Deferred. Deferreds are Jobs with convenience methods to access the successful or failed result of a computation.

Coroutine context

The coroutine machinery needs configuration to work properly, and it is stored in an object called CoroutineContext. A context is an indexed set of elements, each of which controls one aspect of the coroutine: the thread pool to use, its Job instance, a specific exception handler, and so on. Context elements can be extended as desired, providing good options for working with coroutines in application-specific ways. When creating a coroutine, we can specify which elements it will hold in its context. Once created, a context never changes, but that is not a problem, since it is easy to create another coroutine with a different context.
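Context elements combine with the + operator, each stored under its own key; a small sketch (the function name is mine):

```kotlin
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job

// Combine a dispatcher, a name, and a Job into one immutable context.
fun buildWorkerContext(): CoroutineContext =
    Dispatchers.Default + CoroutineName("worker") + Job()
```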

The Job and its associated context are like local variables a coroutine uses during execution. These variables can be moved in and out of threads easily and with little overhead due to their small size. So, just as threads are lightweight processes, coroutines can be thought of as lightweight threads, except that switching between coroutines does not require a system call, because we are working on top of threads. Therefore, coroutines offer better performance than plain old threads.

Scope

Kotlin defines CoroutineScope as a simple interface that just holds a coroutine context. The idea of each scope is to bind (or reference) new coroutines to an existing context and Job. The new context contains all elements from the parent context but can override some of them if desired. The Job associated with the current scope will act as a parent to the new coroutine. Just like code belongs to a particular scope (that of a class, method, statement), and the scope of execution is only finished when the entire enclosed block is completed, a coroutine scope is only considered complete when all children are completed.

The automatic relationship between Jobs and contexts that scopes establish makes code safer: an uncaught exception thrown in any child coroutine will eventually trigger the cancellation of its parent Job and any other associated children, without manual intervention or the risk of resource leaks and the bugs they carry.
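A minimal sketch of the completion guarantee described above: coroutineScope only returns once every child coroutine has finished (the function name is mine):

```kotlin
import kotlinx.coroutines.*

fun demoScope(): Int = runBlocking {
    var completed = 0
    // coroutineScope completes only when all of its children complete.
    coroutineScope {
        launch { delay(10); completed++ }
        launch { delay(20); completed++ }
    }
    completed // both children are guaranteed to be done here
}
```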

Coroutine builders

Having the basic elements defined, it is time to bring coroutines to life! Coroutines are created with special functions called builders. Each builder defines a specific coroutine behavior, but for now, we will focus on three of them.

The first is runBlocking. With this builder, whenever a suspension point is reached, the current running thread is actually blocked until its completion. It seems a bit odd to have runBlocking at first, since one of the goals of coroutines is to provide non-blocking experiences to applications. However, let's remember that suspending functions can only be called from coroutines or other suspending functions. The runBlocking builder takes us from the normal world to the suspending world, so it should only be used when we want to cross that bridge — that is, in the main function or tests.
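A typical entry point looks like this; main is blocked until the coroutine body completes:

```kotlin
import kotlinx.coroutines.*

// runBlocking bridges the normal world and the suspending world.
fun main() = runBlocking {
    println("Before delay")
    delay(100) // the main thread is blocked here, since we are in runBlocking
    println("After delay")
}
```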

Sometimes we want to start a task and do not expect a result back. We can use the coroutine builder launch for such purpose. When a coroutine is created with launch, a Job instance is returned, and we can check its state or cancel the coroutine.
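A sketch of fire-and-forget with launch, checking the Job and cancelling it (the function name is mine):

```kotlin
import kotlinx.coroutines.*

fun demoLaunch(): Boolean = runBlocking {
    val job: Job = launch {
        repeat(1_000) {
            delay(10) // cooperative: cancellation is checked at suspension points
        }
    }
    delay(25)       // let the coroutine run for a bit
    job.cancel()    // request cancellation
    job.join()      // wait until it actually finishes
    job.isCancelled // true: the job ended by cancellation
}
```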

Since there is only a reference to a Job with launch, we cannot retrieve any value from the execution. What if we need a result? We should then use another builder, async. It creates a new coroutine and returns an instance of Deferred<T>. We can call await() on a deferred reference to get a result if available or suspend otherwise.
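And a sketch of retrieving a value with async and await (the function name is mine):

```kotlin
import kotlinx.coroutines.*

fun demoAsync(): Int = runBlocking {
    val deferred: Deferred<Int> = async {
        delay(10) // pretend this is a slow computation
        21 * 2
    }
    deferred.await() // suspends until the result is available
}
```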

Coroutines can specify a CoroutineDispatcher on creation to send tasks to a particular thread pool. When coroutine dispatchers backed by thread pools larger than one are used with launch and async, we can effectively process workloads in parallel, with all the benefits of the lightweight context switching of coroutines. The coroutines library provides standard dispatcher implementations optimized for use cases like IO-intensive or CPU-intensive workloads.
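A sketch of choosing a dispatcher at creation time: Dispatchers.Default is the standard choice for CPU-intensive work and Dispatchers.IO for blocking I/O (the function name is mine):

```kotlin
import kotlinx.coroutines.*

// Each async call is dispatched to the thread pool chosen at creation time.
fun demoDispatchers(): Pair<String, String> = runBlocking {
    val cpuThread = async(Dispatchers.Default) { Thread.currentThread().name }
    val ioThread = async(Dispatchers.IO) { Thread.currentThread().name }
    cpuThread.await() to ioThread.await()
}
```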

Example: the analogy of a restaurant kitchen

Photo by John Legrand on Unsplash

Let's see how coroutines could help us model the work done in the kitchen of our brand new restaurant, Kotchen. Here is a mental model to guide us through the code below.

  • Chefs get tasks and perform work, so chefs are threads.
  • Naturally, the pool of available chefs is the thread pool.
  • Each dish has its own recipe — a method to assemble all the ingredients into something edible.
  • Cooking is executing the recipe to get a dish.
  • Each ingredient takes some time to prepare. We must wait until it is ready.

Let's break execution into steps:

We first create a thread pool of only one chef and its dispatcher.
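The gist for this step is not shown here; a reconstruction (the names chefs and chefDispatcher follow the article's narrative):

```kotlin
import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher

// One chef: a fixed thread pool of size 1, wrapped as a CoroutineDispatcher.
val chefs = Executors.newFixedThreadPool(1)
val chefDispatcher = chefs.asCoroutineDispatcher()
```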

For each dish, we create a recipe — a higher-order function that returns a suspendable lambda that creates a dish based on its name, ingredients and the time it takes to prepare them.
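The original gist is not reproduced here; the following is a reconstruction consistent with the log output shown below (function and parameter names are inferred from the article's narrative):

```kotlin
import kotlinx.coroutines.*

// Simulates preparing one ingredient: suspends for its preparation time.
suspend fun prepareIngredient(name: String, seconds: Long): String {
    println("🍛 Chef ${Thread.currentThread().name}: I'll prepare $name now (takes $seconds seconds)")
    delay(seconds * 1000)
    println("🍛 Chef ${Thread.currentThread().name}: $name is ready")
    return "cooked $name"
}

// Higher-order function: returns a suspendable lambda that cooks the dish.
fun recipe(dishName: String, ingredients: Map<String, Long>): suspend CoroutineScope.() -> String = {
    println("🍳 Preparing $dishName")
    val prepared = ingredients.map { (name, seconds) ->
        async { prepareIngredient(name, seconds) } // prepare ingredients concurrently
    }.awaitAll()
    "$dishName (contains ${prepared.joinToString()})"
}
```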

We start preparing food. Since we are moving into the suspending world, we must use runBlocking. Using this builder at this moment means that any suspension within its scope will block the main thread.

We create two new coroutines with launch, passing chefDispatcher as an argument. These coroutines will run on the thread belonging to chefs, not on the main thread. Using launch, we are free to do more work after spinning up coroutines, even launching more coroutines, which is what we do. Since it takes some time to prepare all the ingredients, these coroutines will execute concurrently. If we had used a blocking approach, we would have to wait until each ingredient was ready before starting another one — a complete waste of time! Also note that, since the created coroutines belong to the runBlocking scope, both must complete before the rest of the main function can execute.
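Putting the steps together, a condensed, self-contained sketch (preparation times are shortened and the dish logic simplified; names mirror the narrative, not the original gist):

```kotlin
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.Executors
import kotlinx.coroutines.*

// Stand-in for a full recipe: suspends, then reports the dish as ready.
suspend fun cookDish(dish: String, prepSeconds: Long): String {
    delay(prepSeconds * 10) // shortened times for the sketch
    return "$dish is ready"
}

fun runKitchen(): List<String> {
    val chefs = Executors.newFixedThreadPool(1)
    val chefDispatcher = chefs.asCoroutineDispatcher()
    val results = ConcurrentLinkedQueue<String>()

    runBlocking {
        // Both dishes are cooked concurrently on the chef's single thread.
        launch(chefDispatcher) { results.add(cookDish("beef with eggs", 15)) }
        launch(chefDispatcher) { results.add(cookDish("hamburger", 5)) }
    } // runBlocking returns only after both child coroutines complete

    chefs.shutdown() // release the pool so the program can exit gracefully
    return results.toList()
}
```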

Lastly, we shut down the thread pool so that the program can finish gracefully. Here is how our single chef handled the orders:

🍳 Preparing beef with eggs
🍛 Chef pool-1-thread-1: I’ll prepare rice now (takes 8 seconds)
🍛 Chef pool-1-thread-1: I’ll prepare beans now (takes 15 seconds)
🍳 Preparing hamburger
🍛 Chef pool-1-thread-1: I’ll prepare beef now (takes 10 seconds)
🍛 Chef pool-1-thread-1: I’ll prepare eggs now (takes 4 seconds)
🍛 Chef pool-1-thread-1: I’ll prepare bread now (takes 1 seconds)
🍛 Chef pool-1-thread-1: I’ll prepare burger now (takes 5 seconds)
🍛 Chef pool-1-thread-1: I’ll prepare cheese now (takes 2 seconds)
🍛 Chef pool-1-thread-1: bread is ready
🍛 Chef pool-1-thread-1: cheese is ready
🍛 Chef pool-1-thread-1: eggs is ready
🍛 Chef pool-1-thread-1: burger is ready
👨‍🍳 After 5 seconds hamburger (contains cooked bread, cooked burger, cooked cheese) is ready!
🍛 Chef pool-1-thread-1: rice is ready
🍛 Chef pool-1-thread-1: beef is ready
🍛 Chef pool-1-thread-1: beans is ready
👨‍🍳 After 15 seconds beef with eggs (contains cooked rice, cooked beans, cooked beef, cooked eggs) is ready!

How could our chef handle so much at the same time? Take a closer look at the function prepareIngredient. It will delay for a specified amount of time until an ingredient is ready. Here's the magic: when we delay, the current thread (our chef) is free to take care of other things, including cooking more ingredients or combining the prepared ones into a plate, and also reducing the total time elapsed preparing both dishes from 45 to 15 seconds (the minimum time to have the beans ready). Now, what if the threads were handling HTTP requests that way? We could serve many more clients!

We used async with prepareIngredient because we were interested in the String result of the function. If we did not care about it, we could have used launch instead. However, when we used async in recipe, we got an instance of Deferred<String> back, on which we must call await. We used the extension function map, so after calling async, we had a list of Deferreds. We then used awaitAll, which is equivalent to calling await on each of the list elements.

What if we had more chefs? Would the cooking be any faster? The answer is no. The chefs could get one ingredient or the other, but the delay would be the same for everyone:

🍳 Preparing beef with eggs
🍛 Chef pool-1-thread-2: I’ll prepare rice now (takes 8 seconds)
🍛 Chef pool-1-thread-3: I’ll prepare beans now (takes 15 seconds)
🍛 Chef pool-1-thread-3: I’ll prepare beef now (takes 10 seconds)
🍛 Chef pool-1-thread-2: I’ll prepare eggs now (takes 4 seconds)
🍳 Preparing hamburger
🍛 Chef pool-1-thread-2: I’ll prepare bread now (takes 1 seconds)
🍛 Chef pool-1-thread-2: I’ll prepare burger now (takes 5 seconds)
🍛 Chef pool-1-thread-2: I’ll prepare cheese now (takes 2 seconds)
🍛 Chef pool-1-thread-2: bread is ready
🍛 Chef pool-1-thread-1: cheese is ready
🍛 Chef pool-1-thread-3: eggs is ready
🍛 Chef pool-1-thread-2: burger is ready
👨‍🍳 After 5 seconds hamburger (contains cooked bread, cooked burger, cooked cheese) is ready!
🍛 Chef pool-1-thread-3: rice is ready
🍛 Chef pool-1-thread-1: beef is ready
🍛 Chef pool-1-thread-2: beans is ready
👨‍🍳 After 15 seconds beef with eggs (contains cooked rice, cooked beans, cooked beef, cooked eggs) is ready!

When a suspension point is ready to continue running, the dispatcher will select any free thread available at the moment to continue processing the task. You can verify that by checking that Chef 3 started cooking beans, but it was Chef 2 who completed it. Coroutines are not bound to a particular thread in a thread pool unless explicitly specified.

It's a wrap, folks! Next time you are developing something that requires performance and asynchronicity, remember that coroutines are friends.
