Understanding Coroutines & Tasks in Depth in Python

Understanding core concepts of Asynchronous Programming

--

Introduction

Asynchronous programming in Python offers a powerful way to improve the efficiency of applications, especially when dealing with I/O-bound and high-latency operations. Unlike traditional, synchronous programming, where tasks are executed sequentially, asynchronous programming allows tasks to be paused and resumed, leading to better resource utilization and responsiveness. This article delves into the core concepts of coroutines, event loops, and tasks, providing a deeper understanding of how they function and highlighting their advantages. We will explore code examples to illustrate the behaviour and benefits of these concepts.

Difference Between Normal Functions and Coroutines

Consider the following analogy:

Normal Function: Think of a normal function as baking a cake. You follow the recipe from start to finish without stopping, and nothing else gets done until the cake is ready. During this time, your entire focus is on baking the cake, and you can’t work on anything else.

Coroutine: Now, imagine making a meal with multiple dishes. You start cooking one dish but can pause it to begin another dish while waiting for the first to cook. This way, you can manage multiple dishes at once, switching between them as needed and efficiently using your time and resources.

The above analogy demonstrates how normal functions are like single-focus tasks, while coroutines allow for multitasking by alternating between different activities.

Let’s explore the difference between a normal function and a coroutine using a concrete example:

async def coroutine_multiply_by_two(number: int) -> int:
    return number * 2

def multiply_by_two(number: int) -> int:
    return number * 2

function_result = multiply_by_two(2)
coroutine_result = coroutine_multiply_by_two(2)

print(f'Function result is {function_result} and the type is {type(function_result)}')
print(f'Coroutine result is {coroutine_result} and the type is {type(coroutine_result)}')


>
Function result is 4 and the type is <class 'int'>
Coroutine result is <coroutine object coroutine_multiply_by_two at 0x10320db40> and the type is <class 'coroutine'>

When we call the normal multiply_by_two function, it runs immediately and returns the expected integer result. However, when we call coroutine_multiply_by_two, the code inside the coroutine doesn't execute right away. Instead, it returns a coroutine object, which means the coroutine has been defined but hasn't been run yet.

This distinction is essential to understanding how coroutines work: they don’t execute upon being called. Instead, they produce a coroutine object that can be scheduled and executed later by an event loop. This separation between defining and running a coroutine allows for more flexible and efficient task management.
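To make this concrete, here is a minimal sketch (reusing coroutine_multiply_by_two from above) of handing a coroutine object to the event loop. It jumps slightly ahead to asyncio.run(), which we cover in the next sections, and simply shows that the coroutine body only executes once the loop drives it:

import asyncio

async def coroutine_multiply_by_two(number: int) -> int:
    return number * 2

# Calling the coroutine function only creates a coroutine object;
# none of its body has run yet.
coro = coroutine_multiply_by_two(2)

# The body executes (and the return value becomes available) only
# once the event loop drives the coroutine, e.g. via asyncio.run().
result = asyncio.run(coro)
print(result)  # 4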

But what exactly is an event loop? Imagine the event loop as a conductor in an orchestra, making sure each musician (or task) knows when to start playing and when to pause. In the context of programming, the event loop oversees the execution of different tasks, handling them one by one in a way that no single task holds up the entire program.

In more technical terms, the event loop is a core component of asynchronous programming in Python. It manages the execution of multiple coroutines, switching between them as they await I/O operations or other tasks. The event loop ensures that your program remains responsive by handling these operations without blocking the main execution flow. By coordinating the execution of coroutines, the event loop facilitates concurrent task processing.
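For a peek under the hood, the sketch below drives the same coroutine with an explicitly created event loop. This is only to illustrate the loop’s role; asyncio.run(), introduced in the next section, performs roughly these steps (plus cleanup) for you:

import asyncio

async def coroutine_multiply_by_two(number: int) -> int:
    return number * 2

# Create an event loop, run the coroutine to completion, then close the loop.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coroutine_multiply_by_two(2))
    print(result)  # 4
finally:
    loop.close()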

Basic Event Loop

With this foundational understanding of coroutines and the event loop, it’s important to know how to actually run these coroutines within the event loop. To execute coroutines, we use specific tools provided by Python’s asyncio library. The asyncio.run() function is a simple way to start the event loop and run our main coroutine, while the await keyword allows us to pause and resume coroutines within it.

With this, let’s see how we can run coroutines using the asyncio.run() function and the await keyword.

Running Coroutines with asyncio.run() and await

To fully appreciate the benefits of coroutines and see how multiple tasks can be managed, we need to simulate some long-running operations. Using asyncio.sleep is a practical way to mimic tasks such as making web API requests or querying a database, as these are typical I/O-bound tasks.

To execute coroutines, we need to run them within an event loop. Here’s how this is done using asyncio.run() and the await keyword:

import asyncio
import time

async def fetch_data(data_id: int) -> str:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(2)  # Simulates a delay like a network request
    print(f'Data fetched for ID {data_id}')
    return f'Data {data_id}'

async def compute_result(value: int) -> int:
    await asyncio.sleep(1)  # Simulates a delay like a computation
    return value * 2

async def process_data() -> None:
    data = await fetch_data(1)
    result = await compute_result(5)
    print(f'Result: {result}')
    print(f'Processed Data: {data}')

start = time.perf_counter()
asyncio.run(process_data())
print(f'took total of {time.perf_counter() - start:.4f} seconds')


>
Fetching data for ID 1
Data fetched for ID 1
Result: 10
Processed Data: Data 1
took total of 3.0038 seconds

In this example, we use asyncio to run coroutines and see how they behave. The asyncio.run(process_data()) call is the entry point of the asyncio code: it starts the event loop, runs our main coroutine to completion, and then closes the loop.

Inside process_data, we use await to call fetch_data and compute_result. When the await keyword is used, the coroutine pauses, allowing other operations to run. Once the awaited task completes, execution resumes from where it left off.

Although this example shows how to run coroutines and handle pauses with await, the flow remains mostly sequential because await pauses the current coroutine, preventing other code within that coroutine from executing until the awaited operation is finished. This behavior is intentional and essential for managing the execution flow in asynchronous programming.

In the next section, we’ll enhance this by using tasks to achieve true concurrency and better utilise the power of asynchronous programming.

Utilizing Tasks for Concurrency

To understand the benefits of using tasks for concurrency, let’s first look at a scenario where we only use coroutines. In this case, the coroutines will run sequentially, which can be inefficient if there’s a lot of waiting or sleeping involved.

import asyncio
import time

async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(3)  # Simulates waiting for a response from a server
    print(f'Finished fetching data for ID {data_id}')

async def main() -> None:
    await fetch_data(1)
    await fetch_data(2)
    await fetch_data(3)

start = time.perf_counter()
asyncio.run(main())
print(f'took total of {time.perf_counter() - start:.4f} seconds')


>
Fetching data for ID 1
Finished fetching data for ID 1
Fetching data for ID 2
Finished fetching data for ID 2
Fetching data for ID 3
Finished fetching data for ID 3
took total of 9.0050 seconds

In this example, each call to fetch_data waits for the previous one to complete. As a result, the total running time will be the sum of all the delays: 9 seconds in this case (3 seconds + 3 seconds + 3 seconds). This sequential execution is straightforward but can be inefficient for tasks that can be performed concurrently.

Example with Tasks:

Now, let’s see how using tasks can improve efficiency by allowing concurrent execution:

import asyncio
import time

async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(3)  # Simulates waiting for a response from a server
    print(f'Finished fetching data for ID {data_id}')

async def main() -> None:
    # Create tasks for concurrent execution
    task1 = asyncio.create_task(fetch_data(1))
    task2 = asyncio.create_task(fetch_data(2))
    task3 = asyncio.create_task(fetch_data(3))

    # Await all tasks
    await task1
    await task2
    await task3

start = time.perf_counter()
asyncio.run(main())
print(f'took total of {time.perf_counter() - start:.4f} seconds')


>
Fetching data for ID 1
Fetching data for ID 2
Fetching data for ID 3
Finished fetching data for ID 1
Finished fetching data for ID 2
Finished fetching data for ID 3
took total of 3.0033 seconds

In this example, each fetch_data operation is scheduled as soon as it is wrapped in a task, and all three start running the moment the event loop gets control. Since the tasks run concurrently, the total running time is determined by the longest delay, which is 3 seconds. This demonstrates how tasks can significantly reduce the overall running time when multiple operations can overlap.

By using tasks, we allow the event loop to manage multiple operations concurrently, making our program more efficient and responsive. In scenarios with extensive waiting or I/O-bound work, this approach can lead to substantial performance improvements.
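As a side note, the same concurrent behaviour can be written more compactly with asyncio.gather(), which wraps the given coroutines in tasks and waits for all of them to finish. Here is a minimal sketch using the same fetch_data coroutine as above:

import asyncio

async def fetch_data(data_id: int) -> None:
    print(f'Fetching data for ID {data_id}')
    await asyncio.sleep(3)  # Simulates waiting for a response from a server
    print(f'Finished fetching data for ID {data_id}')

async def main() -> None:
    # gather schedules all three coroutines as tasks and waits until
    # every one of them has completed (~3 seconds total, not 9).
    await asyncio.gather(fetch_data(1), fetch_data(2), fetch_data(3))

asyncio.run(main())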

Conclusion

This article covered the essentials of coroutines and tasks in Python, emphasizing their role in asynchronous programming. We explored the difference between normal functions and coroutines, demonstrated how to run coroutines with asyncio.run() and await, and highlighted how tasks enable concurrent execution. By leveraging tasks, we can significantly improve performance in I/O-bound scenarios, making applications more efficient and responsive.
