Asynchronous Programming in Java (Callable, Future, CompletableFuture and more)

rinorahmeti777
13 min read · May 28, 2024


Introduction

As humans, we’re used to doing things one at a time, following a typical step-by-step process. So, when we write code, we follow the algorithm: doing the first task, then the next, and so on, maintaining a sequential approach. However, there are times when we initiate a long-running task and, while that task is still in progress, we can either sit and wait or do something else in the meantime.

Imagine you’re going to a shopping center to buy a new computer (or anything else, really). You’re a bit hungry, and there are plenty of places to order food. Let’s say you decide to order a pizza, but because there are many orders, it could take 25–30 minutes to be ready. What do you do? You could sit and wait, or you might be given a device that alerts you when the pizza is ready for pick-up. Meanwhile, you can browse the stores and look at different things while the pizza is being baked. Most of you would choose the latter, unless you’re very tired and hungry and just want to rest.

However, computers don’t get as tired as we do. So, most of the time, we aim to utilize their computing and processing power as much as possible, and that is exactly what happens behind the scenes: a lot of work is going on in the background!

In this article, I want to discuss why you might find an asynchronous approach beneficial in your code. I’ll cover different ways to implement it and examine the practical advantages and disadvantages.

P.S. In this article, I assume that you’re somewhat familiar with Java Threads, Thread Pools, ExecutorService, and context-switching.

Introduction to our resources and use case

For our coding examples, we’ll be utilizing an external RESTful API from https://api.snooker.org/. It’s worth noting that you’ll need to contact them (on the same page) to obtain your header value, but I’ll also provide a static JSON for each endpoint in case you need it.

We will start by retrieving all the played matches for a given snooker event (tournament) in the response, calling our first API: https://api.snooker.org/?t=9&e=1761.
In our case, the event has the ID 1761, and the response will look like this (I’ll include only one object to keep the article concise), but you can find the full response here: https://pastebin.com/KE1FzHrh.

[
  {
    "ID": 8561810,
    "EventID": 1761,
    "Round": 15,
    "Number": 1,
    "Player1ID": 101,
    "Score1": 2,
    "Walkover1": false,
    "Player2ID": 5,
    "Score2": 5,
    "Walkover2": false,
    "WinnerID": 5,
    "Unfinished": false,
    "OnBreak": false,
    "Status": 3,
    "WorldSnookerID": 0,
    "LiveUrl": "https://www.wst.tv/match-centre/5f1a7b6a-dddf-47bb-a51f-d5de100582bc",
    "DetailsUrl": "https://www.wst.tv/match-centre/5f1a7b6a-dddf-47bb-a51f-d5de100582bc",
    "PointsDropped": false,
    "ShowCommonNote": true,
    "Estimated": false,
    "Type": 1,
    "TableNo": 0,
    "VideoURL": "",
    "InitDate": "2024-02-06T11:55:34Z",
    "ModDate": "2024-03-06T22:20:46Z",
    "StartDate": "2024-03-06T19:56:39Z",
    "EndDate": "2024-03-06T22:07:28Z",
    "ScheduledDate": "2024-03-06T20:00:00Z",
    "FrameScores": "3-118 (95), 107-9, 81-0 (81), 29-94 (94)<br/>0-121 (121), 0-69, 0-131 (124)",
    "Sessions": "",
    "Note": "Referee: <a href=https://twitter.com/tatiana_referee>Tatiana Woollaston</a>",
    "ExtendedNote": "",
    "HeldOver": false,
    "StatsURL": ""
  }
]

The event started with 12 players, resulting in a total of 11 matches, with each match featuring two players.

In the response, we can observe that we have Player1ID and Player2ID for each object. However, we lack information regarding which players participated in the match, as we only have their IDs. Fortunately, the snooker.org API is very robust, allowing us to utilize another endpoint to retrieve player details by ID — https://api.snooker.org/?p=5.

The response of the above API is a singleton list and looks like this:

[
  {
    "ID": 5,
    "Type": 1,
    "FirstName": "Ronnie",
    "MiddleName": "",
    "LastName": "O'Sullivan",
    "TeamName": "",
    "TeamNumber": 0,
    "TeamSeason": 0,
    "ShortName": "R O'Sullivan",
    "Nationality": "England",
    "Sex": "M",
    "BioPage": "https://snooker.org/plr/bio/rosullivan.shtml",
    "Born": "1975-12-05",
    "Twitter": "ronnieo147",
    "SurnameFirst": false,
    "License": "",
    "Club": "",
    "URL": "https://www.facebook.com/TheRealRonnieOSullivan",
    "Photo": "https://snooker.org/img/players/rosullivan.jpg",
    "PhotoSource": "",
    "FirstSeasonAsPro": 1992,
    "LastSeasonAsPro": 0,
    "Info": "Class of '92, OBE (2016)",
    "NumRankingTitles": 41,
    "NumMaximums": 15,
    "Died": ""
  }
]

From our observations, we can see that the second API (get player by ID) is dependent on the first API because the response of the first API contains the players’ IDs. Therefore, we must first call the first API to determine which players competed against each other, and then we can call the get players endpoint to retrieve the player data.

Now that we have our data and know what to do, let’s explore some of the ways we can consume these two endpoints.

Synchronous (Blocking) Approach

The first way we are going to consume the resources is in a synchronous or blocking approach. This is by far the most straightforward way of doing things, and it aligns with our usual thought process: executing the first instruction, then the second instruction… because most of the time, our code is synchronous.

So, the code looks something like this:

try (HttpClient httpClient = HttpClient.newBuilder().build()) {
    Integer eventId = 1761;
    List<GetMatchResponse> matchResponses = getMatchesOfAnEvent(httpClient, eventId);

    Set<Integer> playerIds = matchResponses.stream()
            .flatMap(match -> Stream.of(match.getPlayer1Id(), match.getPlayer2Id()))
            .collect(Collectors.toSet());

    List<GetPlayerResponse> playerResponses = new ArrayList<>();

    for (Integer playerId : playerIds) {
        playerResponses.add(getPlayerByIdSync(httpClient, playerId));
    }
    playerResponses.forEach(item -> System.out.println(item.getFirstName() + " " + item.getLastName()));
}

We are using the java.net.http.HttpClient class, but feel free to use a different one because the result will be the same. So, delving line by line into the code, we see that:

  1. We get the matches by event id
  2. We identify the ids of the players that played in those matches
  3. For each player, we call the 2nd API and get back their details
  4. We print each player’s full name.
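
The helper methods referenced above (getMatchesOfAnEvent and getPlayerByIdSync) aren’t shown in the snippet, so here is a minimal standalone sketch of how they could look. The header constants are placeholders (you obtain the real values from snooker.org), and this version returns the raw JSON body, whereas the article’s helpers deserialize it with Gson into GetMatchResponse/GetPlayerResponse objects:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SnookerApiSketch {
    // Placeholder header: request your real header name/value from snooker.org
    static final String HEADER_NAME = "X-Requested-By";
    static final String HEADER_VALUE = "YourAppIdHere";

    // URI builders for the two endpoints used in this article
    static String matchesUri(int eventId) {
        return "https://api.snooker.org/?t=9&e=%d".formatted(eventId);
    }

    static String playerUri(int playerId) {
        return "https://api.snooker.org/?p=%d".formatted(playerId);
    }

    // Blocking GET returning the raw JSON body; in the article this body
    // is then deserialized with Gson into a List<GetMatchResponse>
    static String getMatchesOfAnEventRaw(HttpClient client, int eventId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(matchesUri(eventId)))
                .setHeader(HEADER_NAME, HEADER_VALUE).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Same idea for a single player by ID (deserialized into GetPlayerResponse)
    static String getPlayerByIdRaw(HttpClient client, int playerId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(playerUri(playerId)))
                .setHeader(HEADER_NAME, HEADER_VALUE).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```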

At first glance, there’s nothing wrong with the procedure above. However, let’s reconsider the use case.

To obtain the players, we need their IDs, and to acquire those IDs, we need the matches. So far, everything is good. However, there is a major improvement we can make here! The HTTP calls made for each player are independent of each other. Thus, they can be executed independently and their results joined together in the end to print their full names.

So, to summarize: We need to call the first API to retrieve the matches of an event and wait for the result. As soon as we have the result, we can then call the second API to obtain the players’ details. In a blocking approach, the total time to execute those 13 HTTP calls would be: matches_time + first_player_time + second_player_time, etc. However, using a non-blocking, asynchronous approach, we can reduce the total time.

To achieve this, we can make concurrent or parallel HTTP calls for each player. The ideal scenario would be to have a separate thread for each call to execute it. However, as the number of tasks increases, we might not be able to achieve 100% parallelization of these network tasks, but at least they would be executed concurrently across multiple cores.

Asynchronous (Non-Blocking) Approach using Callable and Future

Future… that’s such a fancy name for an interface, isn’t it? Well, if you’re familiar with Promises in JavaScript, then it’s kind of the same thing. However, if you’re not familiar with either of them, then you’ve come to the right place.

If you’re familiar with Java threads, you’ll know that their task is expressed through the Runnable interface, whose run method we override. When the thread starts, that code is executed on a different thread. However, there’s a small flaw: run returns void and cannot return a value. So, if we wanted to compute something, or perhaps execute an HTTP call and return the result, there would be no way to get it back.

Enter Callable!

Callable is similar to the Runnable interface in that it runs in the background on a different thread than the main thread. However, instead of returning nothing (void), its call method returns a value that we can retrieve, hence the name ‘callable’.
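
The contrast between the two interfaces can be seen in a tiny standalone sketch (unrelated to the snooker API):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RunnableVsCallable {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Runnable: runs on another thread, but run() is void -- no result to read
            Runnable sideEffect = () -> System.out.println("working on " + Thread.currentThread().getName());
            pool.submit(sideEffect);

            // Callable: call() returns a value, which we read through Future.get()
            Callable<Integer> computation = () -> 6 * 7;
            Future<Integer> result = pool.submit(computation);
            System.out.println("result = " + result.get()); // blocks until the value is ready, prints "result = 42"
        } finally {
            pool.shutdown();
        }
    }

    // Small helper so the behaviour is easy to verify
    static int computeOnPool() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(() -> 6 * 7).get();
        } finally {
            pool.shutdown();
        }
    }
}
```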

So, now we know that to achieve our goal of writing asynchronous (non-blocking) code, we’ll need the following:

  1. Design a class that will implement the Callable<T> interface, which will run on a separate thread to make the HTTP call.
  2. A thread pool to execute our code.

This may sound intimidating, but the good news is that there are easier ways and high-level classes to achieve our goal of writing non-blocking code, and we will cover them shortly.

So, our class could look like this:

static class HttpRequestCallable implements Callable<String> {
    private final HttpClient httpClient;
    private final HttpRequest httpRequest;

    public HttpRequestCallable(HttpClient httpClient, HttpRequest httpRequest) {
        this.httpClient = httpClient;
        this.httpRequest = httpRequest;
    }

    @Override
    public String call() throws Exception {
        return httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString()).body();
    }
}

I have made it a static nested class inside my main class, for teaching purposes. The HttpClient will execute the request, and we will return the body of the response, which will later be deserialized into an object. (Note that there are other ways to design this class.)

So, compared with the blocking approach, we still need to make the first call synchronously (to get the matches of an event). However, now we will execute the players’ HTTP requests on different threads, which requires us to incorporate a thread pool into our code.

After implementing these changes, our code will appear as shown below. Let’s discuss it.

try (HttpClient httpClient = HttpClient.newBuilder().build();
     ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors())) {
    Integer eventId = 1761;
    List<GetMatchResponse> matchResponses = getMatchesOfAnEvent(httpClient, eventId);

    Set<Integer> playerIds = matchResponses.stream()
            .flatMap(match -> Stream.of(match.getPlayer1Id(), match.getPlayer2Id()))
            .collect(Collectors.toSet());

    List<GetPlayerResponse> playerResponses = new ArrayList<>();
    List<Future<String>> playersFutures = new ArrayList<>();

    for (Integer playerId : playerIds) {
        Future<String> playerFuture = executorService.submit(
                new HttpRequestCallable(httpClient, HttpRequest.newBuilder(URI.create(
                        "https://api.snooker.org/?p=%d".formatted(playerId)
                )).setHeader(HEADER_NAME, HEADER_VALUE).GET().build())
        );
        playersFutures.add(playerFuture);
    }
    while (!playersFutures.stream().allMatch(Future::isDone)) {
        // Do something else while the tasks are still running
    }

    for (var future : playersFutures) {
        playerResponses.addAll(gson.fromJson(future.get(),
                new TypeToken<ArrayList<GetPlayerResponse>>() {}));
    }
    playerResponses.forEach(item -> System.out.println(item.getFirstName() + " " + item.getLastName()));
}

We have created a thread pool with as many threads as cores available on our machine, which we can obtain using Runtime.getRuntime().availableProcessors(). (For I/O-bound work like HTTP calls, a larger pool is often used, but this keeps the example simple.)

We will submit callable instances into this thread pool (using executorService.submit) and, in return, we will receive a Future for each callable we submit.

The pizza analogy for this is that you submit an order (the callable) to the order machine (executorService), and in return, you receive a device that alerts you when the pizza is ready. That device is your Future object. Just as the pizza is not baked immediately, your task is not executed immediately. In our case, it’s an HTTP call, but in another scenario, it could be a heavy database operation that takes dozens of seconds to complete. In this situation, we can either:

  1. Block the main thread until the requests are completed and the responses are retrieved (by calling the get method on each Future instance). It's still not too bad, because those calls are executed on different threads, so the total execution time is roughly max(player_1_request, player_2_request..., player_n_request), whereas in the blocking approach we had their sum.
  2. Poll for completion status using the linear or procedural flow by calling the isDone method on the Future instance (inside the while loop). Here, we are employing a "greedy" approach by performing as many operations as possible while waiting for all the requests to finish.
while (!playersFutures.stream().allMatch(Future::isDone)) {
    // Do something else while the tasks are still running
}
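
Note that busy-waiting like this burns a CPU core doing nothing. An alternative worth knowing about is the timed get(timeout, unit) overload of Future, which blocks with an upper bound instead of spinning. A standalone sketch (not using the snooker API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetSketch {
    // Wait for each future with an upper bound instead of polling isDone in a loop
    static List<Integer> collectWithTimeout(List<Future<Integer>> futures) throws Exception {
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            try {
                results.add(f.get(5, TimeUnit.SECONDS)); // blocks at most 5s per task
            } catch (TimeoutException e) {
                f.cancel(true); // give up on a task that is too slow
            }
        }
        return results;
    }

    static List<Integer> demo() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= 3; i++) {
                final int n = i;
                futures.add(pool.submit(() -> n * n)); // quick tasks: 1, 4, 9
            }
            return collectWithTimeout(futures); // returns [1, 4, 9]
        } finally {
            pool.shutdown();
        }
    }
}
```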

So, once we know that the requests have been executed, we can proceed to execute the next block of code to deserialize the response JSON body into our object. It’s worth noting that I’m using Google’s Gson for serialization/deserialization, but you can use any other library as the result will be the same.

for (var future : playersFutures) {
    playerResponses.addAll(gson.fromJson(future.get(),
            new TypeToken<ArrayList<GetPlayerResponse>>() {}));
}
playerResponses.forEach(item -> System.out.println(item.getFirstName() + " " + item.getLastName()));

While this approach is much better than the blocking one, there is still room for improvement. Even though we are executing the requests in different threads, we still need to check if all of them have finished before calling the get() method. Since the get() method of a Future instance is blocking, if a request has just been initiated and takes, say, 4 seconds to complete, and we immediately call get(), then the calling thread (in our case, the main thread) will block until the result is ready. This is not ideal; we want this task to be executed on a different thread and retrieve its result only when we know that it has already finished (by checking with the isDone method).

If you’ve been pondering callbacks this whole time, then you’re absolutely right. While the Future approach is much better than the blocking one (for this scenario), we still need to manually check if the task has finished so we can deserialize it and perform n operations. If these Future instances were more futuristic (ba dum tss), we would have been able to provide a callback function to handle the deserialization and as many other tasks as we wish. In the end, we’d have the result ready, just like a freshly baked pizza.

If you’re unfamiliar with callbacks, think of it this way: you’re telling the JVM that when my HTTP response is retrieved, then deserialize it into an object and add that object to this list — while I am doing other things in the code. Or, with the pizza analogy, it’s like saying when my pizza is baked, please add some spicy sauce and maybe some other sauce.

Luckily for us, there is a high-level class that lets us do exactly this :)

Asynchronous (Non-Blocking) Approach using CompletableFuture

CompletableFuture helps us create and manage asynchronous operations, making code more readable. We no longer have to write all those Future and Callable instances because the CompletableFuture class already implements the Future interface and also the CompletionStage interface, which allows us to add callbacks. In my opinion, it’s a major revolution in the Java language and should be our go-to class for asynchronous operations.

So let’s see how we can write a CompletableFuture. If you are using HttpClient from java.net.http, you can send both synchronous and asynchronous HTTP calls. The latter returns a CompletableFuture<HttpResponse<T>>. First, let’s send a synchronous HTTP call with our client and wrap the response in a CompletableFuture.

public static CompletableFuture<HttpResponse<String>> getPlayerById(HttpClient httpClient, Integer id) {
    HttpRequest playerRequest = HttpRequest.newBuilder(URI.create(
            "https://api.snooker.org/?p=%d".formatted(id)
    )).setHeader(HEADER_NAME, HEADER_VALUE).GET().build();
    return CompletableFuture.supplyAsync(() -> {
        try {
            return httpClient.send(
                    playerRequest, HttpResponse.BodyHandlers.ofString()
            );
        } catch (Exception e) {
            // handle exceptions your way
            throw new RuntimeException(e);
        }
    });
}

Using the static method CompletableFuture.supplyAsync, we can supply a task (a Supplier), and in return we get a CompletableFuture, a promise that at some point in the future either a result of type T or an exception will be available. This model provides a clear separation between the initiation of an asynchronous operation and the consumption of its result. The supplied code runs asynchronously on another thread (by default, one from ForkJoinPool.commonPool()). However, if we’re using java.net.http.HttpClient, we can directly send an async request, which returns a CompletableFuture for us.
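
It's also worth knowing that supplyAsync (like every *Async method on CompletableFuture) has an overload that accepts your own Executor, which is useful when you want to size or isolate the pool yourself. A small standalone sketch, unrelated to the HTTP code above:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SupplyAsyncExecutorSketch {
    static int lengthOfUpper(String input) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Two-argument overload: the supplier runs on our pool,
            // not on ForkJoinPool.commonPool()
            CompletableFuture<String> cf =
                    CompletableFuture.supplyAsync(input::toUpperCase, pool);
            // Callbacks chain exactly as with the default executor
            return cf.thenApply(String::length).get();
        } finally {
            pool.shutdown();
        }
    }
}
```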

public static CompletableFuture<HttpResponse<String>> getPlayerById(HttpClient httpClient, Integer id) {
    HttpRequest playerRequest = HttpRequest.newBuilder(URI.create(
            "https://api.snooker.org/?p=%d".formatted(id)
    )).setHeader(HEADER_NAME, HEADER_VALUE).GET().build();
    return httpClient.sendAsync(
            playerRequest, HttpResponse.BodyHandlers.ofString()
    );
}

Let’s have a look at our main method now.

try (HttpClient httpClient = HttpClient.newBuilder().build()) {
    Integer eventId = 1761;

    List<GetMatchResponse> matchResponses = getMatchesOfAnEvent(httpClient, eventId);
    Set<Integer> playerIds = matchResponses.stream()
            .flatMap(match -> Stream.of(match.getPlayer1Id(), match.getPlayer2Id()))
            .collect(Collectors.toSet());

    // The callbacks below may run concurrently on different threads,
    // so we use a thread-safe list here
    List<GetPlayerResponse> playerResponses = Collections.synchronizedList(new ArrayList<>());
    List<CompletableFuture<Void>> playerCFs = new ArrayList<>();

    for (Integer playerId : playerIds) {
        CompletableFuture<Void> playerCF = getPlayerById(httpClient, playerId)
                .thenApply(response -> gson.fromJson(response.body(), new TypeToken<ArrayList<GetPlayerResponse>>() {}))
                .thenAccept(playerResponses::addAll)
                .thenRun(() -> System.out.println("Woah, we are done!"));
        playerCFs.add(playerCF);
    }

    CompletableFuture.allOf(playerCFs.toArray(new CompletableFuture[0])).join();
    playerResponses.forEach(item -> System.out.println(item.getFirstName() + " " + item.getLastName()));
}

So, unlike with the Future interface, we don’t need to create a thread pool ourselves and submit tasks to it. While the HTTP call for the first API (matches of an event) remains the same as before, let’s take a closer look at this block of code:

for (Integer playerId : playerIds) {
    CompletableFuture<Void> playerCF = getPlayerById(httpClient, playerId)
            .thenApply(response -> gson.fromJson(response.body(), new TypeToken<ArrayList<GetPlayerResponse>>() {}))
            .thenAccept(playerResponses::addAll)
            .thenRun(() -> System.out.println("Woah, we are done!"));
    playerCFs.add(playerCF);
}

This looks like synchronous code. We are sending the HTTP request and then chaining the callbacks one by one, very declaratively compared to the imperative approach we had with the Future instances.

We are basically saying that once my response is ready (of type HttpResponse<String>), then take the body of it and deserialize it into an ArrayList<GetPlayerResponse> (the external API returns a singleton list rather than a single object). Once it has been deserialized into an object (ArrayList<GetPlayerResponse>), add this entire list to our playerResponses list. Finally, print ‘Woah, we are done!’ .

Do you agree that this approach is much easier and better than having to create Future instances, manually check if they are done, and then manipulate the results? Here, we just provide callbacks, and all these operations are done in the background for us! The .thenAccept and .thenRun stages consume a value or run an action without producing one, hence the CompletableFuture<Void> playerCF. In the end, all we need to do is join the results (wait for all of them to finish), and then we can print them.

CompletableFuture.allOf(playerCFs.toArray(new CompletableFuture[0])).join();
playerResponses.forEach(item -> System.out.println(item.getFirstName() + " " + item.getLastName()));

CompletableFuture.allOf treats all the completable futures as one, and join() blocks the main thread until all of them have completed. Just like in the Future approach, we have reduced the execution time of the 12 player HTTP calls from
sum(player_1_time, player_2_time, player_3_time…, player_n_time) to roughly max(player_1_time, player_2_time, player_3_time…, player_n_time).
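
One thing the example above does not show is error handling: if any stage in the chain throws, the remaining stages are skipped and the exception surfaces from join() wrapped in a CompletionException. CompletableFuture lets us attach a recovery callback with exceptionally (or handle). A standalone sketch with a simulated fetch (no real HTTP involved):

```java
import java.util.concurrent.CompletableFuture;

public class CfErrorHandlingSketch {
    // Simulates a fetch that may fail; exceptionally() supplies a fallback
    // so the chain completes normally instead of throwing from join()
    static String fetchOrFallback(boolean fail) {
        return CompletableFuture.supplyAsync(() -> {
                    if (fail) throw new IllegalStateException("HTTP call failed");
                    return "{\"FirstName\":\"Ronnie\"}";
                })
                .thenApply(String::toUpperCase) // skipped entirely on failure
                .exceptionally(ex -> "{}")      // fallback body on failure
                .join();
    }
}
```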

Conclusion

We have seen the difference between writing our code in a synchronous and an asynchronous manner. Most of the time, we think sequentially in our day-to-day lives, and we do the same when we write code. However, there are cases where it is better to take advantage of multiple threads and execute things concurrently and in parallel.

I have been running my experiments on my MacBook Pro M1 with 8 cores and achieved the following results for this scenario:

  • The synchronous execution of the program took around 3.2 seconds.
  • The asynchronous execution of the program took around 1.1–1.3 seconds.

We can clearly see that in a situation like this, where you have independent tasks, you can take advantage of executing your tasks in different threads. However, as we have seen, this logic is different, harder, and more intimidating than writing synchronous code, which is what we usually do.

In my opinion, you should follow this approach only when you really need to. Sometimes, you might do all this work just to save 100–200 milliseconds, especially when the task is not very suitable to be run asynchronously or in parallel. As we know, in software engineering, almost everything is a trade-off, and there’s no perfect software. For example, if you had a scenario where you perform five simple gets on the database, you might think you could run all these tasks asynchronously if they are unrelated to each other. However, in this case, you would instantly occupy five connections from the connection pool, and in the end, you would save 50–60 milliseconds at most, sacrificing five connections at once that could have been used by five different requests.

If you have any questions, don’t hesitate to contact me. To deepen your knowledge on this topic, consider reading:

  1. https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html
  2. https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Future.html
  3. https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ExecutorService.html
