Scaling Up IO Tasks in Java

Rahul Saha
Nov 19, 2020 · 4 min read

If you have worked on an application that talks to external services such as databases, message queues, or web APIs, or that uses the file system heavily, you are probably aware of how much IO, especially over the network, slows operations down.

There is nothing wrong with IO. But IO-bound operations are generally drastically slower than CPU-bound operations. CPU operations are typically measured in nanoseconds or microseconds, while IO operations over the network are typically measured in milliseconds.

This slowness is not noticeable on a small scale, when only a handful of IO calls are made. But an enterprise-grade application might need to process millions of records within a fixed time frame. There, IO is generally the slowest part and probably the deciding factor in the performance of the application. So it becomes crucial to make the most of IO.

Suppose a business scenario requires 3 database calls that take 1, 2, and 3 seconds respectively, and this process needs to be repeated 5 times (15 calls in total).
Here we use Thread.sleep() to simulate a long-running IO call. This is a blocking call: the thread is stuck for the given duration, just as it would be during a real IO call.

A naive solution

Do the DB calls sequentially in a single thread (main thread).
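
The sequential version might look roughly like the sketch below. It is a reconstruction under assumptions, not the original listing: the class name SequentialIo, the helpers dbCall and runAll, and the unitMillis parameter are all hypothetical. The demo defaults to a scaled-down unit of 100 ms so it finishes quickly; passing 1000 as the first argument gives calls of 1, 2, and 3 seconds as in the scenario above.

```java
import java.util.concurrent.TimeUnit;

class SequentialIo {

    // Simulates a blocking IO call (e.g. a DB query) by sleeping for the given time.
    static void dbCall(int callNumber, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during simulated IO", e);
        }
        System.out.println("Completed db call #" + callNumber);
    }

    // Runs 5 rounds of 3 calls (1, 2 and 3 time units each), strictly one after another.
    // Returns the elapsed wall-clock time in milliseconds.
    static long runAll(long unitMillis) {
        long start = System.nanoTime();
        int callNumber = 0;
        for (int round = 0; round < 5; round++) {
            for (int units = 1; units <= 3; units++) {
                dbCall(++callNumber, units * unitMillis);
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Assumption: 100 ms per unit by default; pass 1000 for 1, 2 and 3 second calls.
        long unitMillis = args.length > 0 ? Long.parseLong(args[0]) : 100;
        System.out.println("completed IO calls in " + runAll(unitMillis) + " ms");
    }
}
```

Because every call blocks the only thread in use, the total time is simply the sum of all call durations.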

Output:

Completed db call #1
Completed db call #2
Completed db call #3
Completed db call #4
Completed db call #5
Completed db call #6
Completed db call #7
Completed db call #8
Completed db call #9
Completed db call #10
Completed db call #11
Completed db call #12
Completed db call #13
Completed db call #14
Completed db call #15
completed IO calls in 30045 ms

The whole process took about 30 seconds, which is expected: all the calls are blocking and run one after another in a single (main) thread, so the total is 5 × (1 + 2 + 3) = 30 seconds.

Of course, we can do better.

Async in action!

There is no point in waiting synchronously for each of these blocking operations to complete. Instead, the operations can be run asynchronously, which lets us start the next IO operation without waiting for the previous one to finish.

This should speed things up, as the operations will run in parallel rather than sequentially. However, it requires the service on the other end to handle concurrent requests. Luckily, databases, message queues, and web services generally do.

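In Java, we can use CompletableFuture to run the operations asynchronously. By default, CompletableFuture runs async tasks on the common ForkJoinPool, whose size is derived from the number of CPU cores in the machine (its default parallelism is one less than the number of available processors).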
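
A minimal sketch of this approach follows. It is an illustration under assumptions rather than the original listing: the names AsyncIoCommonPool, dbCall, runAll, and unitMillis are hypothetical, the calls are numbered in completion order, and the default unit is scaled down to 100 ms (pass 1000 as the first argument for 1, 2, and 3 second calls).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncIoCommonPool {

    // Simulates a blocking IO call by sleeping, then reports completion.
    static void dbCall(long millis, AtomicInteger completed) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during simulated IO", e);
        }
        // Numbered in completion order. The increment and the print are not one atomic
        // step, so lines may occasionally appear slightly out of order.
        System.out.println("Completed db call #" + completed.incrementAndGet());
    }

    // Submits all 15 calls up front, then waits for every one of them to finish.
    // Returns the elapsed wall-clock time in milliseconds.
    static long runAll(long unitMillis) {
        long start = System.nanoTime();
        AtomicInteger completed = new AtomicInteger();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int round = 0; round < 5; round++) {
            for (int units = 1; units <= 3; units++) {
                long millis = units * unitMillis;
                // No executor argument: runAsync falls back to the default async executor.
                futures.add(CompletableFuture.runAsync(() -> dbCall(millis, completed)));
            }
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long unitMillis = args.length > 0 ? Long.parseLong(args[0]) : 100;
        System.out.println("completed IO calls in " + runAll(unitMillis) + " ms");
    }
}
```

The elapsed time depends on how many worker threads the default executor provides on the machine running it, so the exact figure will vary.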

Output:

Completed db call #1
Completed db call #2
Completed db call #3
Completed db call #4
Completed db call #5
Completed db call #6
Completed db call #7
Completed db call #8
Completed db call #9
Completed db call #10
Completed db call #11
Completed db call #12
Completed db call #13
Completed db call #14
Completed db call #15
completed IO calls in 9068 ms

Now it takes just 9 seconds! That’s quite an improvement.

But is there room for further improvement? As a matter of fact, yes!

Async with dedicated ThreadPool

For CPU-bound tasks, the useful number of threads is limited. At most as many threads as there are CPU cores can actually run at the same time, and using more than that generally hurts performance because of context-switching overhead.

IO-bound tasks have no such limitation. We can choose a number of threads as per our requirement and it can be much higher than the number of CPU cores.

CompletableFuture by default uses the common ForkJoinPool. The parallelism of this pool is derived from Runtime.getRuntime().availableProcessors() (by default, one less than that value), and that caps how many of our tasks can run at the same time. But our operations are mostly IO-bound and barely use the CPU, so we can deploy more threads. The number of threads to use depends on performance requirements and available system resources.
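
To see what the common pool offers on a particular machine, the values can be printed directly. This is a small sketch; the class name CommonPoolInfo and the method describe are hypothetical, while availableProcessors() and ForkJoinPool.getCommonPoolParallelism() are standard JDK calls.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {

    // Builds a one-line summary of the core count and the common pool's parallelism.
    static String describe() {
        int cores = Runtime.getRuntime().availableProcessors();
        int parallelism = ForkJoinPool.getCommonPoolParallelism();
        return "available processors: " + cores
                + ", common pool parallelism: " + parallelism;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The common pool's parallelism can also be overridden with the java.util.concurrent.ForkJoinPool.common.parallelism system property, but that affects everything sharing the pool, which is one reason a dedicated pool is often preferable for IO work.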

On my machine there are 4 cores, so the common pool was limited to a handful of threads previously. I am going to deploy 16 threads now (in the next section I will discuss a more sophisticated way to estimate the number). So let us try a dedicated thread pool with 16 threads.
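
A sketch of the dedicated-pool variant is shown here. As before, it is an illustration under assumptions, not the original listing: AsyncIoDedicatedPool, dbCall, runAll, and unitMillis are hypothetical names, calls are numbered in completion order, and the default unit is scaled down to 100 ms (pass 1000 as the first argument for 1, 2, and 3 second calls).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncIoDedicatedPool {

    // Simulates a blocking IO call by sleeping, then reports completion.
    static void dbCall(long millis, AtomicInteger completed) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during simulated IO", e);
        }
        // Numbered in completion order; printing may interleave between threads.
        System.out.println("Completed db call #" + completed.incrementAndGet());
    }

    // Runs all 15 calls on a fixed pool with the given number of threads.
    // Returns the elapsed wall-clock time in milliseconds.
    static long runAll(long unitMillis, int threads) {
        ExecutorService ioPool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            AtomicInteger completed = new AtomicInteger();
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int round = 0; round < 5; round++) {
                for (int units = 1; units <= 3; units++) {
                    long millis = units * unitMillis;
                    // Passing the executor keeps these tasks off the common pool.
                    futures.add(CompletableFuture.runAsync(() -> dbCall(millis, completed), ioPool));
                }
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            // Pool threads are non-daemon, so shut the pool down to let the JVM exit.
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        long unitMillis = args.length > 0 ? Long.parseLong(args[0]) : 100;
        System.out.println("completed IO calls in " + runAll(unitMillis, 16) + " ms");
    }
}
```

With 16 threads and only 15 calls, every call gets its own thread, so the total time should be close to the duration of the single longest call.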

Output:

Completed db call #1
Completed db call #5
Completed db call #4
Completed db call #3
Completed db call #2
Completed db call #6
Completed db call #7
Completed db call #8
Completed db call #9
Completed db call #10
Completed db call #11
Completed db call #12
Completed db call #13
Completed db call #14
Completed db call #15
completed IO calls in 3038 ms

As you can see, the whole operation now completes in about 3 seconds. That is close to the best we can achieve, since the longest DB call itself takes 3 seconds. Note that the number of threads should not exceed the connection pool size (assuming each thread holds one connection exclusively, which is the commonly used design).

Calculating the optimal number of threads

In real scenarios, the workload is rarely exclusively IO-bound or CPU-bound. It is usually a mix of the two, where threads spend some percentage of their time on the CPU and the rest waiting on IO. In these scenarios, it becomes tricky to calculate the number of threads.

In order to calculate the ideal number of threads for an IO-bound task, we can use the following formula provided by Brian Goetz. If the threads spend S units of service time (running and utilizing CPU) and W units of waiting time (blocked in IO operation) and there are N processor cores, then

number of threads = N * (1 + W/S)
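
As a worked example of the formula, the sketch below plugs in made-up numbers. The class ThreadCountEstimate, its methods, and all the input values are hypothetical. Rounding up to a whole thread and capping the result at the connection pool size are assumptions added for illustration; they are not part of the formula itself.

```java
class ThreadCountEstimate {

    // threads = N * (1 + W / S), rounded up to a whole number of threads.
    static int estimateThreads(int cores, double waitTime, double serviceTime) {
        if (cores <= 0 || waitTime < 0 || serviceTime <= 0) {
            throw new IllegalArgumentException("cores and serviceTime must be positive, waitTime non-negative");
        }
        return (int) Math.ceil(cores * (1 + waitTime / serviceTime));
    }

    // Keeps the estimate within the connection pool size (one connection per thread).
    static int capToConnectionPool(int estimatedThreads, int connectionPoolSize) {
        return Math.min(estimatedThreads, connectionPoolSize);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 4 cores, 90 ms waiting on IO per 10 ms of CPU work.
        int estimate = estimateThreads(4, 90, 10);
        System.out.println("estimated threads: " + estimate); // 4 * (1 + 90/10) = 40

        // Hypothetical connection pool of 32 connections.
        System.out.println("capped threads: " + capToConnectionPool(estimate, 32)); // 32
    }
}
```

A purely CPU-bound task (W = 0) yields N threads, matching the earlier rule for CPU-bound work, while a larger wait-to-service ratio raises the estimate proportionally. W and S should come from measuring the real workload.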

Conclusion

IO operations are usually by far the slowest part of an application. But by running them asynchronously and in parallel on a suitably sized thread pool, throughput can be improved dramatically, and the application can keep scaling as load increases.

The Startup