Exploring Project Loom: A Revolution in JVM Concurrency

Uğur Atçı · Trendyol Tech · 10 min read · Jun 15, 2023

Image taken from https://wiki.openjdk.org/display/loom/Main

As the Export Center team at Trendyol, we primarily work on REST API and Kafka consumer and producer projects, and we have established a standard with our teammates: when we implement consumer or producer projects, we use Golang for its low resource consumption and high throughput.

Another standard we follow is using Kotlin and the Spring Framework for API development when we implement our business logic. Previously, we mostly used Java, but like many other teams, we also enjoy writing code in Kotlin.

Spring Boot has two operating models: blocking and non-blocking. In the blocking model, a request is made to a Spring Boot application, and the thread handling that request blocks until a response is generated and sent back to the client. During this blocking period, the thread cannot handle other requests. We can use synchronous database drivers (PostgreSQL, MSSQL, Redis), where each request to the database blocks the executing thread until the response is received. This approach simplifies the codebase and allows straightforward transaction management using traditional Spring Data JPA or JDBC templates. When interacting with external services, such as HTTP APIs of other domains, blocking IO can be a pragmatic choice. Blocking IO with synchronous service clients allows for straightforward request/response handling, where each call blocks the thread until the response is received. This can be suitable for scenarios where service calls are expected to be fast and the application can afford to have a thread wait for each response.
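To make the blocking model concrete, here is a minimal, self-contained sketch (class and endpoint names are ours, and a tiny in-process server stands in for a downstream service): the synchronous send() call blocks the calling thread until the full response has arrived.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BlockingIoDemo {
    public static void main(String[] args) throws Exception {
        // A tiny local server standing in for a downstream service
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/ping", exchange -> {
            byte[] body = "pong".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        int port = server.getAddress().getPort();

        // send() blocks the calling thread until the response is fully received
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:" + port + "/ping"))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("blocking call returned: " + response.body());
        server.stop(0);
    }
}
```

While the thread waits inside send(), it can do nothing else; in a servlet container with a fixed worker pool, every in-flight request occupies one such thread.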

Starting from Spring Framework 5 and Spring Boot 2, there is support for non-blocking operations through the integration of the Reactor project and the introduction of the WebFlux module. With WebFlux, we can build reactive, non-blocking applications on the reactive Netty runtime. WebFlux is designed to handle a large number of concurrent requests efficiently. It uses non-blocking IO to process requests asynchronously, allowing better utilization of system resources and improved scalability. As the Export Center team, we have also experienced some problems here: adopting reactive programming and understanding the reactive streams model can have a steeper learning curve than the traditional blocking IO approach. We must understand concepts like the reactive types (Flux and Mono) and how to handle backpressure. Also, not all existing libraries, frameworks, or databases are designed for reactive programming or provide reactive counterparts. In cases where we need to integrate with legacy systems or libraries that are blocking, we may need to bridge between reactive and blocking code, which can add complexity.

As the Export Center team, we are searching for an approach that is easy to learn and easy to apply, with less manual JVM thread management. Enter Project Loom, an ambitious open-source initiative aiming to revolutionize concurrency on the JVM. In this article, we’ll delve into the world of Project Loom, exploring its goals, benefits, and potential impact on JVM-based development.

Project Loom aims to drastically reduce the effort of writing, maintaining, and observing high-throughput concurrent applications that make the best use of available hardware.

— Ron Pressler (Tech lead, Project Loom)

Understanding Project Loom

Project Loom, led by the OpenJDK community, aims to introduce lightweight concurrency primitives to JVM-based languages, offering developers a new programming model called virtual threads, or fibers. Unlike traditional threads, virtual threads are lightweight and highly scalable, enabling the creation of millions of threads without excessive resource consumption. The underlying goal is to make highly concurrent programming in these languages simpler, more efficient, and less error-prone. Virtual threads (or fibers) can scale to hundreds of thousands or even millions, whereas good old OS-backed JVM threads could only scale to a couple of thousand.

To discuss Virtual Threads, we have to get familiar with a few basic concepts: Fibers, Continuations, and of course, Virtual Threads.

Fibers, also known as virtual threads, are a core concept introduced by Project Loom. Fibers provide a lightweight, user-space concurrency mechanism for the execution of concurrent tasks with minimal overhead. They are designed to be highly scalable, enabling the creation of millions of fibers without consuming excessive system resources.

Continuations are used to implement virtual threads (fibers), enabling efficient scheduling and execution. Each fiber is associated with a continuation, which captures the fiber’s execution state when it gets suspended and allows it to be resumed later. When a fiber encounters a blocking operation, such as waiting for I/O or accessing a synchronized resource, it can suspend itself by capturing its continuation and saving its execution context. The underlying fiber scheduler can then schedule other fibers for execution, avoiding unnecessary thread blocking and resource wastage. Once the blocking operation is completed or the resource becomes available, the fiber’s continuation is invoked, and it resumes execution from the point where it left off. Continuations provide a powerful mechanism for managing the flow of execution in concurrent programs. They enable fibers to yield control without blocking threads, allowing for efficient utilization of system resources. This cooperative multitasking model, based on continuations, helps mitigate the scalability limitations and overhead associated with traditional thread-based concurrency.
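A small sketch of this suspend-and-resume behavior using the virtual-thread API (the class name is ours; requires JDK 21, or JDK 19+ with preview features enabled): when the virtual thread reaches Thread.sleep(), its continuation is captured and the carrier thread is released; after the sleep, it may resume on a different carrier thread, which the printed thread names reveal.

```java
public class SuspendDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().name("my-vthread").start(() -> {
            // Before the blocking call: mounted on some carrier thread
            System.out.println("before sleep: " + Thread.currentThread());
            try {
                Thread.sleep(100); // continuation captured; carrier thread is freed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // After resuming: possibly mounted on a different carrier thread
            System.out.println("after sleep: " + Thread.currentThread());
        });
        vt.join();
    }
}
```

The toString() of a virtual thread includes the carrier (a ForkJoinPool worker), so running this a few times typically shows the thread resuming on different carriers.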

It’s important to note that Project Loom and the concepts are still under development at the time of writing. The APIs and features may evolve as Project Loom progresses.

What are Java Virtual Threads?

Virtual threads are quite similar to coroutines, such as the goroutines made famous by the Go programming language (Golang).

  • Virtual Thread

    Thread.startVirtualThread(() -> {
        System.out.println("Hello, Virtual Thread!");
    });

  • Goroutine

    go func() {
        println("Hello, Goroutine!")
    }()

  • Kotlin Coroutine

    runBlocking {
        launch {
            println("Hello, Kotlin coroutine!")
        }
    }

The Advantages of Virtual Threads

One of the key advantages of virtual threads is their lightweight nature. Traditional threads consume significant memory and entail high context-switching overhead. In contrast, virtual threads are far more efficient, allowing developers to create and manage many concurrent tasks without exhausting system resources. This scalability is particularly beneficial for applications requiring massive concurrency handling, such as web servers or event-driven frameworks.

Let’s compare Kotlin coroutines, Java threads, and Loom virtual threads.

  • Kotlin coroutines

    import kotlinx.coroutines.*
    import kotlin.time.ExperimentalTime
    import kotlin.time.measureTime

    @OptIn(ExperimentalTime::class)
    suspend fun main(args: Array<String>) {
        measureTime {
            supervisorScope {
                repeat(100_000) {
                    launch(Dispatchers.IO) {
                        blockingHttpCall()
                    }
                }
            }
        }.also { println("Dispatcher.io completed, time: $it") }
    }

    fun blockingHttpCall() {
        Thread.sleep(100)
    }

In this code block, we launch blocking calls and measure how long they take. Inside the measureTime function, we have a supervisorScope block. supervisorScope is a coroutine builder that creates a new coroutine scope and ensures that any exceptions occurring in child coroutines do not cancel the entire scope.

Within the repeat(100_000) loop, we launch 100,000 coroutines using the launch builder. The launch builder launches a new coroutine and assigns it to the Dispatchers.IO context, which is a dispatcher optimized for performing I/O operations.

Inside each launched coroutine, we call the blockingHttpCall() function. This function represents a blocking HTTP call and suspends the coroutine for 100 milliseconds using Thread.sleep(100). This simulates a time-consuming operation, such as making an HTTP request.

Execution time :

Dispatcher.io completed, time: 2m 42.048028042s
  • Java Threads

    import java.util.concurrent.CompletionService;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorCompletionService;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class Main {
        public static void main(String[] args) throws InterruptedException, ExecutionException {
            int numberOfTasks = 100_000;

            ExecutorService executorService = Executors.newFixedThreadPool(64);

            long startTime = System.currentTimeMillis();

            CompletionService<Void> completionService = new ExecutorCompletionService<>(executorService);

            for (int i = 0; i < numberOfTasks; i++) {
                completionService.submit(() -> {
                    blockingHttpCall();
                    return null;
                });
            }

            for (int i = 0; i < numberOfTasks; i++) {
                completionService.take().get();
            }

            long endTime = System.currentTimeMillis();
            long elapsedTime = endTime - startTime;

            System.out.println("Java threads completed, time: " + elapsedTime);

            executorService.shutdown();
        }

        public static void blockingHttpCall() throws InterruptedException {
            Thread.sleep(100);
        }
    }

In this code snippet, we create an ExecutorService using Executors.newFixedThreadPool() with a pool size of 64 threads. This represents a thread pool where up to 64 tasks can be executed concurrently. (This matches the default parallelism limit of Kotlin’s Dispatchers.IO, which is 64 and is configurable via the system property named by IO_PARALLELISM_PROPERTY_NAME.)

We use the CompletionService class to submit each task. CompletionService is a wrapper around the executor service that allows us to retrieve completed tasks in the order of completion. In this case, we create a CompletionService<Void> instance using the executor service.

Inside the loop, we submit each task using completionService.submit(). The task is defined as a lambda expression that calls the blockingHttpCall() method. The lambda returns null since it is submitted as a Callable that must return a result.

After submitting all the tasks, we iterate again numberOfTasks times and use completionService.take().get() to retrieve the completed tasks. The take() method blocks until a task is completed, and get() returns the result of the completed task. Since we don't have a result to return (Void type), we ignore it.

We measure the elapsed time by calculating the difference between the start and end times. Finally, we print the completion time and call executorService.shutdown() to shut down the executor service.

Execution time :

Java threads completed, time: 161956

// 161,956 milliseconds ≈ 2.70 minutes

[Figure: JVM threads while running the Java ExecutorService example]
  • Loom Virtual Threads

    import java.util.concurrent.Executors
    import kotlinx.coroutines.*
    import kotlin.time.ExperimentalTime
    import kotlin.time.measureTime

    @OptIn(ExperimentalTime::class)
    suspend fun main(args: Array<String>) {
        measureTime {
            supervisorScope {
                repeat(100_000) {
                    launch(Dispatchers.LOOM) {
                        blockingHttpCall()
                    }
                }
            }
        }.also { println("Dispatcher.loom completed, time: $it") }
    }

    val Dispatchers.LOOM: CoroutineDispatcher
        get() = Executors.newVirtualThreadPerTaskExecutor().asCoroutineDispatcher()

    fun blockingHttpCall() {
        Thread.sleep(100)
    }

The measureTime function measures the execution time of the block of code inside it. Inside the supervisorScope, we repeat the execution of the block 100,000 times. Each iteration launches a new virtual thread using launch and executes the blockingHttpCall function. The Dispatchers.LOOM property is defined to provide a CoroutineDispatcher backed by a virtual thread executor. It uses Executors.newVirtualThreadPerTaskExecutor() to create an executor that assigns a new virtual thread to each task. The asCoroutineDispatcher() extension function converts the executor to a CoroutineDispatcher object. The blockingHttpCall function simply sleeps the current thread for 100 milliseconds to simulate a blocking operation.

Execution time :

Dispatcher.loom completed, time: 671.839500ms

// 671 milliseconds ≈ 0.011 minutes

[Figure: JVM threads while running the Project Loom virtual thread example]

Comparing the execution times:

  • Kotlin Coroutines and Java Threads have similar execution times, both around 2 minutes and 41–42 seconds. This is expected because they both utilize a thread pool with a fixed number of threads to execute the tasks. However, Kotlin Coroutines provide a more concise and structured way of writing concurrent code compared to Java Threads.
  • Loom Virtual Threads significantly improve execution time, completing the tasks in approximately 671 milliseconds (less than a second). This is much faster compared to both Kotlin Coroutines and Java Threads. Loom Virtual Threads leverage lightweight, highly concurrent virtual threads, allowing for the efficient execution of large numbers of tasks without thread pooling.

The key difference between the two Kotlin examples (coroutines and virtual threads) is that the blocking function directly uses Thread.sleep(), which blocks the thread. For a fair comparison, we also need to test with a non-blocking function. The non-blocking version uses delay() from the Kotlin coroutines library, which suspends the coroutine without blocking the thread, allowing other tasks or coroutines to proceed concurrently.

Kotlin coroutines with non-blocking call

import kotlinx.coroutines.*
import kotlin.time.ExperimentalTime
import kotlin.time.measureTime

@OptIn(ExperimentalTime::class)
suspend fun main(args: Array<String>) {
    measureTime {
        supervisorScope {
            repeat(10_000) {
                launch(Dispatchers.IO) {
                    nonBlockingIO()
                }
            }
        }
    }.also { println("Dispatcher.io completed, time: $it") }
}

suspend fun nonBlockingIO() {
    // Simulating a time-consuming task
    delay(100)
}

The nonBlockingIO function is called within each coroutine. It uses delay(100) to simulate a time-consuming task that suspends the coroutine for 100 milliseconds without blocking the underlying thread.

Execution time :

Dispatcher.io completed, time: 222.506375ms

Loom Virtual Threads with non-blocking call

import java.util.concurrent.Executors
import kotlinx.coroutines.*
import kotlin.time.ExperimentalTime
import kotlin.time.measureTime

@OptIn(ExperimentalTime::class)
suspend fun main(args: Array<String>) {
    measureTime {
        supervisorScope {
            repeat(10_000) {
                launch(Dispatchers.LOOM) {
                    nonBlockingIO()
                }
            }
        }
    }.also { println("Dispatcher.loom completed, time: $it") }
}

val Dispatchers.LOOM: CoroutineDispatcher
    get() = Executors.newVirtualThreadPerTaskExecutor().asCoroutineDispatcher()

suspend fun nonBlockingIO() {
    // Simulating a time-consuming task
    delay(100)
}

In this version, nonBlockingIO will run on virtual threads instead of the default IO dispatcher.

Execution time :

Dispatcher.loom completed, time: 183.138625ms

Overall, Loom Virtual Threads demonstrate a significant performance and resource utilization advantage, providing a more scalable and efficient solution for concurrent programming compared to traditional Java thread approaches. Combining coroutines with virtual threads can provide a powerful concurrency solution. We have seen an execution time of 183 ms. Coroutines can express fine-grained concurrency within a virtual thread, enabling developers to write more sequential and readable code while leveraging the efficient execution environment provided by virtual threads. The combination allows for structured concurrency and improved resource utilization while achieving better performance and scalability.

Compatibility and Integration

Project Loom aims to integrate virtual threads into existing Java frameworks and APIs seamlessly. By design, the goal is to ensure compatibility with existing thread-based libraries and frameworks, allowing developers to leverage the benefits of virtual threads without requiring extensive code modifications. This compatibility-driven approach enables a smooth transition to Project Loom, making it easier for developers to adopt and benefit from this new concurrency model.
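As a sketch of this compatibility (the class name and task count are ours; requires JDK 21, where ExecutorService is AutoCloseable): the blocking task code stays exactly as it would be in a thread-pool application, and only the executor is swapped for a virtual-thread-per-task executor.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

public class CompatDemo {
    public static void main(String[] args) throws Exception {
        // Existing blocking task code is unchanged; only the executor differs.
        List<Callable<Integer>> tasks = IntStream.range(0, 1000)
                .mapToObj(i -> (Callable<Integer>) () -> {
                    Thread.sleep(50); // blocking call simply parks the virtual thread
                    return i;
                })
                .toList();

        int sum = 0;
        // One virtual thread per task instead of a bounded platform-thread pool
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (Future<Integer> f : executor.invokeAll(tasks)) {
                sum += f.get();
            }
        }
        System.out.println("sum = " + sum); // prints "sum = 499500"
    }
}
```

Because the ExecutorService interface is unchanged, frameworks and libraries that accept an executor can adopt virtual threads without any modification to the task code they run.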

Starting from Spring Framework 6 (and Spring Boot 3), virtual threads are officially supported, but they are a preview feature of Java 19. This means we need to tell the JVM we want to enable them in our application.

java --enable-preview --release 19 SpringBootMain.java

import java.util.concurrent.Executors
import org.apache.coyote.ProtocolHandler
import org.springframework.boot.web.embedded.tomcat.TomcatProtocolHandlerCustomizer
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.core.task.AsyncTaskExecutor
import org.springframework.core.task.support.TaskExecutorAdapter
import org.springframework.scheduling.annotation.EnableAsync

@EnableAsync
@Configuration
open class ThreadConfig {

    @Bean
    open fun applicationTaskExecutor(): AsyncTaskExecutor {
        return TaskExecutorAdapter(Executors.newVirtualThreadPerTaskExecutor())
    }

    @Bean
    open fun protocolHandlerVirtualThreadExecutorCustomizer(): TomcatProtocolHandlerCustomizer<*> {
        return TomcatProtocolHandlerCustomizer { protocolHandler: ProtocolHandler ->
            protocolHandler.executor = Executors.newVirtualThreadPerTaskExecutor()
        }
    }
}

The applicationTaskExecutor bean is defined as an AsyncTaskExecutor, which is responsible for executing asynchronous tasks. The executor is configured to use Executors.newVirtualThreadPerTaskExecutor(), which creates a thread executor that assigns a new virtual thread to each task. This ensures that the tasks are executed using virtual threads provided by Project Loom.

The protocolHandlerVirtualThreadExecutorCustomizer bean is defined to customize the protocol handler for Tomcat. It returns a TomcatProtocolHandlerCustomizer, which is responsible for customizing the protocol handler by setting its executor. The executor is set to Executors.newVirtualThreadPerTaskExecutor(), ensuring that Tomcat uses virtual threads for handling requests.

By including this configuration class in your Spring Boot application, you enable asynchronous processing and configure the thread executor to use virtual threads. This allows your application to benefit from the concurrency advantages provided by Project Loom.

Conclusion

Project Loom represents a significant step forward in JVM concurrency. By introducing lightweight virtual threads, it aims to simplify the development of highly concurrent applications while improving performance and scalability. Stay tuned for the latest updates on Project Loom, as it has the potential to reshape how we approach concurrency in JVM-based development. We plan to move each of our services to Spring Boot 3.0 and JDK 19 so we can quickly adopt virtual threads.


Be a part of something great! Trendyol is currently hiring. Visit the pages below for more information and to apply.
