Jetty: Navigating Thread Starvation in Threadpool-Based Servers

Venkat Peri
7 min read · Nov 25, 2023


In the dynamic world of web servers, threadpool-based architectures like those in Jetty, IIS, and Tomcat play a crucial role in efficiently managing web requests. However, these servers can face significant challenges when thread starvation occurs. Thread starvation happens when all available threads in the pool are occupied, leaving no room for new requests and causing issues that range from performance degradation to complete service outages.

This is the second in a series of articles about managing pool health in threadpool-based servers. Here we look at scenarios that can cause threadpool starvation in Jetty and ways to mitigate them, including circuit breaker middleware that load sheds when the threadpool is near capacity.

Jetty: The Agility and Its Pitfalls

Jetty, known for its lightweight nature and high performance, excels in handling a large number of simultaneous short-lived requests. However, it can struggle under the weight of thread starvation.

Thread Starvation in Jetty: A Scenario

Imagine a scenario where Jetty is deployed in an e-commerce application. Long-running requests, such as complex database queries for product recommendations, consume all available threads during peak shopping. New customer requests for product pages or checkout processes get queued, leading to slow response times or even timeouts. This not only affects user experience but can also result in lost sales.

Jetty’s Request Handling Mechanism

Jetty operates on a threadpool architecture where each incoming HTTP request is assigned to a thread from the pool. This model is typically efficient because it allows multiple requests to be handled concurrently, assuming each request can be processed relatively quickly.

Threadpool Configuration in Jetty

Jetty’s threadpool is configurable, allowing administrators to set the maximum number of threads based on expected traffic and server capacity. The configuration might look something like this:

QueuedThreadPool threadPool = new QueuedThreadPool();
threadPool.setMaxThreads(200); // cap the pool at 200 worker threads
Server server = new Server(threadPool); // the server draws its request threads from this pool

In this example, Jetty is configured to handle up to 200 concurrent requests.

The Flow of a Request in Jetty

  1. Incoming Request: A customer accesses the e-commerce site, triggering an HTTP request to Jetty.
  2. Thread Allocation: Jetty allocates a thread from its pool to handle this request. If the threadpool is depleted, requests are added to a queue.
  3. Request Processing: The thread processes the request, which may involve executing a database query for product recommendations (a minimal blocking handler is sketched below).
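
To make the flow concrete, here is a minimal sketch (not from any particular application) of a blocking Jetty handler. The pool thread that enters handle stays occupied for the entire duration of the work; the sleep is a stand-in for a long-running recommendation query.

import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class RecommendationHandler extends AbstractHandler {
    @Override
    public void handle(String target, Request baseRequest, HttpServletRequest request,
                       HttpServletResponse response) throws IOException, ServletException {
        // Stand-in for a long-running database query; the pool thread blocks here.
        try {
            Thread.sleep(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        response.setContentType("text/plain");
        response.getWriter().write("product recommendations");
        baseRequest.setHandled(true); // tell Jetty the request has been handled
    }

    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);
        server.setHandler(new RecommendationHandler());
        server.start();
        server.join();
    }
}

With 200 such threads all parked in the simulated query, the 201st request has nowhere to run, which is exactly the saturation scenario described next.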

The Challenge During Peak Periods

During peak shopping times, such as holiday sales, concurrent users on the site increase significantly. This surge can lead to scenarios where the threadpool is stretched to its limits.

Scenario of Threadpool Saturation

  1. Long-Running Queries: Complex queries for personalized product recommendations are computationally intensive and take longer.
  2. Thread Occupation: These long-running requests occupy a thread for an extended duration.
  3. Threadpool Exhaustion: The available threads are quickly utilized as more such requests come in.

Consequences of Saturation

  • Queuing of Incoming Requests: New customer requests, like accessing product pages or initiating checkout processes, are forced to wait in the queue. If the threads in the threadpool are deadlocked, nothing short of a server restart can fix the situation. All queued requests will be lost forever, resulting in a very poor user experience.
  • Slow Response Times: The queued requests experience slow processing, leading to a noticeable lag in page loading and transaction processing.
  • Timeouts and Errors: In extreme cases, requests may time out or fail, leading to error messages being displayed to the customers.

The Impact on the E-Commerce Application

  • Poor User Experience: Customers face delays and glitches, leading to frustration and a negative shopping experience.
  • Potential Loss of Sales: Prolonged wait times and failures in processing transactions can result in customers abandoning their carts, directly impacting sales.

Typical Mitigation Strategies

To address these challenges, the following strategies can be employed:

  1. Optimizing Database Queries: Ensuring that the queries for product recommendations are as efficient as possible can reduce their execution time.
  2. Implementing Caching Mechanisms: Caching frequently accessed data reduces the need for repeated long-running database queries.
  3. Asynchronous Processing: Modifying the application to handle long-running operations asynchronously can help free up threads for other requests (a sketch follows this list).
  4. Monitoring and Auto-Scaling: Implement real-time monitoring to track threadpool usage and set up auto-scaling to adjust the number of threads dynamically based on demand.
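
As a sketch of the asynchronous approach (strategy 3 above), the Servlet 3.0+ async API lets a handler hand the slow work off and return the container thread to the pool immediately. The servlet below is illustrative, with a sleep standing in for the recommendation query; in a real application the work would typically run on a dedicated executor rather than the common pool.

import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.util.concurrent.CompletableFuture;

// asyncSupported must be true, otherwise startAsync() throws IllegalStateException
@WebServlet(urlPatterns = "/recommendations", asyncSupported = true)
public class AsyncRecommendationServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) {
        // Detach the request from the container thread; that thread goes back to the pool now.
        AsyncContext asyncContext = request.startAsync();
        asyncContext.setTimeout(30_000); // fail the request rather than hold it forever

        CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(5_000); // stand-in for the long-running recommendation query
                asyncContext.getResponse().getWriter().write("product recommendations");
            } catch (Exception e) {
                ((HttpServletResponse) asyncContext.getResponse())
                        .setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            } finally {
                asyncContext.complete(); // finish the response and release the async context
            }
        });
    }
}

The slow query still occupies a thread somewhere, but it is no longer one of Jetty's request threads, so page loads and checkouts keep being served while recommendations are computed.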

A Real-World Challenge: The Refresh Dilemma

Consider a typical web application scenario where a user accesses a page that triggers a long-running query on the server, such as retrieving extensive data from a database. The user, impatient with the wait time, decides to hit the refresh button. This action cancels the original request and closes the client’s socket connection, but here’s where the crux of the problem lies for servers like Jetty and Tomcat.

The Java Limitation

In Java-based servers like Jetty and Tomcat, when a client closes a connection (for example, by pressing refresh or closing the browser), the server-side handler is not immediately notified of the disconnection. In Java, the only way to detect that a socket has been closed is to write to it, at which point an exception is thrown. But in a typical request/response flow the handler only writes once the request has finished processing, so in practice there is no way to detect a closed socket mid-request.

In contrast, IIS, with its .NET framework, can use cancellation tokens to handle such scenarios more gracefully.

The Compounding Effect

When the user presses refresh, the following happens:

  1. Original Request Continues: The server continues processing the initial request, unaware that the client is no longer waiting for the response.
  2. New Request Initiated: Simultaneously, a new request is made to the server, and a new thread is allocated to handle it.
  3. Threadpool Strain: If multiple users repeat this behavior (hitting refresh on long-running operations), it can rapidly consume available threads.
  4. Zombie Handlers: The server ends up with numerous ‘zombie’ handlers — threads processing requests where the results will never be used because the client has already disconnected.

Resulting Server Overload

This scenario can quickly lead to threadpool exhaustion, severely impacting the server’s ability to handle new, legitimate requests. The server’s resources are tied up in processing data that will ultimately be discarded, leading to inefficient operations and potential server downtime.

Active Load Shedding in Jetty

There are several ways to trigger load shedding in Jetty, notifying callers that the server is running at capacity. By doing so, downstream load balancers and callers have the option to reroute requests to other instances or retry at a later time.

Bounding the Threadpool Queue

When Jetty’s threadpool is exhausted, incoming requests are queued. We can put an upper bound on the queue size to avoid accepting an unbounded backlog of pending requests. When the queue is full, Jetty will reject additional requests with a 503 Service Unavailable error code.

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.util.BlockingArrayQueue;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

import java.util.concurrent.BlockingQueue;

public class JettyServer {
    public static void main(String[] args) throws Exception {
        // QueuedThreadPool takes its request queue via the constructor, so the bound on
        // pending requests is set by passing a fixed-capacity queue.
        BlockingQueue<Runnable> queue = new BlockingArrayQueue<>(50); // max 50 queued requests
        QueuedThreadPool threadPool = new QueuedThreadPool(200, 8, 60_000, queue); // max 200 threads

        // Create a server with the configured thread pool
        Server server = new Server(threadPool);

        // Configure the rest of your server (handlers, connectors, etc.)
        // ...

        // Start the server
        server.start();
        server.join();
    }
}

Jetty Circuit Breaker Middleware

Another approach is to use circuit breaker middleware for Jetty that intercepts incoming requests, checks the threadpool status, and decides whether to process the request or return a 503 Service Unavailable error. This approach is preferable when low latency is essential: rather than queue requests, we would prefer to reroute them to other instances.

import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.HandlerWrapper;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class CircuitBreakerHandler extends HandlerWrapper {
    private final QueuedThreadPool threadPool;
    private final int threshold;

    public CircuitBreakerHandler(QueuedThreadPool threadPool, int threshold) {
        this.threadPool = threadPool;
        this.threshold = threshold;
    }

    @Override
    public void handle(String target, Request baseRequest, HttpServletRequest request,
                       HttpServletResponse response) throws IOException, ServletException {
        if (threadPool.getIdleThreads() <= threshold) {
            // Shed load: reject the request instead of letting it tie up a scarce thread.
            response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE, "Server too busy");
            baseRequest.setHandled(true);
        } else {
            // Pass the request through to the wrapped application handler.
            super.handle(target, baseRequest, request, response);
        }
    }

    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);
        QueuedThreadPool threadPool = (QueuedThreadPool) server.getThreadPool();
        int threshold = 10; // Set the threshold as needed

        CircuitBreakerHandler circuitBreakerHandler = new CircuitBreakerHandler(threadPool, threshold);
        // circuitBreakerHandler.setHandler(applicationHandler); // wrap the real application handler here
        server.setHandler(circuitBreakerHandler);

        server.start();
        server.join();
    }
}

In this example:

  • CircuitBreakerHandler extends HandlerWrapper, which lets it sit in front of the application's handler and intercept every request before it reaches the application.
  • The constructor takes a QueuedThreadPool instance and a threshold value. The threshold is the number of idle threads at or below which the circuit breaker trips.
  • In the handle method, the handler checks whether the number of idle threads is at or below the threshold. If so, it sends a 503 Service Unavailable error; otherwise, it passes the request on to the wrapped handler.
  • In the main method, a Jetty Server is set up with this handler wrapping the application's handler.

Important Notes

  • Threshold Value: The threshold value should be carefully chosen based on the application's expected load and performance characteristics.
  • Complementary Strategies: While this circuit breaker can help prevent threadpool exhaustion, it should be used in conjunction with other performance optimization strategies, like query optimization, caching, and load balancing.

Thread starvation in threadpool-based servers like Jetty, IIS, and Tomcat can lead to severe service disruptions. In Java-based servers like Jetty and Tomcat, the inability to promptly detect closed connections compounds the problem, especially with long-running queries. By understanding the potential impact of this issue in different scenarios and implementing effective mitigation strategies, organizations can ensure that their web services remain responsive and reliable, even under high-load conditions. Careful planning of queue sizes and strategic deployment of specialized middleware, such as the circuit breaker middleware shown above, can help protect servers during heavy loads. A proactive approach maintains optimal server performance and ensures a positive user experience.

