Why does tomcat source code look like this?

Part I: the link between sockets and servlets

rucheng zhang
4 min read · Mar 21, 2022

Background

The Servlet interface was introduced as a standard way to handle HTTP requests. Take a glance at the definition of the Servlet interface below: the service API accepts a standard ServletRequest and fills in a standard ServletResponse. That means the HTTP server needs to transform the TCP socket stream into these objects before processing. And since there can be many Servlet implementations, how does the HTTP server map different URL requests to the corresponding Servlet? What we want is a container: whenever we hand it a ServletRequest, it returns a ServletResponse, no matter what the URL looks like.

On the other hand, we need the system to process clients’ requests as fast as possible.

package javax.servlet;

import java.io.IOException;

public interface Servlet {

    // called once by the container after the servlet is instantiated
    void init(ServletConfig config) throws ServletException;

    ServletConfig getServletConfig();

    // called for every request dispatched to this servlet
    void service(ServletRequest req, ServletResponse res) throws ServletException, IOException;

    String getServletInfo();

    // called once by the container before the servlet is unloaded
    void destroy();
}
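Here is a minimal sketch of one possible Servlet implementation (the class name HelloServlet is just for illustration, not anything from Tomcat). It only touches the standardized ServletRequest/ServletResponse objects; everything socket-related has already been handled before service is called.

import java.io.IOException;
import javax.servlet.Servlet;
import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class HelloServlet implements Servlet {
    private ServletConfig config;

    @Override
    public void init(ServletConfig config) throws ServletException {
        this.config = config;            // called once by the container
    }

    @Override
    public ServletConfig getServletConfig() {
        return config;
    }

    @Override
    public void service(ServletRequest req, ServletResponse res)
            throws ServletException, IOException {
        // By the time we get here, the container has already turned the raw
        // socket stream into a ServletRequest, so we only use standard objects.
        res.setContentType("text/plain");
        res.getWriter().write("hello");
    }

    @Override
    public String getServletInfo() {
        return "HelloServlet";
    }

    @Override
    public void destroy() {
        // called once before the container unloads the servlet
    }
}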

Key problems that need to be solved

Firstly, Tomcat needs to handle the TCP socket, which means it has to do the following steps (sketched in plain Java NIO right after this list).

  • accept a new client and register the new client socket with the Selector.
  • copy the client socket buffer from the Linux kernel into Tomcat’s user space to receive the data.
  • parse the HTTP/HTTPS protocol and convert the socket stream into a raw request.
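This is not Tomcat’s actual code; it is a simplified sketch of those three steps using the plain JDK NIO mechanics that Tomcat builds on (the HTTP parsing itself is left as a comment).

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NioStepsSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(8192);
        while (true) {
            selector.select();                                   // wait until something is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // step 1: accept a new client and register its socket with the Selector
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // step 2: copy bytes from the kernel socket buffer into user space
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) {
                        client.close();
                        continue;
                    }
                    buffer.flip();
                    // step 3: hand the bytes to an HTTP/HTTPS parser to build a raw request
                }
            }
        }
    }
}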

Secondly, Tomcat needs to resolve the URL info of the request and dispatch different hosts, services, and methods to the corresponding servlet to handle.

Thirdly, Tomcat needs to send the response data back to the client.

What needs to be taken into consideration

There are several steps when processing the network stream, including accepting a new client, reading/writing data from/to the socket, analyzing the protocol, and so on. Meanwhile, the time consumed by each step is different. How do we design a framework that uses the CPU efficiently and assigns a proper number of threads to each step?

Another aspect we need to consider is that we could have many implementations of the Servlet interface, so Tomcat should have a container that dispatches each client request, according to its URL, to the corresponding service implementation.

Let’s build it step by step

To solve the key problems described above, we arrive at a prototype Tomcat framework.

Note that the Service is introduced as a wrapper to hold the Connector and the Servlet Container. A Service can have multiple Connectors: you can think of a micro-service application as a Service, and each listening port of the server as a Connector.
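A minimal sketch of that wrapper relationship, with made-up class names rather than Tomcat’s real classes: one Service owns several Connectors and exactly one Container.

import java.util.ArrayList;
import java.util.List;

public class ServiceSketch {

    interface Connector {        // listens on one port with one protocol
        void start();
    }

    interface Container {        // holds the servlets and dispatches requests to them
        void start();
    }

    static class Service {
        private final List<Connector> connectors = new ArrayList<>();
        private Container container;

        void addConnector(Connector connector) {
            connectors.add(connector);
        }

        void setContainer(Container container) {
            this.container = container;
        }

        void start() {
            container.start();                        // bring up the container first
            connectors.forEach(Connector::start);     // then open every listening port
        }
    }
}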

As you can see from the picture above, to be more specific, Tomcat needs to do the following steps:

  1. listen on a network port and accept client requests
  2. parse the byte stream according to the network protocol and generate a ServletRequest
  3. let the Servlet Container handle the request and return a ServletResponse

The Servlet Container dynamically loads the service classes which implement the Servlet interface, and it dispatches each request to the corresponding service class for processing (a simplified dispatch sketch follows).
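The sketch below only illustrates the dispatch idea: the container keeps a mapping from URL patterns to Servlet instances and routes each request to the matching one. Tomcat’s real mapping (hosts, contexts, wrappers) is far richer; the class name and the exact-match lookup here are assumptions for illustration.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.servlet.Servlet;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class ContainerSketch {
    private final Map<String, Servlet> servletMap = new HashMap<>();

    // the container loads Servlet implementations and registers them by URL pattern
    public void addServlet(String urlPattern, Servlet servlet) {
        servletMap.put(urlPattern, servlet);
    }

    public void service(ServletRequest req, ServletResponse res)
            throws ServletException, IOException {
        String uri = ((HttpServletRequest) req).getRequestURI();
        Servlet servlet = servletMap.get(uri);        // exact-match lookup only
        if (servlet != null) {
            servlet.service(req, res);
        }
        // a real container would return 404 when nothing matches
    }
}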

Tomcat uses different components for these steps, which can be generally described as follows.

  • Endpoint receives and sends the client’s socket data and handles the TCP/IP protocol.
  • Processor handles the HTTP protocol: it receives sockets from the Endpoint and generates a coyote Request.

Instead of generating a ServletRequest directly here, Tomcat uses an Adapter to decouple the Connector and the Container. Beautiful design!

  • Adapter builds a bridge between the Connector and the Servlet Container, converting the coyote request into a standardized ServletRequest (a simplified sketch follows).
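This is a simplified sketch of the Adapter idea, with assumed class and method names rather than Tomcat’s exact API: the Connector side produces a protocol-level (coyote-style) request, and the Adapter converts it into a standard ServletRequest before calling the Container.

import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class AdapterSketch {

    static class CoyoteRequest { /* raw, protocol-level request fields */ }
    static class CoyoteResponse { /* raw, protocol-level response fields */ }

    interface Container {
        void service(ServletRequest req, ServletResponse res) throws Exception;
    }

    static class Adapter {
        private final Container container;

        Adapter(Container container) {
            this.container = container;
        }

        // the Connector calls this with protocol-level objects; the Container
        // only ever sees the standard servlet ones
        void service(CoyoteRequest req, CoyoteResponse res) throws Exception {
            ServletRequest servletReq = wrap(req);
            ServletResponse servletRes = wrap(res);
            container.service(servletReq, servletRes);
        }

        private ServletRequest wrap(CoyoteRequest req) {
            return null;    // placeholder: wrap the raw request in a ServletRequest facade
        }

        private ServletResponse wrap(CoyoteResponse res) {
            return null;    // placeholder: wrap the raw response in a ServletResponse facade
        }
    }
}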

Now the framework becomes something that looks like this; notice that we combine the Endpoint and the Processor together and put them inside the ProtocolHandler.

If you are familiar with the Linux kernel, you know the epoll model used when processing sockets. The server needs to register each client with the selector after accepting it. However, there is a processing-speed gap between accepting and registering a new client on one side and sending or receiving data on already-registered sockets on the other. What’s more, protocol processing takes even longer compared to socket transfer.

Since NIO has already become the most popular technology for processing network sockets, let’s zoom in on the Endpoint component and see what it looks like.

The NioEndpoint mainly contains five components, namely LimitLatch, Acceptor, Poller, SocketProcessor, and Executor (a simplified pipeline sketch follows the list).

  • LimitLatch is used as a threshold for the maximum number of connections.
  • Acceptor runs in a single thread to accept new client connections; it generates a Channel instance which is transferred to the Poller.
  • Poller is essentially a Selector. It maintains a queue of Channels (the PollerEvents in the source code), keeps checking which Channels are ready, and generates a new SocketProcessor instance which is thrown into the Executor thread pool to be processed.
  • SocketProcessor is a payload that combines the socket and the processor; it invokes the process method inside the Processor.
  • Executor is a Tomcat-customized thread pool, which is in charge of executing the runnable SocketProcessors.
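Here is a simplified pipeline sketch of that flow: a single Acceptor thread accepts connections and hands them to the Poller through a queue; the Poller registers and selects readable channels and submits runnable tasks (the SocketProcessor role) to the Executor. The names mirror Tomcat’s components, but the code is only an illustration, not the real NioEndpoint.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NioEndpointSketch {
    private final Queue<SocketChannel> pollerEvents = new ConcurrentLinkedQueue<>();
    private final ExecutorService executor = Executors.newFixedThreadPool(10);
    private final Selector selector = Selector.open();
    private final ServerSocketChannel server = ServerSocketChannel.open();

    public NioEndpointSketch() throws IOException {
        server.bind(new InetSocketAddress(8080));
    }

    // Acceptor: a single thread that accepts new connections and queues them for the Poller.
    void acceptorLoop() throws IOException {
        while (true) {
            SocketChannel client = server.accept();
            client.configureBlocking(false);
            pollerEvents.add(client);
            selector.wakeup();               // let the Poller pick up the queued channel
        }
    }

    // Poller: a single thread that registers queued channels and selects the readable ones.
    void pollerLoop() throws IOException {
        while (true) {
            SocketChannel pending;
            while ((pending = pollerEvents.poll()) != null) {
                pending.register(selector, SelectionKey.OP_READ);
            }
            selector.select(1000);
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                SocketChannel channel = (SocketChannel) key.channel();
                // SocketProcessor role: a runnable payload handed to the Executor thread pool
                executor.execute(() -> process(channel));
            }
        }
    }

    private void process(SocketChannel channel) {
        // read from the channel and run the protocol Processor here
    }
}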

Conclusion

One important rule in computer system design is decoupling: almost anything can be decoupled by adding another middle layer. Another basic rule is to use the CPU as much as possible, which means less context switching and splitting apart components that have a processing-speed gap between them. When dealing with services or components that have such a speed gap inside your application, don’t hesitate to give them different numbers of threads. Using a queue to bridge two services or thread groups is another basic multi-threaded design pattern (a minimal sketch follows).
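As a closing illustration, here is a minimal sketch of that last point: two stages with different processing speeds, each given its own thread pool, bridged by a bounded BlockingQueue. All names and pool sizes here are arbitrary.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StageBridgeSketch {

    static void process(String request) {
        // stand-in for the slow work (e.g. protocol parsing, servlet processing)
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);

        // fast stage: a couple of threads are enough (e.g. accepting / reading sockets)
        ExecutorService ioStage = Executors.newFixedThreadPool(2);
        // slow stage: more threads, because each task takes longer
        ExecutorService workStage = Executors.newFixedThreadPool(8);

        ioStage.execute(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put("request-" + i);       // blocks when the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        for (int i = 0; i < 8; i++) {
            workStage.execute(() -> {
                try {
                    while (true) {
                        String request = queue.take();   // blocks when the queue is empty
                        process(request);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }
}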
