Looking Under the Hood: HTTP Over TCP Sockets

Published in The Startup, Sep 10, 2020

In software engineering we love abstractions. They take care of the tedious details and let us put our attention where it belongs. However, there is value in understanding how they do what they do (take this advice from Joel). Following this guidance, I decided to tackle an embarrassingly long-standing gap in my knowledge: translating TCP packets into HTTP responses.

This post goes through the steps of writing an HTTP client similar to curl. The steps are:

Creating a socket
Establishing a TCP connection
Sending an HTTP request
Reading the HTTP response

The code examples in this post are in C. The purpose of this choice is to be as close as possible to the system calls the kernel provides.

The Boilerplate

Now that we are under the hood, we are exposed to the overhead of establishing a TCP connection. First we need to create a socket, then use it to initiate a TCP handshake.

int sockfd = socket(AF_INET, SOCK_STREAM, 0);

The example above uses the socket function, a thin wrapper around the system call of the same name (it comes from the POSIX sockets API rather than the C standard library). The type `SOCK_STREAM` reflects the stream-oriented nature of the TCP protocol. This will become relevant later.
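As a small sketch of what a defensive version of this step could look like, the helper below checks the return value of socket. The name `open_tcp_socket` is made up for this post; it is not part of any library.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns a TCP/IPv4 socket descriptor,
   or -1 if the kernel could not create one. */
int open_tcp_socket(void) {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    perror("socket"); /* prints the reason stored in errno */
  }
  return fd;
}
```

The returned descriptor plays the role of `sockfd` in the rest of the post.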

A web server is a machine that waits for clients to connect to it. We connect to it with the connect command, which tells the kernel to initiate the handshake mentioned above. The handshake is a costly process, so HTTP clients usually provide ways to avoid repeating it, such as reusing a connection for several requests. Our naive implementation will not do that.

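int portno = 80; // default HTTP port
struct sockaddr_in serv_addr;
struct hostent *server;

// resolve the host name to an address
server = gethostbyname("www.wikipedia.com");
if (server == NULL) {
  fprintf(stderr, "ERROR, no such host\n");
  exit(1);
}

// build the socket address from the resolved address and the port
memset(&serv_addr, 0, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);
serv_addr.sin_port = htons(portno);

// error() is assumed to be a helper that reports the failure and exits
if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
  error("ERROR connecting");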

Here is what happens above: we resolve the host name to an address, combine that address with the port number to create a socket address, and use the socket address to connect to the host. If the C language constructs are not familiar to you, do not worry. It is unlikely you'll ever need to learn them. If you are hopelessly curious, here are all the details you will wish you hadn't asked for.
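For comparison, here is a rough sketch of the same "resolve, then build a socket address" step using getaddrinfo, the newer POSIX resolver interface. This is a swapped-in alternative, not what the code above does, and the helper name `resolve_ipv4` is an assumption made for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/* Hypothetical helper: resolves host and port to the first IPv4
   address found. Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, const char *port,
                 struct sockaddr_in *out) {
  struct addrinfo hints, *res;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_INET;       /* IPv4 only, like AF_INET above */
  hints.ai_socktype = SOCK_STREAM; /* TCP */

  if (getaddrinfo(host, port, &hints, &res) != 0) {
    return -1;
  }

  /* the first result already carries family, address and port */
  memcpy(out, res->ai_addr, sizeof(*out));
  freeaddrinfo(res);
  return 0;
}
```

The filled-in address can then be passed to connect, cast to `struct sockaddr *` exactly as before.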

The Ask

We just overcame the TCP entry barrier and are ready for the first request. We will initiate the conversation with the best-known pickup line in the HTTP playbook: the `GET /` request. Every server falls for that.

char get_req[] = "GET / HTTP/1.1\r\n\r\n";
int byte_count = write(sockfd, get_req, strlen(get_req));
if (byte_count < 0) error("ERROR writing to socket");

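Onto the fun part! Let's start reading RFC 2616 to understand what is going on. The request string follows the structure defined in section 5. It doesn't have any headers, which also means there is no message body. Hence the double `\r\n` (CRLF) marks the end of the request. Strictly speaking, HTTP/1.1 requires a Host header on every request (section 14.23), so a strict server may answer this one with 400 Bad Request; we leave the header out to keep the example minimal.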

Next up is the write command. As well as the socket and the request string, this command expects the length of the input. In high-level languages this is done for you (see Python's send method). In C, the function signature mirrors the system call. This is why we are writing in C.
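One detail the snippet above glosses over is that write may accept fewer bytes than we asked for. A minimal sketch of a loop that keeps writing until everything is sent could look like this; `write_all` is a hypothetical helper name, not something from the original code.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: calls write() until all len bytes are sent.
   Returns 0 on success, -1 on error. */
int write_all(int fd, const char *data, size_t len) {
  size_t sent = 0;

  while (sent < len) {
    ssize_t n = write(fd, data + sent, len - sent);
    if (n < 0) {
      if (errno == EINTR) continue; /* interrupted by a signal, retry */
      return -1;
    }
    sent += (size_t)n; /* advance past the bytes the kernel accepted */
  }
  return 0;
}
```

With it, the request would be sent as `write_all(sockfd, get_req, strlen(get_req))`.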

The Return

It is time to receive our very first HTTP response. This is where the stream-oriented nature of TCP makes things interesting.

In a data stream there is no notion of an individual message. TCP does not provide boundaries between one chunk of data and another. Hence, much like the write command, the read command also expects a length (count) argument.

The count argument tells the kernel how much of the incoming byte stream the user wants to read. The kernel reads this much data from the beginning of the stream. It removes the data from the stream and returns it to the caller. The next read call repeats this operation, starting from the new beginning.

One common scenario is that the amount of data we are asking for has not arrived at the socket yet. In that case the kernel returns as much data as the socket has, along with the number of bytes it actually read from the stream. It is up to the caller to handle these partial reads. One approach is to keep reading (or polling the socket) until all the data is available or a timeout has been reached.
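To make the partial read handling concrete, here is a sketch that keeps calling read until it has collected the requested number of bytes. It assumes a blocking socket and has no timeout; `read_exact` is a name invented for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: reads until count bytes have arrived.
   Returns the number of bytes read (less than count only if the
   peer closed the connection), or -1 on error. */
ssize_t read_exact(int fd, char *buf, size_t count) {
  size_t got = 0;

  while (got < count) {
    ssize_t n = read(fd, buf + got, count - got);
    if (n < 0) {
      if (errno == EINTR) continue; /* interrupted by a signal, retry */
      return -1;
    }
    if (n == 0) break; /* end of stream: the peer closed */
    got += (size_t)n;  /* partial read, ask for the remainder */
  }
  return (ssize_t)got;
}
```

Each pass through the loop asks only for the bytes still missing, so data that arrives in several pieces ends up contiguous in `buf`.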

All right, let's get back to our HTTP response.

Welcome HTTP

The challenge is that HTTP messages vary in length. To understand how much we should read from the stream, we need to inspect the HTTP specification. Section 6 defines the structure of an HTTP response:

https://tools.ietf.org/html/rfc2616#section-6

In plain terms, an HTTP response has a status line, a headers section (which might be empty) and an optional response body. The status line and each header end with the same CRLF delimiter we used in the request. The headers section is separated from the message body by one more CRLF, that is, an empty line.
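As an illustration of the first piece of that structure, the sketch below pulls the status code out of a status line such as `HTTP/1.1 200 OK`. The function name `parse_status_code` is an assumption for this example and does not appear in the original implementation.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the numeric status code from a
   status line. Returns the code, or -1 if the line is malformed. */
int parse_status_code(const char *line) {
  int major, minor, code;

  /* expected shape: HTTP/<major>.<minor> <code> <reason phrase> */
  if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3) {
    return -1;
  }
  return code;
}
```

The reason phrase after the code is ignored here, since clients generally act on the number alone.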

The message body does not end with a delimiter. Instead, the HTTP protocol provides other ways to tell the client where the body ends. One of them is the Content-Length header. Section 4.4 explains the other methods.

OK, armed with this knowledge we can come up with a game plan for HTTP responses that include a Content-Length header:

Read from the socket until the first delimiter, this is the status line
Read from the socket until the next delimiter, this is the first header
Repeat the last step until we come across two delimiters back to back, this is the end of the headers section
Read the content length from the headers, read as much data as length suggests from the socket, this is the message body
Mission accomplished, we read the exact whole message, not a byte less or more
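The fourth step of the plan hinges on turning one header line into a number. Here is a minimal sketch of that conversion, assuming the line has already been isolated from the stream. `parse_content_length` is a hypothetical name. The comparison ignores case because header names are case-insensitive.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: if line is a Content-Length header, stores
   its value in *out and returns 0. Returns -1 otherwise. */
int parse_content_length(const char *line, size_t *out) {
  const char *name = "Content-Length:";
  size_t name_len = strlen(name);
  char *end;
  unsigned long value;

  if (strncasecmp(line, name, name_len) != 0) {
    return -1; /* some other header */
  }

  /* strtoul skips the optional whitespace after the colon */
  value = strtoul(line + name_len, &end, 10);
  if (end == line + name_len) {
    return -1; /* no digits found */
  }

  *out = (size_t)value;
  return 0;
}
```

The resulting value is the number of body bytes to read from the socket once the headers are done.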

One last hurdle to tackle is reading from the socket until a certain delimiter. The read command doesn't provide such functionality. Hence we have to read a fixed-size chunk and search the data for the delimiter. If it is not found, we load another chunk. At some point we will receive the delimiter.

Once the delimiter is found, we can process all the data up to that point. The implementation should keep a reference to the data that came after the delimiter. This data constitutes the beginning of the next structure in the response.
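The search and the "keep what came after" bookkeeping can be sketched with two small functions. Both names, `find_line_end` and `drop_prefix`, are invented here and only approximate the idea; they are not the helpers used in the actual implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first byte after
   the first CRLF in buf[0..len), or -1 if there is no CRLF yet. */
long find_line_end(const char *buf, size_t len) {
  size_t i;

  for (i = 0; i + 1 < len; i++) {
    if (buf[i] == '\r' && buf[i + 1] == '\n') {
      return (long)(i + 2);
    }
  }
  return -1;
}

/* Hypothetical helper: discards the first consumed bytes and moves
   the leftover data to the front. Returns the new data length. */
size_t drop_prefix(char *buf, size_t len, size_t consumed) {
  memmove(buf, buf + consumed, len - consumed);
  return len - consumed;
}
```

When `find_line_end` returns -1, the caller reads another chunk, appends it after the leftover bytes and searches again.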

Below is a simplified implementation:

void parse_headers(int sockfd, char *buffer, char *content_length) {
  bool reached_message_body = false;

  while (!reached_message_body) {
    // append CHUNK_SIZE data to the buffer
    load_buffer(sockfd, buffer, CHUNK_SIZE);

    // look for the delimiter index in the buffer,
    // if not found return -1
    int next_line_start = find_next_line(buffer);

    // loop while there is a delimiter and
    // we have not reached the message body yet
    while (next_line_start != -1 && !reached_message_body) {
      log_header(buffer, next_line_start);

      // keep a reference to the content length header
      // for reading the message body
      copy_if_content_length(buffer, next_line_start, content_length);

      // remove the parsed header from the buffer
      release_line(buffer, next_line_start);

      // check if we reached the start of the message body
      // this looks for two delimiters back to back
      reached_message_body = is_message_body_start(buffer);

      // we might have read multiple headers in one read.
      // before we read more data, check if
      // there are any other headers we can consume
      if (!reached_message_body) {
        next_line_start = find_next_line(buffer);
      }
    }
  }
}

The rest of the implementation is available on GitHub at grandbora/knowledge_gap. It handles only HTTP responses that have a Content-Length header. Extending the functionality to cover chunked HTTP responses is left as an exercise for the reader. You are welcome.

Bye.

Written by Bora Tunca