HTTP (Hypertext Transfer Protocol) is behind almost everything we do on the Web, and yet most people don’t give it a second thought, and that’s OK! If it is doing its job it should be invisible. But as a developer or would-be developer it is something we should know about, so here I am giving it that second thought. I hope to give an overview of what it is, how HTTP/2 differs from earlier versions, and take a quick look at some places where improvements are still possible.
What is it?
Communication between client computers (namely browsers) and web servers is handled by sending and receiving HTTP requests and responses. To oversimplify it all for a minute: the browser sends a request for an HTML page, style sheet, image, data (XML or JSON) etc., the web server receives and processes the request, and then sends a response to the browser containing the required information.
HTTP was invented in 1989 by Tim Berners-Lee and his team at CERN, the European Organisation for Nuclear Research. The first documented version was released in 1991, followed by HTTP/1.0 in 1996 and HTTP/1.1 in 1997. HTTP/2 was published in 2015, the first new version since 1997, so you could say we were due an update! The goals set out by the HTTP working group “httpbis” of the Internet Engineering Task Force were to stay compatible with HTTP/1.1, to allow clients to choose between HTTP/1.1, HTTP/2 or others, and perhaps most importantly to improve performance, decrease perceived latency for the end user and make more efficient use of network and server resources. The initial draft of HTTP/2 was based almost entirely on SPDY, developed by Google.
The major differences introduced in HTTP/2 were: allowing multiple concurrent exchanges on the same connection; allowing the server to send unsolicited responses; transferring data in binary rather than text; and header field compression. I will provide a quick overview of each of these below.
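To make the request and response exchange concrete, here is a minimal sketch of what a plain-text HTTP/1.1 request looks like and how a response might be split apart. The helper names `build_request` and `parse_response`, the host `example.com` and the canned response are assumptions made for illustration; nothing is sent over the network.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal plain-text HTTP/1.1 request (illustrative helper)."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line
        f"Host: {host}",              # Host is required in HTTP/1.1
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into status code, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


request = build_request("GET", "/index.html", "example.com")

# A canned response stands in for what a web server might send back.
canned = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
status, headers, body = parse_response(canned)
```

Everything here is human-readable text separated by line breaks, which is worth keeping in mind when comparing it with the binary framing used by HTTP/2.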
- Header Field Compression
Header fields define the operating parameters of an HTTP transaction within the header section of request and response messages. Core fields are standardised and additional fields can be defined by each application. Initially these fields were quite small but they have gradually grown. HTTP/2 compresses the header fields, resulting in reduced overhead and improved performance. It also improved on the compression technique of SPDY by using HPACK to reduce the risk of compression attacks such as the CRIME attack.
Header field compression reduces the size of transfers leading to increased speed without the increased risk to security that SPDY faced, although the risk has not been totally eradicated.
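One of the ideas behind header compression is that both sides remember headers they have already exchanged, so a repeated header can be replaced by a short reference. The sketch below is a toy illustration of that indexing idea only. It is not the HPACK wire format, and the class name `ToyHeaderTable` and the header values are made up for the example.

```python
class ToyHeaderTable:
    """Toy illustration of header indexing; NOT the real HPACK encoding."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str]] = []

    def encode(self, headers: list[tuple[str, str]]) -> list[object]:
        """Replace headers seen before with their index in the table."""
        out: list[object] = []
        for header in headers:
            if header in self.entries:
                out.append(self.entries.index(header))  # small integer reference
            else:
                self.entries.append(header)
                out.append(header)  # sent in full the first time
        return out

    def decode(self, encoded: list[object]) -> list[tuple[str, str]]:
        """Rebuild the full header list, growing the table in step with the sender."""
        out: list[tuple[str, str]] = []
        for item in encoded:
            if isinstance(item, int):
                out.append(self.entries[item])
            else:
                self.entries.append(item)
                out.append(item)
        return out


sender, receiver = ToyHeaderTable(), ToyHeaderTable()

first = [(":method", "GET"), ("user-agent", "ExampleBrowser/1.0"), (":path", "/")]
second = [(":method", "GET"), ("user-agent", "ExampleBrowser/1.0"), (":path", "/style.css")]

encoded_first = sender.encode(first)    # every header sent in full
encoded_second = sender.encode(second)  # repeated headers become indexes 0 and 1

decoded_first = receiver.decode(encoded_first)
decoded_second = receiver.decode(encoded_second)
```

On the second request only the new path is sent in full, which is where the saving comes from when a browser repeats long headers on every request.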
- Multiplexing
Unlike HTTP/1.1, HTTP/2 establishes a single TCP connection between the client and the server, and makes multiple requests over this single connection. Each request/response exchange is associated with its own stream.
This is one of the key features of HTTP/2: it allows web files to be downloaded concurrently and reduces the head-of-line blocking that used to occur when each request had to wait for the one before it to be completed.
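The idea of several streams sharing one connection can be sketched as chunks tagged with a stream identifier, interleaved on the wire and put back together on the other side. This is a simplified model under stated assumptions: the functions `interleave` and `reassemble` are hypothetical, the round-robin order is just one possible schedule, and real HTTP/2 frames carry more than an identifier and a payload.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave(streams: dict[int, list[bytes]]) -> list[tuple[int, bytes]]:
    """Round-robin chunks from several streams onto one shared connection."""
    tagged = [[(sid, chunk) for chunk in chunks] for sid, chunks in streams.items()]
    frames: list[tuple[int, bytes]] = []
    for round_of_frames in zip_longest(*tagged):
        frames.extend(f for f in round_of_frames if f is not None)
    return frames


def reassemble(frames: list[tuple[int, bytes]]) -> dict[int, bytes]:
    """Group frames by stream identifier to rebuild each response body."""
    bodies: dict[int, bytes] = defaultdict(bytes)
    for sid, chunk in frames:
        bodies[sid] += chunk
    return dict(bodies)


# Two responses in flight at once; client-initiated streams use odd identifiers.
streams = {
    1: [b"<html>", b"</html>"],
    3: [b"body{", b"color:red", b"}"],
}
frames = interleave(streams)
bodies = reassemble(frames)
```

Because every frame says which stream it belongs to, a slow response on one stream does not force the others to queue behind it at the HTTP level.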
- Text vs Binary
HTTP/1.1 keeps all requests and responses in plain text, whereas HTTP/2 uses a binary framing layer to encode all messages into binary, which greatly increases the flexibility of data transfer.
Again, this can lead to increased speed and less data use, as well as being less prone to errors. Importantly for developers, the binary encoding is an added middle layer, which means that the data is still being sent and received in the same way and doesn’t require any additional work from the developer. Binary encoding also enables flow control and priority, discussed below.
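As a rough illustration of binary framing, every HTTP/2 frame starts with a fixed 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The sketch below packs and unpacks that header with Python's `struct` module. The helper names `pack_frame` and `unpack_frame` are assumptions for this example, and it ignores details such as padding and maximum frame size settings.

```python
import struct

FRAME_TYPE_DATA = 0x0   # DATA frame type
FLAG_END_STREAM = 0x1   # marks the last frame of a stream


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the 9-byte HTTP/2 frame header."""
    length = len(payload)
    if length >= 2**24:
        raise ValueError("payload does not fit in a 24-bit length field")
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # top 8 bits of the 24-bit length
        length & 0xFFFF,         # bottom 16 bits of the length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # clear the reserved top bit
    )
    return header + payload


def unpack_frame(data: bytes) -> tuple[int, int, int, bytes]:
    """Read one frame and return (type, flags, stream id, payload)."""
    high, low, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    length = (high << 16) | low
    return frame_type, flags, stream_id & 0x7FFFFFFF, data[9:9 + length]


frame = pack_frame(FRAME_TYPE_DATA, FLAG_END_STREAM, 1, b"hello")
frame_type, flags, stream_id, payload = unpack_frame(frame)
```

A parser reading this format never has to search for line breaks or guess where a message ends, since the length is stated up front.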
- Prioritisation
HTTP/2 introduced prioritisation by allowing a weighting between 1 and 256 to be assigned to individual requests, a higher number indicating higher priority. The server uses this information to determine which requests to process and in which order. This is especially useful when there is limited capacity.
Prioritisation allows more important requests to be completed more quickly, for example the key elements of a webpage appearing first for the user, improving the appearance of loading speed.
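One simple way to picture weighting is as a proportional share of limited capacity. The sketch below is an assumption-laden illustration: the function `share_capacity`, the resource names and the capacity figure are invented, and a real server is free to schedule work differently.

```python
def share_capacity(weights: dict[str, int], capacity: int) -> dict[str, float]:
    """Split capacity between requests in proportion to their weights."""
    total = sum(weights.values())
    return {name: capacity * weight / total for name, weight in weights.items()}


# Hypothetical weights: the page itself matters most, then the style sheet.
weights = {"index.html": 256, "style.css": 128, "hero.jpg": 128}

shares = share_capacity(weights, capacity=1000)

# Requests ordered from highest to lowest weight.
order = sorted(weights, key=weights.get, reverse=True)
```

With these example numbers the HTML document receives half of the capacity and the other two resources a quarter each.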
- Flow Control
Flow control is a mechanism that regulates the flow of data from the sender to the receiver, preventing the sender from overwhelming the receiver with data it may not want or be able to handle. HTTP/2 gives a more detailed level of control to the client and server, allowing them to implement their own flow control.
Much like priority, flow control allows for more critical elements of a webpage to be loaded earlier, ensuring the user sees the content they care most about first.
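HTTP/2 flow control is credit based: the receiver advertises a window, the sender may only send that many bytes, and the receiver tops the window up with WINDOW_UPDATE frames when it is ready for more. The class below is a minimal sketch of that bookkeeping from the sender's point of view. The name `FlowControlWindow` and its methods are hypothetical, and the small window in the usage lines is chosen only to keep the numbers readable (the protocol's default initial window is 65,535 bytes).

```python
class FlowControlWindow:
    """Tracks how many bytes a sender is currently allowed to send."""

    def __init__(self, initial: int = 65_535) -> None:
        self.available = initial

    def consume(self, wanted: int) -> int:
        """Return how many of the wanted bytes may be sent right now."""
        allowed = min(wanted, self.available)
        self.available -= allowed
        return allowed

    def window_update(self, increment: int) -> None:
        """Apply credit granted by the receiver (as in a WINDOW_UPDATE frame)."""
        if increment <= 0:
            raise ValueError("increment must be positive")
        self.available += increment


window = FlowControlWindow(initial=10)

sent_first = window.consume(16)    # only 10 bytes fit in the window
sent_blocked = window.consume(6)   # window is exhausted, nothing may be sent
window.window_update(8)            # receiver grants 8 more bytes of credit
sent_rest = window.consume(6)      # the remaining 6 bytes can now go out
```

The sender never pushes more than the receiver has said it can take, which is the whole point of the mechanism.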
- An Unsolicited Push
Prior to HTTP/2 a server could only respond to a specific request, so for example the client would have to make 3 separate requests for the HTML, CSS, and an image, and the server would respond to each of these requests in turn. HTTP/2 allows a server to pre-emptively send a response to the client alongside the response of the client-initiated request. If the server knows the browser will require multiple things to render the page it can send it all together without having to wait for the browser to make additional requests for the information.
There are various ‘real world’ metaphors that highlight this difference; the clearest perhaps is a diner (client) and waiter (server). In HTTP/1.1 the diner would have to order their dish and wait for it to arrive before asking a different waiter for cutlery. With HTTP/2 (and Push) the diner orders a dish and the waiter, predicting that they will also need some cutlery, brings both the dish and cutlery to the table together. Who doesn’t like cutlery!
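In code, the waiter's prediction could be modelled as a lookup table of resources the server expects each page to need. This is only a sketch of the decision logic under assumed names: `PUSH_MANIFEST`, `plan_responses` and the file paths are hypothetical, and no real server API is being shown.

```python
# Hypothetical table of extra resources the server expects each page to need.
PUSH_MANIFEST: dict[str, list[str]] = {
    "/index.html": ["/style.css", "/logo.png"],
}


def plan_responses(requested_path: str) -> list[tuple[str, str]]:
    """List what the server sends: the requested resource, then any pushes."""
    responses = [("response", requested_path)]
    for extra in PUSH_MANIFEST.get(requested_path, []):
        responses.append(("push", extra))  # sent without waiting to be asked
    return responses


page_plan = plan_responses("/index.html")
css_plan = plan_responses("/style.css")
```

A request for the page yields the page plus two pushed resources, while a request for the style sheet on its own yields just that one response.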
Usage and Browser Compatibility
HTTP/2 is widely supported: as you can see from the diagram below, at the time of writing the majority of browsers support it. According to W3Techs, as of 16th August 2019 HTTP/2 is used by 40.1% of all websites.
End users care about speed, reliability and security (or at least they should), and while HTTP/2 made some improvements on each of these, there are a number of criticisms levelled at the new protocol. It seems many people were hoping for a radical reinvention, and what we received instead was an update.
The main criticisms all relate to security. Firstly, Header Field Compression using HPACK mitigates the risk of compression attacks but does not completely prevent them. Secondly, many people were hoping that the new protocol would replace cookies entirely, as they pose security risks and privacy concerns; instead HTTP/2 made no changes. Lastly, and perhaps the greatest criticism, relates to encryption. The working group did not reach consensus over mandatory encryption. However, according to the IETF HTTP FAQ, “no browser supports HTTP/2 unencrypted”.
Users have no ability to choose HTTP versions and so don’t really need to be aware of it. However, its use behind the scenes improves exactly the things users do care about: speed, data usage, responsiveness and security. To provide these benefits to their users, developers need to actively support HTTP/2, so we most certainly do need to care about it.