From Static Web Pages to Real Time Apps

A quick overview of HTTP’s request-response protocol compared with WebSockets’ event-driven protocol.

The Internet started as a network for sending files from one computer to another. As the Internet grew, web pages went from static to dynamic. To create a dynamic web site, developers needed a way to get updated information from servers. In the early years, people came up with solutions such as polling to achieve this. But as we will discuss, polling is very inefficient; it was a product of the limitations of the technology at the time.

WebSockets were later standardized in 2011 as a solution to the need for two-way communication between a client and a server.

Quick History

Before the introduction of WebSockets, developers used polling to update their websites with new data. Polling is the process of repeatedly asking the server for new information. Two types of polling were used: short polling and long polling.

A client using short polling sends a request to the server at set intervals, for example every second. The server sends back a response to every request, even if there is no new data for the client. It’s akin to a child repeatedly asking a parent whether they have arrived at their destination, only to receive the same answer, “No,” until they finally do arrive. This technique is taxing on both the client and the server because, under HTTP/1.0, the client and server must reestablish the Transmission Control Protocol (TCP) connection for every request. The image below illustrates the process required for a client to retrieve some data from a server. There is a lot of back and forth, only for the response to be thrown away if there is no new data.

HTTP request example. A TCP connection must be established before the HTTP request and response can be exchanged. Source: https://blog.cloudflare.com/http3-the-past-present-and-future/
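
The short polling loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: FakeServer is a hypothetical in-memory stand-in for a real HTTP endpoint, and the names ready_at_tick and max_requests are invented for the example. It shows how many requests are answered with nothing new before the data finally arrives.

```python
class FakeServer:
    """Hypothetical stand-in for an HTTP endpoint (not a real server)."""

    def __init__(self, ready_at_tick):
        self.ready_at_tick = ready_at_tick  # request number at which data exists
        self.tick = 0

    def handle_request(self):
        # Every request gets a response, even when there is nothing new.
        self.tick += 1
        if self.tick >= self.ready_at_tick:
            return "new data"
        return None  # an "empty" response


def short_poll(server, max_requests):
    """Ask repeatedly; return (data, number of requests made)."""
    for attempt in range(1, max_requests + 1):
        data = server.handle_request()
        if data is not None:
            return data, attempt
        # A real client would sleep for the polling interval here.
    return None, max_requests


data, requests_made = short_poll(FakeServer(ready_at_tick=5), max_requests=10)
wasted_requests = requests_made - 1
print(data, requests_made, wasted_requests)  # new data 5 4
```

In this run, four of the five round trips carried no useful data, which is the waste the analogy of the impatient child describes.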

Long polling is similar to short polling, except that the server does not send back a response right away. Instead, the server sends back a header and then sends the data only when it becomes available. Once the client receives the response from the server, it immediately sends another request. Using the same analogy, the child asks the parent whether they have arrived at their destination, but this time the parent doesn’t reply right away. The child only asks again once the parent has answered the question. While this is not as taxing as short polling, there is still a lot of overhead in setting up the TCP connection, sending the header data with every request and response, and maintaining the connection while the server is waiting to send the response.
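
The long polling behavior described above can be sketched as follows. This is a simplified model, not a real HTTP exchange: a queue stands in for the held-open request, a timer thread plays the hypothetical server that produces data after a short delay, and the function name long_poll and the timeout values are assumptions made for the example.

```python
import queue
import threading


def long_poll(events, timeout):
    """Block until the 'server' has an event, or return None on timeout."""
    try:
        return events.get(timeout=timeout)
    except queue.Empty:
        return None  # the held request expired; the client would ask again


events = queue.Queue()

# Hypothetical server side: new data becomes available after 0.05 seconds.
timer = threading.Timer(0.05, events.put, args=("new data",))
timer.start()

first = long_poll(events, timeout=1.0)   # held open until the data arrives
second = long_poll(events, timeout=0.1)  # nothing new, so this one times out
timer.join()
print(first, second)  # new data None
```

The client makes far fewer requests than with short polling, but each one still ties up a connection while the server waits.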

HTTP/1.1 made “keep alive” (persistent) connections part of the protocol and the default behavior. Unless one side sends “Connection: close” in the HTTP header, the client and server know to keep the current TCP connection open and reuse it. Without having to reestablish the TCP connection for each request, there is significantly less overhead. It may seem like we have solved our efficiency issue, but having to request data even when it is not ready is still very inefficient. There’s also the cost of creating, sending, and reading the header information with every request and response.
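
To make the header cost concrete, here is a small sketch that builds one minimal HTTP/1.1 request and response by hand and compares the header bytes with the payload. The host example.com, the path /updates, and the JSON body are hypothetical, and real browsers usually send many more headers (cookies, user agent, and so on), so this understates the overhead.

```python
# A minimal polling request; host and path are made up for illustration.
request = (
    "GET /updates HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: keep-alive\r\n"
    "Accept: application/json\r\n"
    "\r\n"
).encode("ascii")

# The "nothing new" answer the server sends back on most polls.
body = b'{"new": false}'
response_head = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    f"Content-Length: {len(body)}\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
).encode("ascii")

header_bytes = len(request) + len(response_head)
overhead_ratio = header_bytes / len(body)
print(f"{header_bytes} header bytes for {len(body)} payload bytes "
      f"({overhead_ratio:.1f}x)")
```

Even with the TCP connection kept open, every poll repeats these headers in both directions.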

This is where event-driven protocols like WebSockets and server-sent events (SSE) come into play.

WebSockets

In 2011, the WebSocket protocol was standardized. This protocol allows full-duplex communication, which is the ability for both the server and the client to send data to each other at the same time. HTTP, by comparison, is half-duplex: only one party can send data to the other at any given time.

The opening and closing of a WebSocket connection between a client and server. Source: https://www.pubnub.com/blog/websockets-vs-rest-api-understanding-the-difference/

A WebSocket connection has a slightly longer initial setup time because it needs to create a TCP connection with the server under HTTP and then “upgrade” that connection to use the WebSocket protocol. However, once the connection is open, it remains open until one party closes it, and in the meantime both parties can send data. The client doesn’t have to send a request and wait for a response, and no HTTP header information has to be sent with every message; each message carries only a few bytes of framing. This eliminates the inefficiencies of using HTTP and allows for responsive real-time applications.
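
One concrete piece of the “upgrade” step can be sketched in code. During the handshake the client sends a random Sec-WebSocket-Key header, and the server proves it understands the WebSocket protocol by replying with a Sec-WebSocket-Accept value derived from that key. The function name accept_key is invented for this sketch; the fixed GUID and the sample key and result come from the WebSocket specification (RFC 6455).

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    combined = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    digest = hashlib.sha1(combined).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

In practice a WebSocket library or the browser performs this exchange; the sketch is only meant to show why the connection costs one extra HTTP round trip before any messages flow.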

The benefits of using WebSockets can be seen in the graph below. The slower startup time of WebSockets makes HTTP better for a small number of requests, while WebSockets are much better for applications requiring a large number of requests. For a more detailed breakdown of the performance of HTTP and WebSockets, see the article by David Luecke.

Benchmark results for 100 concurrent connections and 1, 10, and 50 requests each. Source: https://blog.feathersjs.com/http-vs-websockets-a-performance-comparison-da2533f13a77

Server-sent events are another technology similar to WebSockets. They allow for a persistent connection to the server, but once the connection is established only the server sends data to the client. The communication is one-directional.

Applications like an RSS feed or a stock ticker can use server-sent events to push data to subscribed clients. However, server-sent events are more limiting than WebSockets. Changes to the application may require us to switch from server-sent events to WebSockets, so it is usually better to start with WebSockets for future extensibility. For a more detailed comparison of the two, refer to this article by Ably.
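
To give a feel for what the server pushes over a server-sent events connection, here is a simplified parser for the text format of the event stream: each event is a group of “field: value” lines ended by a blank line. This is a sketch that handles only the “data” and “event” fields and comment lines; the function name parse_sse and the sample ticker text are assumptions, and a browser’s built-in EventSource does this work for you.

```python
def parse_sse(stream_text):
    """Parse a chunk of event-stream text into a list of events (simplified)."""
    events = []
    data_lines = []
    event_type = "message"  # default type when no "event:" field is given
    for line in stream_text.split("\n"):
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                events.append({"event": event_type,
                               "data": "\n".join(data_lines)})
            data_lines = []
            event_type = "message"
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive ping
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # drop the single optional leading space
            if field == "data":
                data_lines.append(value)
            elif field == "event":
                event_type = value
    return events


# Hypothetical stream: one named event and one two-line default event.
sample = "event: price\ndata: ACME 12.50\n\n: ping\ndata: hello\ndata: world\n\n"
print(parse_sse(sample))
```

Notice that the format has no way for the client to send anything back on the same connection, which is the limitation discussed above.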

The WebSocket API itself does not provide many features. Features such as automatic recovery, fallback to polling, or load balancing must be implemented by the developer. Luckily, because WebSockets are so popular, many libraries are available. The most popular JavaScript library is Socket.io. A comprehensive list can be found here.
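
As an example of what “automatic recovery” involves, a client that loses its connection typically waits a little longer before each reconnection attempt so that it does not flood the server. The sketch below shows one common approach, exponential backoff with a cap. The function name and the default values are assumptions for illustration, not the behavior of Socket.io or any particular library.

```python
def reconnect_delays(base=1.0, factor=2.0, cap=30.0, attempts=6):
    """Return the wait time in seconds before each reconnection attempt."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))  # never wait longer than the cap
        delay *= factor                 # back off further after each failure
    return delays


print(reconnect_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Real implementations usually also add some randomness to the delays so that many clients do not all reconnect at the same moment.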

There are also paid services that provide you with an API to create your real-time application. The main advantage of using a paid service is that it is hosted on the service provider’s servers. The provider takes care of scaling when there are many clients, and clients can connect to the server closest to their location for faster and more reliable connections. Three examples of paid services are below.

Why Not Use WebSockets For Everything?

While we can use WebSockets for all web applications, there are some points to consider. Using WebSockets is not free. There is more up-front development cost to set up an application to use WebSockets, and keeping connections open uses more resources on the server. Many times a web page does not need constant updates from a server. As we saw above, for a small number of requests it is faster to use HTTP to serve the client.

Consider WebSockets when:

  • The application needs two-way communication between the client and server
  • Real-time updates are required
  • Updates from one client should be reflected on another client

HTTP is usually enough when:

  • The application is just serving static pages
  • Real-time updating of information is not critical

Resources