Real-time Communication

How do application developers display data in real or near-real time? Recently, my classmates and I completed an assignment for our Cloud Computing and Big Data course in which we were tasked with filtering the Twitter Streaming API for certain keywords and visualizing the statuses on a map in real time (this was meant to be an exercise in deployment on AWS Elastic Beanstalk). This video shows a demo of the app I built.

The assignment was a fun way to learn about various client-server communication mechanisms; I had previously heard of a few technologies that enable real-time messaging, but didn’t understand their internals. In this post, I’ll share an overview of Server-Sent Events (SSE) and WebSockets.

To motivate the use of SSE and WebSockets, let’s examine methods that have previously been used to update data on the client:

  • short polling: The client sends requests at regular intervals. Each time the server responds with no new data, the exchange is pure request/response overhead. That HTTP overhead makes short polling a poor fit for low-latency applications (i.e. any application with a real-time component).
  • long polling: The client sends an AJAX request and its connection with the server stays open for a while; if the server has the requested data ready during this period, it sends the data in its response to the client. The client’s AJAX callback then repeats this process. This tactic decreases network traffic but still requires the client to initiate each request for the data.
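
To make the long-polling loop concrete, here is a minimal sketch. The names `longPoll`, `requestUpdate`, `onData` and `shouldContinue` are my own placeholders rather than part of any library, and `requestUpdate` is assumed to return a Promise that resolves to new data, or to `null` when the server held the connection open and had nothing to report.

```javascript
// Long-polling sketch: issue a request, wait for the server to answer,
// hand any data to the caller, then immediately ask again.
// All names here are hypothetical; requestUpdate stands in for an
// AJAX call whose response the server may delay until data is ready.
async function longPoll(requestUpdate, onData, shouldContinue, retryDelayMs = 1000) {
  while (shouldContinue()) {
    try {
      const data = await requestUpdate();
      if (data !== null) {
        onData(data);
      }
    } catch (err) {
      // On a failed request, wait briefly before asking again
      // so a server outage does not turn into a tight retry loop.
      await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
    }
  }
}
```

Note that the client is still the one starting every exchange, which is the limitation that SSE and WebSocket remove.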

Fortunately, two technologies, SSE and WebSockets, have evolved that don’t rely on the client to continuously initiate requests for additional data. As a result, they achieve lower latency, which is the defining requirement of real-time applications.

Server-Sent Events

The model in which client-side events are dispatched to the server (e.g. a user clicks on a link to request a new page from the server) can be referred to as “client-sent events”. Server-sent events are a communication standard that allows servers to send information to the client as it becomes available, without client polling. Typical use cases include applications in which the client listens for messages but does not need to push any: friends’ status updates, stock tickers, news feeds.

SSE is implemented over a single, persistent HTTP connection. If that connection is closed, the browser automatically reconnects, and the server can define the reconnection timeout the browser waits before doing so. To use SSE, an application uses the EventSource API in the browser and pushes data from the server in SSE’s event stream data format.

Client-side Implementation

Server-sent events are an HTML5 specification: an API for receiving push notifications from a server is defined via the EventSource JavaScript object. Using this API client-side consists of creating an EventSource object and registering an event listener.

// The EventSource constructor's argument
// is the URL of the script sending updates.
// If the URL's origin (scheme, domain, port) differs
// from that of the calling page, the server must
// allow the request with CORS headers.
var source = new EventSource('stream.php');
// registers an event handler that fires
// on messages without an explicit type
source.onmessage = function (event) {
  alert(event.data);
};

A server can push different event types and a client receiving messages from that stream can register a different handler for each of the types:

var source = new EventSource("stream.php");
// subscribe to events of type "foo"
source.addEventListener("foo", fooHandler, false);
// subscribe to events of type "bar"
source.addEventListener("bar", barHandler, false);

Event streams are always decoded as UTF-8.

List of browsers that support SSE: http://caniuse.com/#feat=eventsource
There are polyfill libraries that fall back to polling, long polling or XHR streaming when the browser does not implement the EventSource interface.

Server-side Implementation

Sending an event stream from the server is a matter of constructing a plaintext response, served with a text/event-stream Content-Type, that follows the SSE format. You can read about the SSE message format and how to use it to send JSON here.
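
As a rough illustration of that format, the sketch below builds one event as text and sets the response headers. It assumes a Node.js server; `formatSseEvent` and `openEventStream` are names I made up for this example, and `res` is assumed to be a Node `http.ServerResponse`. The field names (`event`, `data`, `id`, `retry`) are the ones the event stream format defines.

```javascript
// Formats one event in the text/event-stream wire format.
// Each field goes on its own line and a blank line ends the event.
// Non-string data is serialized as JSON (an assumption of this sketch).
function formatSseEvent({ event, data = '', id, retry } = {}) {
  const lines = [];
  if (retry !== undefined) lines.push(`retry: ${retry}`); // reconnection timeout in ms
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`); // omitted: client's onmessage fires
  const text = typeof data === 'string' ? data : JSON.stringify(data);
  // A payload containing newlines must be split into several data lines.
  for (const line of text.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}

// Writes the headers that mark the response body as an event stream.
// `res` is assumed to be a Node.js http.ServerResponse.
function openEventStream(res) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
}
```

Inside a request handler, calling `openEventStream(res)` once and then `res.write(formatSseEvent({ event: 'foo', data: { text: 'hi' } }))` for each update should reach a listener registered for events of type “foo” on the client.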

WebSocket

SSE does not allow the client to push messages to the server, but WebSocket does. WebSocket enables bidirectional, real-time messaging, which makes it well-suited for applications like chat and multi-player games.

WebSocket consists of:

  • a protocol (defined by the Internet Engineering Task Force) to enable asynchronous two-way communication between a client and a remote server without opening multiple HTTP connections (as is done with long polling). It accomplishes this with a single, long-lived TCP connection.
  • an API defined in W3C’s HTML5 specification. It defines the WebSocket object, which has certain attributes (including `readyState`) and methods (`send`, `close`).

Both the web browser and web server must implement the WebSocket interface in order to establish and maintain this long-lived connection.

Client-side Implementation

Clients don’t need to use a framework to access the WebSocket interface because web browsers that implement the WebSocket API expose all client-side functionality through the WebSocket object, which can be accessed by the client in plain JavaScript. Developers who don’t use a WebSocket framework (such as the atmosphere.js jQuery plugin) should write code to fall back to HTTP if the browser doesn’t support WebSocket and to handle differences in browser implementations.
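
A framework-free client might start with checks like the ones below. This is only a sketch: `supportsWebSocket` and `sendIfOpen` are hypothetical helper names, `env` stands for `window` in a browser, and the only WebSocket members used are the `readyState` attribute and the `send` method.

```javascript
// Returns true when the environment exposes a WebSocket constructor.
// `env` would normally be `window` in a browser.
function supportsWebSocket(env) {
  return typeof env.WebSocket === 'function';
}

// Sends a message only if the socket is open.
// A readyState of 1 means OPEN; 0 is CONNECTING, 2 CLOSING, 3 CLOSED.
// Returns true if the message was handed to the socket.
function sendIfOpen(socket, message) {
  if (socket.readyState !== 1) {
    return false;
  }
  socket.send(message);
  return true;
}
```

When `supportsWebSocket(window)` is false, the application would take its HTTP fallback path (for example long polling) instead of constructing a WebSocket.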

List of web browsers that implement the WebSocket API: http://caniuse.com/#feat=websockets

Server-side Implementation

An important consideration in a server-side implementation is that a WebSocket connection is long-lived, unlike a typical HTTP connection. Multithreaded/multiprocess servers are designed to open a connection, handle a request as quickly as possible, and then close the connection, so holding many WebSocket connections open on such a server can tie up a thread or process per connection. An asynchronous (event-driven) server is generally a better fit for a server-side WebSocket implementation.

The long-lived connection starts with a handshake that upgrades the connection from HTTP to the WebSocket protocol. The client must initiate the WebSocket handshake process by sending an HTTP GET request with HTTP version 1.1 or greater and the following headers, including a WebSocket key of random bytes:

Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: foo
Sec-WebSocket-Version: bar
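
The `foo` and `bar` values above are placeholders. As a sketch of what a client could actually send, the snippet below follows RFC 6455, which asks for a key made of 16 random bytes, base64-encoded, and for version 13. It assumes Node.js, and `makeHandshakeHeaders` is a name I chose for illustration.

```javascript
const crypto = require('node:crypto');

// Builds the WebSocket-specific headers for the opening handshake.
// Per RFC 6455 the key is 16 random bytes, base64-encoded,
// and the protocol version is 13.
function makeHandshakeHeaders() {
  return {
    Upgrade: 'websocket',
    Connection: 'Upgrade',
    'Sec-WebSocket-Key': crypto.randomBytes(16).toString('base64'),
    'Sec-WebSocket-Version': '13',
  };
}
```

Browsers generate these headers themselves; code like this is only needed when writing a non-browser client.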

The server should listen for incoming socket connections using a standard TCP socket. It can listen to any port but if it chooses a port other than 80 or 443, it may have problems with firewalls and/or proxies. Connections on port 443 require a secure connection (TLS/SSL). The server should respond to the connection request with an HTTP response that looks like:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: baz

The Sec-WebSocket-Accept value is generated from the Sec-WebSocket-Key that the client sent by concatenating the client’s Sec-WebSocket-Key and “258EAFA5-E914-47DA-95CA-C5AB0DC85B11” together, taking the SHA-1 hash of the result, and returning the base64 encoding of the hash. After the client receives these headers, the handshake is complete, the connection has switched to the WebSocket protocol, and either side can start pushing data.
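
The computation is short enough to sketch in a few lines. This assumes Node.js and its built-in crypto module; `computeAcceptValue` and `WEBSOCKET_GUID` are names picked for this example.

```javascript
const crypto = require('node:crypto');

// Fixed GUID that the WebSocket protocol appends to the client's key.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derives the Sec-WebSocket-Accept value:
// base64( SHA-1( clientKey + GUID ) )
function computeAcceptValue(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}
```

With the sample key from RFC 6455, `computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ==')` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`, which matches the value given in the RFC.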

Although WebSocket is layered over TCP, it is not a stream-based protocol. Both clients and servers have to send each chunk of data in a “frame” or across several frames. This article and this one describe how the server should format the data in a frame and extract the payload from the client’s frames.
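
To give a feel for framing, here is a deliberately limited sketch, assuming Node.js Buffers. It only handles single frames with payloads of up to 125 bytes, skipping the extended length fields, fragmentation and control frames that a real implementation has to support. `encodeTextFrame` and `decodeSmallFrame` are hypothetical names.

```javascript
// Builds an unmasked text frame, as a server would send it.
// Byte 0: FIN bit (0x80) plus opcode 0x1 for text. Byte 1: payload length.
function encodeTextFrame(text) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) {
    throw new RangeError('extended payload lengths are not handled in this sketch');
  }
  return Buffer.concat([Buffer.from([0x81, payload.length]), payload]);
}

// Reads one small frame. Frames sent by a client are masked:
// a 4-byte masking key follows the length, and every payload byte
// is XORed with the key byte at position (index mod 4).
function decodeSmallFrame(frame) {
  const fin = (frame[0] & 0x80) !== 0;
  const opcode = frame[0] & 0x0f;
  const masked = (frame[1] & 0x80) !== 0;
  const length = frame[1] & 0x7f;
  if (length > 125) {
    throw new RangeError('extended payload lengths are not handled in this sketch');
  }
  let offset = 2;
  let mask = null;
  if (masked) {
    mask = frame.subarray(offset, offset + 4);
    offset += 4;
  }
  // Copy the payload so unmasking does not modify the caller's buffer.
  const payload = Buffer.from(frame.subarray(offset, offset + length));
  if (mask) {
    for (let i = 0; i < payload.length; i++) {
      payload[i] ^= mask[i % 4];
    }
  }
  return { fin, opcode, payload };
}
```

For example, `encodeTextFrame('Hello')` produces the bytes `81 05 48 65 6c 6c 6f`, the unmasked “Hello” frame shown in RFC 6455.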

The close opcode 0x8 is used to terminate the connection: the client and server exchange close frames, and the browser then fires a CloseEvent on the client’s WebSocket object, which the client can handle with its `onclose` handler.
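
On the wire, a server’s close frame can be as small as four bytes. The sketch below assumes Node.js Buffers and an unmasked frame carrying only a two-byte status code (1000 means a normal closure); `encodeCloseFrame` and `isCloseFrame` are illustrative names.

```javascript
// Builds an unmasked close frame carrying a 2-byte status code.
// Byte 0: FIN bit (0x80) plus the close opcode 0x8.
// Byte 1: payload length (2). Bytes 2-3: status code, big-endian.
function encodeCloseFrame(code = 1000) {
  const frame = Buffer.alloc(4);
  frame[0] = 0x88;
  frame[1] = 2;
  frame.writeUInt16BE(code, 2);
  return frame;
}

// Checks whether a received frame is a close frame
// by looking at the opcode in the low 4 bits of byte 0.
function isCloseFrame(frame) {
  return (frame[0] & 0x0f) === 0x8;
}
```

A server that receives a frame for which `isCloseFrame` is true would reply with its own close frame and then close the underlying TCP socket.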

Some server-side implementations:

  • Atmosphere Framework ships with the jQuery plugin and several server components supporting all major Java-based WebServers, allowing a developer to write applications using WebSockets in Groovy, Scala and Java
  • Autobahn|Python: open source Python implementation of the WebSocket protocol, running on Twisted
  • ws: WebSocket library for node.js that implements websocket client and server
  • socket.io provides 1) a server-side solution (Node.js module) and 2) a cross-browser wrapper to transparently use the best transport for client-server communication that’s available from your browser, which can be the HTML WebSocket API or a fallback transport (Flash Socket, AJAX long-polling, AJAX multipart streaming, IFrame, JSONP polling). This socket.io component is built on top of ws, which implements the WebSocket transport option.

This article compares several of the WebSocket implementations that have been developed for the node.js server environment.

My Application

I decided to develop my application with WebSocket using node.js, socket.io and Express.js. WebSocket may have been overkill because SSE would have sufficed for handling real-time data in the browser; however, the hype around socket.io/node.js made me curious to learn about them and I thought using these technologies for the assignment would kill two birds with one stone. I did not necessarily use the right tool for the job but I did satiate my curiosity about a framework and tried a new (JS) server-side environment. The code is here if you’re curious: https://github.com/six5532one/realtime-tweets-map
