WebSockets, HTTP/2, and SSE
A Story of Love, Loss, and Resurrection


My interest in all things technical security has fuelled my career for years. I love the work, despite the many frustrations that can come along with it. The greatest challenge I face daily is that technology advances faster than any standards track could possibly hope to keep up with. I write about my battles as a cathartic outlet, with a two-pronged purpose: I can share my experiences so others can avoid running headfirst into brick walls, and I can refrain from hitting my own head against one. Today’s battle story covers the adventures of WebSockets (and friends).
The tale of WebSockets
The year was 2011, well into the life of HTTP/1.1. It was four years before HTTP/2 would become a standard, and the world was demanding ever more of the communication between web applications and backend services. HTTP/1.1 had been the reliable workhorse everyone loved, but some of its limitations were starting to become apparent.
HTTP/1.1 is fundamentally a stateless request-response protocol. For every piece of data that needs to be transmitted, the client has to initiate a connection to the server, introduce itself, authenticate if needed, and request the resource. The server returns the response, and then immediately forgets the encounter ever occurred. This is not terrible if you’re loading web pages, but if you’re waiting for the server to finish a long-running action, expecting a stream of updates, or sending rapid repeated requests, this one-shot request pattern becomes extremely frustrating. It limits speed, adds unneeded data overhead, and constrains the kinds of patterns you can even consider when designing your applications.
No one was willing to go after the root cause of our misery, yet something had to be done — an extension was needed.
Enter everyone’s favourite RFC 6455: The WebSocket Protocol. Neatly sidestepping the issue of HTTP being half-duplex, WebSockets hijack an established HTTP connection, then ignore the rest of the standard and set their own rules for the remainder of the session. A bi-directional pipe between client and server, properly standardized and widely supported, WebSockets saw enormous adoption and grew up alongside the webapp explosion of today’s world. WebSockets made receiving streaming updates from the server a breeze, and since the connection was always there idling for you, repeated requests carried none of the setup overhead of a full HTTP request.
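The hijack begins as an ordinary HTTP/1.1 Upgrade handshake: the client sends a random Sec-WebSocket-Key header, and the server proves it actually speaks WebSocket by appending a fixed GUID, hashing, and echoing the result back as Sec-WebSocket-Accept. A minimal sketch of the server’s side of that computation (the function name is mine; the key and expected result are the worked example from RFC 6455):

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// acceptKey computes the Sec-WebSocket-Accept value for a client's
// Sec-WebSocket-Key, per RFC 6455 section 4.2.2: append the fixed
// GUID, SHA-1 hash the result, and base64-encode the digest.
func acceptKey(clientKey string) string {
	const magicGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
	h := sha1.Sum([]byte(clientKey + magicGUID))
	return base64.StdEncoding.EncodeToString(h[:])
}

func main() {
	// Example key from RFC 6455 section 1.3.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
	// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
}
```

The fixed GUID exists so that a server (or cache) that blindly echoes headers without understanding WebSockets can never accidentally produce a valid handshake.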
Fast-forward four years: even with all the improvements the new tool had brought, we could no longer stand it. WebSockets covered a use case for pages that were already loaded, but moving your average content over the web was still a royal pain. We were all leaning on sprite sheets, domain sharding, and other “best practices”, all of which were thin veils over the deep madness that was the battle against networks. We finally snapped, and sat down to draft HTTP/2.
A shining beacon in the dark world of networks, HTTP/2 promised to finally fix several of the problems that had beaten us into submission for years. It is 2018, adoption is ubiquitous in browsers, and we are working to shift our applications to support and optimize for HTTP/2 everywhere.
In Go, this is quite straightforward since HTTP/2 has been supported transparently (and by default) since the 1.6 release. However, our protagonist is notably absent: the WebSocket.
At the time of writing, all our favourite WebSocket libraries break when facing an HTTP/2 connection. Most Go WebSocket libraries depend on the http.Hijacker interface, and Go’s HTTP/2 server does not implement it. More generally, a WebSocket expects to take complete control of the TCP connection after stealing it from HTTP, while HTTP/2 expects to keep multiplexing streams over its very precious TCP connection. The two are currently mutually exclusive.
Side note: there was a draft proposal to support WebSockets over HTTP/2, but it received very little love from the IETF and, to my knowledge, the endeavour has not been pursued.
With much dismay, it appears we’ve lost our WebSockets. This can’t possibly stand — surely it is a regression in the very core of the web!
Let’s take a closer look at the situation. It’s become instinct to reach for a WebSocket as a solution — so much so that the original problems are forgotten, as is the understanding that the world is a little different now. Do our old problem assumptions still hold up?
Part of the reason WebSockets were so useful is that sending each thing as its own request to the server was expensive. Once all the auth headers were included, the data size was non-negligible, as was the connection overhead. Today, the connection for a request is already established, and thanks to header compression and de-duplication, much less data is sent with each basic request.
WebSockets allowed us to hold a persistent connection open to the server, but now so does HTTP/2. However, using WebSockets allowed the server to push data over established connections whenever it needed, rather than wait for a polling request to respond to — this was especially important.
(It might seem tempting to say this can be solved with Server Push, which is part of the HTTP/2 spec, but push is about loading entire resources, not bits of application data.) This is the kicker, and to solve it we need to call up an older, less loved friend.
Along come Server-Sent Events (SSE)
Server-sent events are actually part of the HTML standard, not HTTP. They define functionality that is entirely invisible to the HTTP layer, and they don’t fight or disrupt any of the lower layers. Unfortunately, this comes at the cost of less pervasive browser support. Every browser anyone takes seriously supports SSE, and the one straggler has had the feature request under review since 2014. Sources tell me that a polyfill exists and works quite well to get around this.
At its core, SSE is just a Content-Type header that alerts the browser that this response will be delivered in pieces. It also tells the browser to expose each piece to the code as it arrives rather than wait for the full response, much like WebSocket frames. This is provided through the very easy-to-use EventSource interface in client-side code:
var source = new EventSource('updates/live');
source.addEventListener('message', callback, false);
Server side, the SSE protocol is entirely text-based, so it’s easy to write your own server implementation. I can say this confidently because I’m speaking from experience. The library is available on GitHub.
The server-side code feels very much like some WebSocket libraries, and behaves quite intuitively. Setup is trivially simple.
func main() {
    stream := eventsource.NewStream()
    go func(s *eventsource.Stream) {
        for {
            time.Sleep(time.Second)
            s.Broadcast(eventsource.DataEvent("tick"))
        }
    }(stream)
    http.ListenAndServe(":8080", stream)
}
Any client connecting to this server using an EventSource object gets a stream of events. In this case, every second, it receives an event of the default type “message” with the data set to “tick”.
That’s it. Seems simple doesn’t it? Too simple…
There’s a catch: it’s a one-way street. After the initial request, the server is the only one that can push more data down the pipe. However, with requests in HTTP/2 being much cheaper, we didn’t have much trouble migrating the few things that were truly bi-directional into a couple of pub-sub channels instead. We found that in most cases we were using WebSockets for pub-sub situations anyway. For this use case, SSE shines as the winner.
A PoC joins SSE
In trying to suss out more limitations or hidden gotchas, I decided we should make a quick little chat service. The resulting app was a browser-based chat room in less than 250 lines of server code, routing and parsing included, and less than a hundred on the front end.
The server accepted new chat messages on a very simple REST endpoint. The code to broadcast it to connected clients boils down to:
// Get the message from the request however you see fit
msg := parseMessage(req.Body)
// Create the message event
ev := &eventsource.Event{}
ev.Type("newmsg")
ev.ID(fmt.Sprintf("%d", msg.ID))
ev.Data(msg.Text)
// Send it to all clients
c.stream.Publish(chatID, ev)
// return success
w.WriteHeader(http.StatusNoContent)
On the frontend, the JavaScript was equally easy:
msgSource = new EventSource(`${config.api}/chat/live`);
// Attach listeners
msgSource.addEventListener('newmsg', processMessage);
// Render new messages as they arrive
function processMessage(e) {
  let id = parseInt(e.lastEventId, 10);
  if (id > lastMsg) {
    console.log(`New message: ${e.lastEventId}`);
    renderMsg(e);
    lastMsg = id;
  }
}
JS is not my first language, but even I could cobble that together and have a functional prototype I wasn’t outright ashamed of, in less than half an hour of work. If nothing else, SSE has proved itself extremely easy to use.
To be continued…
This wild ride is far from over. Time will tell if the last of the browsers will implement SSE, or if WebSockets will gain support directly in HTTP/2. Perhaps something new will rise to fill a gap that I don’t even know about yet. We will need to look at the situation with fresh eyes — a challenge at best. Blindly trying to jam old tech into a new problem will only lead to less than ideal solutions. On the flip side, abandoning widely adopted standards is perhaps scarier still.
I would encourage every team to explore SSE. Perhaps it doesn’t completely cover your use case, or maybe it’s the missing piece of your HTTP/2 compliant stack you’ve been waiting for. Either way, it’s fun and lightweight, and as we start seriously considering parting ways with our dear old friend WebSockets, it’s time to start calling up older friends to see what they have to offer.
Andrew spends his time hand-routing artisanal packets through obscure networks you’ve never heard of for Axiom Zen.