HTTPS is hurting users far away from your servers, and what to do about it
Lately, we’ve been told HTTPS is finally faster than HTTP, mostly thanks to HTTP/2's multiplexing. In this post, I want to highlight an issue we rarely think about with HTTPS: the slowdown it imposes on users far away from your servers. Since we’re usually on the same continent as our servers, we rarely get to notice it.
To perform an HTTPS handshake, the user needs to get the server’s key before sending their request. Compared to sending the request right away and receiving the response directly, this adds an extra roundtrip on each new connection. Once the first request is done, the connection is kept open for some time, until the user goes inactive.
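You can measure this extra cost yourself. Here is a minimal Python sketch (assuming a reachable host; example.com is a placeholder) that times the plain TCP handshake, then the TLS handshake layered on top of it:

```python
import socket
import ssl
import time

HOST, PORT = "example.com", 443  # placeholder host

# Plain TCP handshake: one network roundtrip.
start = time.perf_counter()
sock = socket.create_connection((HOST, PORT))
tcp_ms = (time.perf_counter() - start) * 1000

# TLS handshake layered on top: the extra roundtrip(s) HTTPS adds.
ctx = ssl.create_default_context()
start = time.perf_counter()
tls_sock = ctx.wrap_socket(sock, server_hostname=HOST)
tls_ms = (time.perf_counter() - start) * 1000
tls_sock.close()

print(f"TCP connect:   {tcp_ms:6.1f} ms")
print(f"TLS handshake: {tls_ms:6.1f} ms")
```

Run it against a server on another continent and the TLS line will dwarf the TCP one.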
When is this an issue?
If most of your users are on the same continent as your server, you’re mostly fine. A few dozen milliseconds will be added to the first request compared to a regular HTTP connection, but that remains reasonable.
What if your service isn’t limited to one geographical zone? What if your users are scattered around the globe and visit your website a few times a day? A roundtrip from Berlin (where I live) to AWS’s Sydney data center takes roughly 345ms. Doing one is painful enough, but with HTTPS we now have to do two of them. This problem is very real.
In our case at Hunter, users were waiting 270ms on average for the handshake to finish. Considering requests are handled in about 60ms on average, this was clearly too much.
What solutions are available to us?
HTTPS isn’t perfect, but it has become a requirement that we obviously can’t disable. What can we do?
Solution n°1: 0-RTT Handshakes in TLS 1.3
TLS 1.3 isn’t available yet, but the Chrome and Firefox teams are working on it. One of the specification’s proposals is to let visitors encrypt and send their request directly, without performing the usual full handshake. To make this possible, the client uses a key the server shared with it beforehand.
(This image is from Tim Taubert, a security engineer at Mozilla. He has written a great article on TLS 1.3 that you really should read if you’re interested in the subject.)
Zero-RTT handshakes still have two problems:
- Users need to have visited your website at least once, and their first connection will be as slow as it is today.
- I expect it will take quite a while before server-side API clients benefit from this. In our case that’s a problem, as having a fast API is one of our selling points.
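Python’s ssl module doesn’t expose 0-RTT early data yet, but the pre-shared key that 0-RTT relies on comes from ordinary TLS session resumption, which you can already experiment with. A minimal sketch, again assuming a reachable placeholder host:

```python
import socket
import ssl

HOST, PORT = "example.com", 443  # placeholder host
ctx = ssl.create_default_context()

# First visit: full handshake. With TLS 1.3 the session ticket arrives
# after the handshake, so make one request before saving the session.
with ctx.wrap_socket(socket.create_connection((HOST, PORT)),
                     server_hostname=HOST) as first:
    first.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\n"
                  b"Connection: close\r\n\r\n")
    first.recv(4096)
    saved_session = first.session

# Second visit: present the saved session to skip the full handshake.
with ctx.wrap_socket(socket.create_connection((HOST, PORT)),
                     server_hostname=HOST,
                     session=saved_session) as second:
    print("Session reused:", second.session_reused)
```

This is resumption, not 0-RTT: it shortens the handshake rather than removing it, but it illustrates where the “key shared beforehand” comes from.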
Solution n°2: Cloudflare’s Railgun
I suppose praising Cloudflare isn’t a great move right now, but we were really surprised by Railgun.
The idea behind Railgun is to keep a connection open between Cloudflare’s edge servers and yours. And while they’re at it, they delta-compress pages to send less data over the network.
Cloudflare has datacenters all over the world, ensuring a quick handshake with your users. At the same time, they keep an open connection to your origin server. Once established, that connection is reused for multiple users.
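Railgun itself is proprietary, but the core idea — paying the handshake once and reusing the warm connection — is plain HTTP keep-alive. This sketch sends two requests over a single connection (example.com is a placeholder, and the server must allow keep-alive): only the first request pays for the TCP and TLS setup.

```python
import http.client
import time

conn = http.client.HTTPSConnection("example.com")  # placeholder host

# Two requests over the same connection: only the first pays for the
# TCP + TLS setup; the second rides on the warm connection.
for i in (1, 2):
    start = time.perf_counter()
    conn.request("HEAD", "/")
    conn.getresponse().read()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"Request {i}: {elapsed_ms:6.1f} ms")

conn.close()
```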
We noticed great results after adding Railgun:
(You’ll notice the graphs are missing about an hour. This is because as soon as we activated Railgun, we experienced an unrelated downtime. To make them easier to read, I removed this hour from the graphs.)
The main downside of this method is that it’s completely proprietary. You’ll also need to be a Cloudflare customer and pay for their business plan ($200 / month / site). You’ll then need to install Railgun’s listener on your load balancer’s server (super easy to do, though).
After using Railgun for a few months, I would highly recommend it if you’re already a customer and experience this issue. It has never required any maintenance on our end and the results are undeniable for users far away from us.
Conclusion
Adding Railgun has helped us limit the issue for now. I would be curious to hear about any alternative solutions you have. Also, please note I’m far from an expert in HTTPS and TLS, so please let me know if I made a mistake in this article.