The tech behind hkrn.ws

I made a URL shortener for Hacker News. The project itself is pretty simple, but the tech stack behind it is pretty interesting. I built it with no code at all, and despite that made it super fast, maintainable, and secure.

Some context

I’m pretty obsessed with the idea of simplifying how we build applications. This is the sort of thing that led to me building CircleCI, and I discuss it all the time on my podcast. Right now, building applications has insane complexity.

As an example, let’s take the first version of hkrn.ws that I built. I was in Vegas at New Years, and burnt out from being in Vegas at New Years. So I wrote a thing. It took a few hours total, including buying the domain name, a ton of procrastination, and learning enough Ruby on Rails to do it.

It’s very simple. It turns URLs like hkrn.ws/12303075 into ones like https://news.ycombinator.com/item?id=12303075.

I knocked it up quickly in Rails, using some super simple routes that I’m not even going to post here, and deployed it to Heroku. Unfortunately, there were all sorts of problems.

  • It was slow as hell. I don’t remember how slow, but hundreds of milliseconds to get a response.
  • I used a free dyno (what would today be the hobby plan). Because it wasn’t launched yet, it scaled down all the time. Whenever I went to use it, it would have an extended delay in scaling back up. So it was kinda useless and I didn’t even bother using it myself.
  • If I did scale it up, it would cost real cash money. Hundreds or thousands of people using this in real life would cost a surprising amount of money. And for such a trivial side project, that’s a bit off.
  • No-one was using it now, but if someone started using it, I’d need to scale it up. Heroku doesn’t autoscale.
  • Security: if people used it, then it would be a nice target to redirect people to your ads, penis pills, or whatever else people make dirty money off. So I’d need to keep it up-to-date, follow along with Ruby and Rails security notices, etc.

That’s a lot of work. Maintenance ain’t free. How could I avoid this?

Rewriting hkrn.ws

I rebuilt the entire thing in about 20 minutes using Cloudflare Page Rules. Page Rules allow you to do a few specific things, including 302 redirects, for pages matching a pattern.

Cloudflare is a CDN, so there needs to be a site under it, right? Nope — if your rules cover every possible page, then it won’t ever hit the site behind it (though I had to give Cloudflare a fake IP address to even get into the UI, so this is maybe not a use case they envisioned).

The new version of hkrn.ws had no code, and was entirely made up of 11 redirect rules that implemented the following:

  • hkrn.ws/(\d+) -> news.ycombinator.com/item?id=$1
  • hkrn.ws/u/(.*) -> news.ycombinator.com/user?id=$1
  • hkrn.ws/(.*) -> news.ycombinator.com/$1

This is what it looks like in the Cloudflare UI:

Hacker News ids are all integers, so I needed to match that. In regex, that would be \d+, but as Cloudflare supports globs, not regex, I needed to fake that. Instead, I added one rule for each leading digit from 1 to 9. If you combine them, the regex would be [1-9].*
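To make the glob trick concrete, here is a rough Python sketch of how those 11 rules behave. This is my reconstruction, not Cloudflare's implementation: the `RULES` table, `glob_to_regex`, and `redirect` are names made up for illustration, and I'm assuming `*` is the only wildcard, that `$1` stands for whatever it matched, and that the first matching rule wins.

```python
import re
from typing import Optional

# Hypothetical reconstruction of the 11 rules: nine leading-digit rules,
# one user rule, and a catch-all. Order matters: the first match wins.
RULES = [
    (f"hkrn.ws/{d}*", f"https://news.ycombinator.com/item?id={d}$1")
    for d in "123456789"
]
RULES.append(("hkrn.ws/u/*", "https://news.ycombinator.com/user?id=$1"))
RULES.append(("hkrn.ws/*", "https://news.ycombinator.com/$1"))


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a glob into an anchored regex with one capture group per "*"."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*)".join(parts) + "$")


def redirect(url: str) -> Optional[str]:
    """Return the target of the first matching rule, or None if none match."""
    for pattern, target in RULES:
        match = glob_to_regex(pattern).match(url)
        if match:
            # Substitute $1, $2, ... with the text each wildcard matched.
            for i, group in enumerate(match.groups(), start=1):
                target = target.replace(f"${i}", group)
            return target
    return None


print(redirect("hkrn.ws/12303075"))
print(redirect("hkrn.ws/u/pbiggar"))
print(redirect("hkrn.ws/newest"))
```

Note that the glob version is looser than \d+: something like hkrn.ws/1abc would also be sent to an item page. For a redirector that's harmless, since Hacker News just won't find that item.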

And that was it. No deploying, no hosting, no git push, no bundle install, no emacs, no capistrano, no docker, no nothing.

How well does it solve the problems I had with the first version I wrote?

  • It was slow: Cloudflare is insanely fast. Check out the benchmark below: it times 10,000 requests, 30 at a time (it’s hard to do proper load testing given that Cloudflare includes anti-DDoS stuff, so we’ll have to see how this scales).
$ ab -n 10000 -c 30 hkrn.ws/1234567
Requests per second:    2437.00 [#/sec] (mean)
Time per request:       12.310 [ms] (mean)
Time per request:       0.410 [ms] (mean, across all concurrent requests)
Transfer rate:          1139.96 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        3    6  25.6      5    1145
Processing:     3    6   2.7      6      34
Waiting:        3    6   2.6      5      34
Total:          7   12  25.8     11    1152

Percentage of the requests served within a certain time (ms)
  50%     11
  66%     12
  75%     12
  80%     13
  90%     14
  95%     18
  98%     24
  99%     30
 100%   1152 (longest request)

99% of the requests were served in 30ms or less!! Crazy!! It does over 2,400 requests/sec!! Not the highest ever, but I didn’t write any code to make this work, so it’s pretty awesome.
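If the two "Time per request" lines in the ab output look confusing, they are just the same measurement sliced two ways. Here's a quick sanity check in Python using only numbers from the run above (the variable names are mine): with 30 requests in flight and each taking about 12.3ms on average, you get roughly 2,437 requests per second, or about 0.41ms per request when averaged across all the concurrent connections.

```python
# Figures copied from the ab run above.
total_requests = 10_000   # -n 10000
concurrency = 30          # -c 30
mean_time_ms = 12.310     # "Time per request" (mean)

# Throughput: 30 requests in flight, each taking mean_time_ms on average.
requests_per_second = concurrency / (mean_time_ms / 1000)

# The second "Time per request" line: the mean spread across all connections.
time_across_all_ms = mean_time_ms / concurrency

# Approximate wall-clock duration of the whole benchmark.
total_seconds = total_requests / requests_per_second

print(f"{requests_per_second:.0f} requests/sec")
print(f"{time_across_all_ms:.3f} ms per request across all connections")
print(f"{total_seconds:.1f} s for the whole run")
```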

  • Cost: I’ve been paying Cloudflare $20 a month for this, to get enough Page Rules. They just added support for buying just Page Rules, so now I pay $5 a month! $20 a month was expensive for when it was sitting and doing nothing, but super cheap if lots of people were using it. But $5 is nothing!
  • Scaling: scaling is free, both in terms of implementation effort, and in terms of cash money.
  • Security: security is also free. Cloudflare takes care of that, and I don’t need to track anything! I don’t need to update my stack, track Ruby/Go/whatever updates, keep an eye on zero-days, or anything. Cloudflare does it all for me.

Larger apps

So it’s possible to use Cloudflare if you’re building tiny apps that only use redirections. How would you do something bigger?

  • I think you could use Cloudflare as simple routing in front of S3 to build a static API.
  • If you want to support HTTP requests other than GET (I’m building a thing that requires responding to PROPFIND), then you could build rules on a Varnish host in VCL.
  • You can use tools like Blockspring or a combination of Google Sheets and Google Apps Script.

More general solutions

All in all, bringing the Rails version from “hobby project” to “thing I feel comfortable inflicting on the world” would have required finding a fast and cheap host, keeping up to date on security, figuring out how to autoscale it, and maybe even rewriting it in a fast language like Go or Rust or C.

This is the frustrating part of building modern applications. Something simple is easy. Something simple that’s actually usable on the modern internet is not.

When we build applications today, we spread the code for that application over many machines and often many services. Function calls are no longer simple manipulation of the call stack, but instead involve HTTP, JSON, retry logic, load balancing, autoscaling, backoff, quality of service, a status page, monitoring, analytics, rate limits, etc. This means the accidental complexity from making a simple function call is insane!
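To give a feel for that, here is a minimal Python sketch of just one item from the list: retry logic with backoff. Everything in it is hypothetical (`call_with_retries`, `send`, and the fake `flaky_lookup` service are names I made up for illustration), and it ignores load balancing, monitoring, rate limits, and the rest. It's still a lot of ceremony around what used to be a plain function call.

```python
import random
import time
from typing import Callable


def call_with_retries(
    send: Callable[[], str],
    attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Wait roughly 0.1s, 0.2s, 0.4s, ... plus a little random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise ValueError("attempts must be at least 1")


# A stand-in for a remote service that fails twice before answering.
failures_left = 2


def flaky_lookup() -> str:
    global failures_left
    if failures_left > 0:
        failures_left -= 1
        raise ConnectionError("service unavailable")
    return "https://news.ycombinator.com/item?id=12303075"


print(call_with_retries(flaky_lookup))
```

In a single process, the equivalent would be one line calling a local function, with no failure modes to think about.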

The solution, of course, is a complete change in how we build web/distributed applications. I want a tool or a language that allows me to write distributed systems with the ease with which I currently write single-threaded applications.

I refer to this high level way of building apps as “Sculpting applications”. This is a toy example, but there are a few tools in the space and I plan on writing more about this. Please contact me (@paulbiggar) if you’re interested!
