Redis Static Charge

Kyle
5 min read · Feb 16, 2016


This is part 20 of my Node / Redis series. The previous part was Node / Redis Bloom filters. Additionally, you might want to read Redis, Express and Streaming with Node.js and Classic Literature(?) before you tackle this story.

Is it worth it to throw all your static files into Redis and serve 'em up?

I'm working on an Express-based app at the moment that serves a number of large text files, without modification, to the user. I'm using Express's static middleware to send the files now, and the performance is good, but could it be better?

My first thought is (predictably) Redis. In-memory = good / fast. File system = bad / slow. I want something where I can just throw some content into Redis and have it served out properly, which means managing the MIME type correctly, as well as the other HTTP response headers.

For my experimental setup, I'm going to use my development laptop (which has an SSD) over localhost; this removes any network issues. As my experimental data, I want something big-ish, so I'll use the Project Gutenberg plain-text version of A Tale of Two Cities. It's about 775 KB: large, but not unrealistic for the content I'm curious about.

First up, I built a very small Express router. The router can be mounted at a path and will receive routes as if it were starting at '/'. My initial version examines the URL to determine the correct MIME type, takes the route, prepends our namespacing key, and does a Redis GET to fetch the content, which is then sent back to the client.
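A minimal sketch of what that router might look like, assuming Express 4 and a node_redis-style callback client; the `static:` prefix and the tiny extension-to-MIME map are my own stand-ins (a real app would use something like the `mime` package):

```javascript
// Tiny extension → MIME map; a stand-in for a real MIME lookup library.
const MIME_TYPES = {
  '.txt': 'text/plain',
  '.html': 'text/html',
  '.css': 'text/css',
  '.js': 'application/javascript',
};

function mimeFromUrl(url) {
  const dot = url.lastIndexOf('.');
  return (dot !== -1 && MIME_TYPES[url.slice(dot)]) || 'application/octet-stream';
}

// Build a mountable router: GET /whatever → Redis GET static:/whatever.
// `client` is assumed to be a node_redis client with callback-style methods.
function redisStatic(client, prefix = 'static:') {
  const express = require('express');
  const router = express.Router();
  router.get('*', (req, res) => {
    client.get(prefix + req.path, (err, value) => {
      if (err || value === null) { return res.sendStatus(404); }
      res.type(mimeFromUrl(req.path)).send(value);
    });
  });
  return router;
}

// Usage: app.use('/redis-static', redisStatic(redis.createClient()));
```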

The same content is also being served with the express.static built-in middleware.

Doing some in-browser tests by hard-reloading the page, I'm seeing that express.static wins fairly consistently. My first thought is that I need to take better care of my headers. Examining them, I notice that express.static adds the content size via the Content-Length header. Some Google sleuthing shows that this can have a performance impact on how the client handles the incoming packets, which makes sense. To get the content length, I'm just using the length of the returned value.
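One subtlety worth noting here: Content-Length is a byte count, not a character count. For ASCII-heavy text like the Gutenberg file the two happen to match, but `String.length` counts UTF-16 code units, so `Buffer.byteLength` is the safer way to fill the header:

```javascript
// Content-Length must be the size of the body in bytes. For pure ASCII,
// String.length and the byte count agree, but non-ASCII text diverges:
const body = 'naïve café';                      // 10 characters
const chars = body.length;                      // 10 UTF-16 code units
const bytes = Buffer.byteLength(body, 'utf8');  // 12 bytes (ï and é are 2 bytes each)

// So in the handler, prefer:
//   res.set('Content-Length', Buffer.byteLength(value, 'utf8'));
```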

Doing the casual tests again, I'm not really seeing any difference. Redis is still slower, although a tad faster than the first version. Looking a little closer in DevTools, I can see that the actual transmission time for Redis is marginally faster than for express.static, but any gains are being eaten by the waiting time.

Redis: note the TTFB (time to first byte) vs. the actual download time.
express.static: almost no TTFB, but a slightly longer content download.

This makes sense. With this method, here is the (simplified) order of operations:

  • Receive the request
  • Issue the Redis GET
  • Wait until the entire value arrives from Redis
  • Count the number of bytes
  • Send the headers
  • Send the content

I wasn't entirely sure how quickly Node.js could get the length of a very, very large string, so I did a quick intermediary version that got the length from Redis with STRLEN. In the end, it made no difference whether the length came from String.length or STRLEN.

This looks like a classic streaming-vs-buffering problem. Moving 775 KB around in one piece is no small task, and buffering the whole value before sending anything creates the TTFB bottleneck.

In the current state of Node and Redis, streaming is still not 100% there. I'd love to be able to open a stream and just pipe a Redis response back to the user, but that isn't in the cards as of right now. I've worked on the buffering problem with Redis before, so let's try the GETRANGE pseudo-streaming option: use STRLEN (an O(1) operation) to get the total length, then fetch the value in chunks. This solved the waiting issue; there was nearly zero wait time with this method. Unfortunately, pseudo-streaming isn't as efficient as real streaming, and the overall speed didn't change much.
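Here is a sketch of that pseudo-streaming loop, again assuming a node_redis-style callback client; the 64 KB chunk size is my own guess, not a tuned value. GETRANGE takes inclusive start and end offsets, so the range math is worth getting right:

```javascript
const CHUNK = 64 * 1024; // assumed chunk size, not a tuned value

// Pure helper: inclusive [start, end] byte ranges, as GETRANGE expects.
function chunkRanges(totalLength, chunkSize) {
  const ranges = [];
  for (let start = 0; start < totalLength; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, totalLength) - 1]);
  }
  return ranges;
}

// STRLEN for the total, then sequential GETRANGE calls, writing each
// slice to the response as it arrives. `res` is any writable stream.
function pseudoStream(client, key, res, chunkSize = CHUNK) {
  client.strlen(key, (err, total) => {
    if (err || total === 0) { return res.end(); }
    const ranges = chunkRanges(total, chunkSize);
    let i = 0;
    (function next() {
      if (i >= ranges.length) { return res.end(); }
      const [start, end] = ranges[i++];
      client.getrange(key, start, end, (rangeErr, slice) => {
        if (rangeErr) { return res.end(); }
        res.write(slice, next); // wait for the write to flush, then fetch the next slice
      });
    })();
  });
}
```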

Finally, I ran it through its paces with the same type of experimental setup I used to measure pseudo-streaming vs. buffering.

So, what do the numbers say? Weirdly, the file-system-based module is faster than our in-memory solution: a totally unexpected result, and faster by quite a bit, which I can't yet fully explain. I did notice that in all cases the initial response is always slower. I restarted the server process between runs and let it sit for a few seconds to make sure it had established the Redis connection and wasn't busy loading modules; the initial load-time difference was larger for Redis, for some reason. I can't prove this, but I'm guessing there is some sort of OS-level optimization or caching going on here, though it doesn't account for the difference between the methods.

After examining the source code for the static part of Express, I think it comes down to real vs. pseudo-streaming and network round trips. As I suspected, express.static (which largely just wraps the send module) is very streaming-centric: it's all pipes, and there is no socket connection or round trip to manage as in the Redis version. On top of this, large values have long been something node_redis has struggled with, and I'm not sure those issues are entirely resolved, given these results.

So, where does that leave us? In my mind (TL;DR): leave your static files as static files, especially if they are large. The formula would change if you're trying to distribute the files among many servers; this is where Redis could shine compared to, say, piping from S3. You could have a Redis instance with a master/slave setup that sources your static files, rather than having them reside with each copy of your app on multiple VMs or machines.

Here are my results:
