How to Build a Cache in JavaScript from the Ground Up

Paul Heintzelman
Published Sep 22, 2019 · 5 min read

A cache is a place to store data that you want to access more quickly in the future. It is a key component of writing performant software.

Some Design

JavaScript modules are singletons: a module is evaluated once and every file that imports it shares the same instance. This means any data you store at module scope for later use will be held in memory.

// with no cache
function catCategorizer(cat) {
  const catTaxonomy = complexCatAnalysis(cat);
  return catTaxonomy;
}

// with cache
const catTaxonomies = {};
function catCategorizer(cat) {
  let catTaxonomy = catTaxonomies[cat.id];
  if (!catTaxonomy) {
    // cache miss: do the expensive work once and remember the result
    catTaxonomy = complexCatAnalysis(cat);
    catTaxonomies[cat.id] = catTaxonomy;
  }

  return catTaxonomy;
}

Blog post done, right?! The above works great if you only have one or two things you want to cache. But you’ll soon find little bits of state scattered all over your application. Besides, if we centralize this logic we can add some other nice features.

Let’s start by building a basic cache and then we can add some other features.

Building a Basic Cache

At a bare minimum, a cache should have a way to get and set data. Let’s keep it simple and call the methods ‘get’ and ‘set’.
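A minimal sketch of what such a module might look like, assuming the file name basicCache.js used in the example that follows and a prototype-free object as the backing store. The original listing may differ in its details.

```javascript
// basicCache.js: hypothetical sketch; state lives at module scope
const cache = Object.create(null); // no inherited keys such as "toString"

// Returns the cached value, or undefined on a miss
function get(key) {
  return cache[key];
}

function set(key, value) {
  cache[key] = value;
}

module.exports = { get, set };
```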

That’s the whole module: a cache in 13 lines of code. Here it is in the previous example:

const {get, set} = require('./basicCache');

function catCategorizer(cat) {
  let catTaxonomy = get(cat.id);
  if (!catTaxonomy) {
    // cache miss: compute once, then store under the cat's id
    catTaxonomy = complexCatAnalysis(cat);
    set(cat.id, catTaxonomy);
  }

  return catTaxonomy;
}

Why we are not going to add has

has seems like a good friend of get. Shouldn’t we check to see if we have the value before fetching it?

// unsafe code
if (has(key)) {
  const value = get(key);
  ...
}

The above is not safe with respect to time, threads, or async code. It is possible for has(key) to return true and for get(key) to then return undefined, because our state could change between the two calls. It’s always safer to get the value first.

// safer code
const value = get(key);
if (value) {
  ...
}

Note: JavaScript is single-threaded, so state generally doesn’t change between one synchronous line of code and the next. But if anything asynchronous runs between the two calls, other code can modify the cache, and time always moves forward. The fact that has and get execute at different times becomes a problem once we introduce a timeout.
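To make the timing problem concrete, here is a small sketch. Everything in it is hypothetical: the has function is exactly the one this post declines to add, and the now variable is a fake clock standing in for Date.now() so the result is deterministic.

```javascript
// Hypothetical cache with has() and expiration, driven by a fake clock
let now = 0; // stand-in for Date.now() so the example is deterministic
const entries = new Map();

function set(key, value, timeout) {
  entries.set(key, { value, expiresAt: now + timeout });
}

// An entry counts as fresh only while the clock is before its expiration time
function isFresh(key) {
  const entry = entries.get(key);
  return entry !== undefined && entry.expiresAt > now;
}

const has = (key) => isFresh(key);
const get = (key) => (isFresh(key) ? entries.get(key).value : undefined);

set('settings', { theme: 'dark' }, 100);

const found = has('settings'); // true: the entry is still fresh
now += 150;                    // time passes, for example across an await
const value = get('settings'); // undefined: the entry expired in between
```

Calling get once and checking the result avoids the gap entirely, because there is only one moment at which freshness is decided.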

Because of this we are not going to add has or contains to our cache. But we are going to add plenty of other features. Next we will add the concept of namespaces to prevent key collisions.

Supporting Namespaces

With the basic version of our cache, keys could collide. To prevent problems with the same key being used in two different places we will add the ability to set and use a namespace. We could add a namespace argument to our methods but there is a more elegant solution using closures.
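One way the closure approach might look, as a sketch rather than the original listing. The useNamespace name and the file name come from the example that follows; the internal namespaces object and the store variable are assumptions made for illustration.

```javascript
// basicCacheWithNameSpacing.js: hypothetical sketch of closure-based namespaces
const namespaces = Object.create(null); // namespace -> its own key/value store

function useNamespace(namespace) {
  // Create the store on first use; later calls with the same namespace share it
  if (!namespaces[namespace]) {
    namespaces[namespace] = Object.create(null);
  }
  const store = namespaces[namespace];

  // The returned functions close over `store`, so callers never pass the namespace again
  return {
    get: (key) => store[key],
    set: (key, value) => {
      store[key] = value;
    },
  };
}

module.exports = { useNamespace };
```

Because each pair of functions only ever sees its own store, two callers can use the same key without overwriting each other.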

Using this with our example:

const {useNamespace} = require('./basicCacheWithNameSpacing');
const catCacheKey = Symbol('catCacheKey');
const {get, set} = useNamespace(catCacheKey);

function catCategorizer(cat) {
  let catTaxonomy = get(cat.id);
  if (!catTaxonomy) {
    // cache miss: compute once, then store under the cat's id
    catTaxonomy = complexCatAnalysis(cat);
    set(cat.id, catTaxonomy);
  }

  return catTaxonomy;
}

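Only the first three lines changed! Now it won’t cause issues if cat.id is used as a key somewhere else. I am using a Symbol to guarantee uniqueness: every call to Symbol() returns a new, unique value, even when the description is the same. Symbols make great cache keys.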

Next let’s switch to using Map and add remove and removeAll.

Adding remove and removeAll

It would be nice to be able to remove values from the cache. To make this easier I will take advantage of JavaScript’s Map type, added in ES2015. It has methods for easily removing values and some other handy features. For example, it lets us support more complex keys in our cache, since a Map key can be any value, not just a string or symbol. For more information about the differences between Maps and Objects see here.
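A sketch of the namespaced cache rebuilt on Map, under a few assumptions: the remove and removeAll names come from this post, but having removeAll clear only the current namespace is a design choice made here for illustration, as is the nested Map layout.

```javascript
// Hypothetical sketch: the namespaced cache rebuilt on Map, with remove and removeAll
const namespaces = new Map(); // namespace -> Map of key -> value

function useNamespace(namespace) {
  if (!namespaces.has(namespace)) {
    namespaces.set(namespace, new Map());
  }
  const store = namespaces.get(namespace);

  return {
    get: (key) => store.get(key),
    set: (key, value) => {
      store.set(key, value);
    },
    // Map.prototype.delete returns true if the key existed
    remove: (key) => store.delete(key),
    // Clears only this namespace (an assumption of this sketch)
    removeAll: () => store.clear(),
  };
}

module.exports = { useNamespace };

// Unlike plain objects, Maps accept non-string keys such as objects
const { get, set, remove } = useNamespace(Symbol('cats'));
const felix = { id: 1 };
set(felix, 'tabby');
get(felix);    // 'tabby'
remove(felix); // true
get(felix);    // undefined
```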

Our cache is coming along. The next thing we will add is a timeout.

Adding Cache Expiration

Having the ability to expire data in our cache is very important. Data in a cache gets stale. Having the data expire is a good way to balance the speed improvements from caching with how stale our data gets. In many cases a small timeout still makes things fast, giving the best of both worlds.

I have found the most useful way to implement a cache timeout is per key. That way we have the most flexibility.

We could add a timer that cleans the cache periodically but it is more effective to just store the expiration time and handle the expired data like a missing key. I also find it handy to add a method timeTillExpires(key) which will return how much time is left until the key expires. This is useful for knowing when cached values like settings will refresh.

Since our cache is getting a little more complicated I am also going to clean up a few things.
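A sketch of per-key expiration handled lazily on read, as described above. It is not the original listing: the timeout argument to set, its millisecond unit, the Infinity default, and the optional now parameter (an injectable clock, handy for tests) are all assumptions. Only timeTillExpires, remove, and removeAll are names taken from this post.

```javascript
// Hypothetical sketch: per-key expiration checked lazily on read
const namespaces = new Map(); // namespace -> Map of key -> { value, expiresAt }

function useNamespace(namespace, now = Date.now) {
  if (!namespaces.has(namespace)) {
    namespaces.set(namespace, new Map());
  }
  const store = namespaces.get(namespace);

  // Returns the entry only if it is still fresh; expired entries are dropped
  function liveEntry(key) {
    const entry = store.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now()) {
      store.delete(key);
      return undefined;
    }
    return entry;
  }

  return {
    get(key) {
      const entry = liveEntry(key);
      return entry === undefined ? undefined : entry.value;
    },
    // timeout is in milliseconds; omit it to keep the value until it is removed
    set(key, value, timeout = Infinity) {
      store.set(key, { value, expiresAt: now() + timeout });
    },
    // Milliseconds left before the key expires; 0 if missing or already expired
    timeTillExpires(key) {
      const entry = liveEntry(key);
      return entry === undefined ? 0 : entry.expiresAt - now();
    },
    remove: (key) => store.delete(key),
    removeAll: () => store.clear(),
  };
}

module.exports = { useNamespace };
```

No background timer is needed: an expired entry is simply treated as a missing key the next time it is read, and deleted at that point.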

Conclusion

We built a full-featured cache module using modern JavaScript. This blog post is primarily designed to give you a deep understanding of how a cache module works, and what features to look for.

I recommend using a cache module from npm (although I don’t have a specific one to recommend).

If you do want to build your own, feel free to use the code from this post.

If you want to use this code for a blog post or talk please ask permission.
