How Does SW-Precache Work

Xuan Huang
Mar 28, 2017 · 7 min read

SW-Precache is a great Service Worker tool from Google. It is a Node module designed to be integrated into your build process and to generate a service worker for you. Though you can use sw-precache out of the box, you might still wonder what happens under the hood. If so, this article is written for you!

Overview

sw-precache mainly involves three core files:

service-worker.tmpl
lib/
├ sw-precache.js
└ functions.js

sw-precache.js is the main entry of the module. It reads the configuration, processes parameters, populates the template and writes the result into the specified file. functions.js is just a module containing a bunch of external functions, all of which are injected into the generated service worker file as helpers.

Since the end effect of sw-precache is performed by the generated service worker file at runtime, an easy way to get an idea of what happens is to check out the source code inside service-worker.tmpl. It’s not hard to understand the essentials, and I will help you.

Initialization

The generated service worker file (let’s call it sw.js for instance) gets its configuration by text interpolation when the template service-worker.tmpl is populated.

// service-worker.tmpl
var precacheConfig = <%= precacheConfig %>;
// sw.js
var precacheConfig = [
["js/a.js", "3cb4f0"],
["css/b.css", "c5a951"]
]

It’s not difficult to see that it’s a list of relative URLs and MD5 hashes. In fact, one thing that sw-precache does at build time is calculate the hash of each file that it is asked to “precache” by its configuration.
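To make the build-time step concrete, here is a minimal sketch of how such a list could be produced. It is not the actual sw-precache code: buildPrecacheConfig and the in-memory files object are hypothetical stand-ins for reading real files from disk, and the full MD5 hex digest is kept rather than the shortened hashes shown in this article.

```javascript
const crypto = require("crypto");

// Hypothetical helper: `files` maps a relative URL to that file's contents.
function buildPrecacheConfig(files) {
  return Object.keys(files)
    .sort() // stable order, so identical inputs give identical output
    .map((relativeUrl) => {
      const hash = crypto
        .createHash("md5")
        .update(files[relativeUrl])
        .digest("hex");
      return [relativeUrl, hash];
    });
}

// Two builds in which only js/a.js changes.
const firstBuild = buildPrecacheConfig({
  "js/a.js": "console.log('v1');",
  "css/b.css": "body { margin: 0; }",
});
const secondBuild = buildPrecacheConfig({
  "js/a.js": "console.log('v2');",
  "css/b.css": "body { margin: 0; }",
});

// After sorting, index 0 is css/b.css and index 1 is js/a.js.
console.log(firstBuild[0][1] === secondBuild[0][1]); // true: b.css is unchanged
console.log(firstBuild[1][1] !== secondBuild[1][1]); // true: a.js got a new hash
```

The point is only that the hash depends on nothing but the file's bytes, so an unchanged file keeps its hash across builds.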

In sw.js, precacheConfig is transformed into an ES6 Map with the structure below. Note that I omit the origin part of each URL (shown as http...) for brevity.

> urlsToCacheKeys
< Map(2) {
"http.../js/a.js" => "http.../js/a.js?_sw-precache=3cb4f0",
"http.../css/b.css" => "http.../css/b.css?_sw-precache=c5a951"
}

Instead of using the raw URL as the cache key, sw-precache appends a _sw-precache=[hash] query parameter to the end of each URL when populating and updating its cache, and even when fetching these subresources. This is what we call a cache-busting parameter*. It prevents the service worker from serving and indefinitely caching out-of-date responses found in the browser’s HTTP cache.
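Here is a minimal sketch of that transformation, assuming a hypothetical helper named buildUrlsToCacheKeys and an example origin of https://example.com/. The real generated file does this with its injected helper functions; this version only illustrates the shape of the result.

```javascript
// Hypothetical helper; `origin` stands in for the page's origin at runtime.
function buildUrlsToCacheKeys(precacheConfig, origin) {
  const urlsToCacheKeys = new Map();
  for (const [relativeUrl, hash] of precacheConfig) {
    const rawUrl = new URL(relativeUrl, origin);
    const cacheKey = new URL(rawUrl.href);
    // Append the cache-busting parameter to get the hashed cache key.
    cacheKey.searchParams.set("_sw-precache", hash);
    urlsToCacheKeys.set(rawUrl.href, cacheKey.href);
  }
  return urlsToCacheKeys;
}

const urlsToCacheKeys = buildUrlsToCacheKeys(
  [["js/a.js", "3cb4f0"], ["css/b.css", "c5a951"]],
  "https://example.com/"
);

console.log(urlsToCacheKeys.get("https://example.com/js/a.js"));
// → https://example.com/js/a.js?_sw-precache=3cb4f0
```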

Because each build re-calculates the hashes and generates a new sw.js with a new precacheConfig containing those hashes, sw.js can determine the version of each subresource and thus decide which part of its cache needs an update. This is pretty similar to what we commonly do when implementing long-term caching with webpack or gulp-rev: the byte-diff is done ahead of runtime.

*: Developers can opt out of this behaviour with a configuration option if they set their HTTP caching headers right. More details in Jake’s post.

On Install

ServiceWorker gives you an install event. You can use this to get stuff ready, stuff that must be ready before you handle other events.

During the install phase, sw.js opens the cache and starts to populate it. One cool thing it does for you is its incremental update mechanism.

Sw-precache looks up each cache key (the values of urlsToCacheKeys) in cachedUrls, an ES6 Set containing the URLs of all requests indexed from the current version of the cache, and only fetches and caches the resources that can’t be found there, i.e., those never cached before. This way it reuses cached resources as much as possible.
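The lookup can be sketched as a plain function. The name cacheKeysToFetch is hypothetical, and a real service worker would follow up with a fetch and a cache write for each returned key.

```javascript
// Hypothetical helper: returns the cache keys that are not in the cache yet.
function cacheKeysToFetch(urlsToCacheKeys, cachedUrls) {
  return Array.from(urlsToCacheKeys.values()).filter(
    (cacheKey) => !cachedUrls.has(cacheKey)
  );
}

// The cache still holds the previous build; only a.js has a new hash.
const cachedUrls = new Set([
  "http.../js/a.js?_sw-precache=3cb4f0",
  "http.../css/b.css?_sw-precache=c5a951",
]);
const urlsToCacheKeys = new Map([
  ["http.../js/a.js", "http.../js/a.js?_sw-precache=d6420f"],
  ["http.../css/b.css", "http.../css/b.css?_sw-precache=c5a951"],
]);

const missing = cacheKeysToFetch(urlsToCacheKeys, cachedUrls);
console.log(missing); // only the a.js key with hash d6420f
```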

If you don’t fully understand it yet, don’t worry. We will recap it later; for now, let’s move on.

On Activate

Once a new ServiceWorker has installed & a previous version isn’t being used, the new one activates, and you get an activate event. Because the old version is out of the way, it's a good time to handle schema migrations in IndexedDB and also delete unused caches.

During the activation phase, sw.js compares all existing requests in the cache, named existingRequests (note that it now contains the resources just cached during the installation phase), with setOfExpectedUrls, an ES6 Set built from the values of urlsToCacheKeys, and deletes any request that does not match from the cache.

// sw.js
// Delete every cached request that the current build no longer expects.
existingRequests.map(function(existingRequest) {
  if (!setOfExpectedUrls.has(existingRequest.url)) {
    return cache.delete(existingRequest);
  }
})
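The same comparison can be tried outside a service worker. Below is a minimal sketch that works on plain URL strings instead of Request objects; the helper name staleCacheKeys is hypothetical.

```javascript
// Hypothetical helper: lists the cached URLs that the current build
// no longer expects, i.e. the ones activation would delete.
function staleCacheKeys(existingRequestUrls, urlsToCacheKeys) {
  const setOfExpectedUrls = new Set(urlsToCacheKeys.values());
  return existingRequestUrls.filter((url) => !setOfExpectedUrls.has(url));
}

const existingRequestUrls = [
  "http.../js/a.js?_sw-precache=3cb4f0",
  "http.../js/a.js?_sw-precache=d6420f",
  "http.../css/b.css?_sw-precache=c5a951",
];
const urlsToCacheKeys = new Map([
  ["http.../js/a.js", "http.../js/a.js?_sw-precache=d6420f"],
  ["http.../css/b.css", "http.../css/b.css?_sw-precache=c5a951"],
]);

const stale = staleCacheKeys(existingRequestUrls, urlsToCacheKeys);
console.log(stale); // only the old a.js entry with hash 3cb4f0
```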

On Fetch

Although the comments in the source code explain everything well, I wanna highlight a few points about how requests are intercepted.

Should Respond?

Firstly, we need to determine whether the request is included in our “pre-caching list”. If it is, it should already have been pre-fetched and pre-cached, so we can respond to it directly from the cache.

// sw.js*
var url = event.request.url;
var shouldRespond = urlsToCacheKeys.has(url);

Note that we are matching raw URLs (e.g. http.../js/a.js) instead of the hashed ones. This saves us from calculating hashes at runtime, which would have a significant cost. And since we have kept the relationship in urlsToCacheKeys, it’s easy to look up the hashed one.

*In real cases, sw-precache also takes a couple of configuration options into consideration when matching URLs.

Navigation Fallback

One interesting feature that sw-precache provides is the navigation fallback (it went by a different option name in earlier versions), which detects navigation requests and responds with a preset fallback HTML document when the URL of the navigation request does not exist in urlsToCacheKeys.

It is designed for SPAs that use History API based routing, allowing arbitrary URLs to be answered with a single HTML entry defined in the fallback configuration, kinda reimplementing an Nginx rewrite in the service worker*. Do note that a service worker only intercepts documents (navigation requests) inside its scope (and any resources referenced in those documents, of course). So navigation outside the scope is not affected.

*An additional option can be provided to limit the scope of the “rewrite”.
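The fallback decision can be sketched roughly as follows. This is an illustration under assumptions, not the generated code: resolvePrecachedUrl and fallbackUrl are hypothetical names, the hash aaaaaa is a placeholder, and plain objects stand in for Request objects. The only real API detail relied on is that navigation requests have a mode of "navigate".

```javascript
// Hypothetical helper: returns the raw URL whose cached copy should be
// served, or null when the request should go to the network untouched.
function resolvePrecachedUrl(request, urlsToCacheKeys, fallbackUrl) {
  if (urlsToCacheKeys.has(request.url)) {
    return request.url; // precached: serve it as is
  }
  const isNavigation = request.mode === "navigate";
  if (isNavigation && fallbackUrl && urlsToCacheKeys.has(fallbackUrl)) {
    return fallbackUrl; // unknown page URL: serve the single HTML entry
  }
  return null;
}

const entry = "https://example.com/index.html";
const urlsToCacheKeys = new Map([[entry, entry + "?_sw-precache=aaaaaa"]]);

// A client-side route that was never precached falls back to the entry.
const page = { url: "https://example.com/posts/42", mode: "navigate" };
console.log(resolvePrecachedUrl(page, urlsToCacheKeys, entry));

// A non-navigation request for an unknown URL is left alone.
const api = { url: "https://example.com/api/data.json", mode: "cors" };
console.log(resolvePrecachedUrl(api, urlsToCacheKeys, entry)); // null
```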

Respond from Cache

Finally, we get the appropriate cache key (the hashed URL) for the raw URL with urlsToCacheKeys.get(url) and invoke event.respondWith() to respond to the request from the cache directly. Done!

// sw.js*
event.respondWith(
  caches.open(cacheName).then(cache => {
    return cache.match(urlsToCacheKeys.get(url))
      .then(response => {
        if (response) return response;
      });
  })
);

*The code was “ES6-fied” with the error handling part removed.

Cache Management Recap

Let’s recap the cache management part with a full lifecycle simulation.

The first build

Suppose we are on the very first load: cachedUrls is an empty set, so all subresources listed to be precached are fetched and put into the cache at SW install time.

// cachedUrls
Set(0) {}
// urlsToCacheKeys
Map(2) {
"http.../js/a.js" => "http.../js/a.js?_sw-precache=3cb4f0",
"http.../css/b.css" => "http.../css/b.css?_sw-precache=c5a951"
}
// SW Network Logs
[sw] GET a.js?_sw-precache=3cb4f0
[sw] GET b.css?_sw-precache=c5a951

After that, it will start to control the page immediately because sw.js claims its clients on activation by default. It means sw.js will start to intercept future fetches and try to serve them from the cache, so it’s good for performance.

On the second load, all subresources have been cached and are served directly from the cache. So no requests are sent from sw.js.

// cachedUrls
Set(2) {
"http.../js/a.js?_sw-precache=3cb4f0",
"http.../css/b.css?_sw-precache=c5a951"
}
// urlsToCacheKeys
Map(2) {
"http.../js/a.js" => "http.../js/a.js?_sw-precache=3cb4f0",
"http.../css/b.css" => "http.../css/b.css?_sw-precache=c5a951"
}
// SW Network Logs
// Empty

The second build

Once we change the bytes of one of our subresources (e.g., we modify a.js to a new version with hash value d6420f) and re-run the build process, a new version of sw.js is generated as well.

The new sw.js runs alongside the existing one and starts its own installation phase.

// cachedUrls
Set(2) {
"http.../js/a.js?_sw-precache=3cb4f0",
"http.../css/b.css?_sw-precache=c5a951"
}
// urlsToCacheKeys
Map(2) {
"http.../js/a.js" => "http.../js/a.js?_sw-precache=d6420f",
"http.../css/b.css" => "http.../css/b.css?_sw-precache=c5a951"
}
// SW Network Logs
[sw] GET a.js?_sw-precache=d6420f

This time, sw.js sees that a new version of a.js is requested, so it fetches it and puts the response into the cache. In fact, we have two versions of a.js in the cache at this moment.

// what's in cache?
http.../js/a.js?_sw-precache=3cb4f0
http.../js/a.js?_sw-precache=d6420f
http.../css/b.css?_sw-precache=c5a951

By default, the sw.js generated by sw-precache skips the waiting phase, so it takes over the page and moves on to the activating phase immediately.

// existingRequests
http.../js/a.js?_sw-precache=3cb4f0
http.../js/a.js?_sw-precache=d6420f
http.../css/b.css?_sw-precache=c5a951
// setOfExpectedUrls
Set(2) {
"http.../js/a.js?_sw-precache=d6420f",
"http.../css/b.css?_sw-precache=c5a951"
}
// the one deleted
http.../js/a.js?_sw-precache=3cb4f0

By comparing the existing requests in the cache with the set of expected ones, the old version of a.js is deleted from the cache. This ensures there is only one version of our site’s resources at any time.

That’s it! We have finished the simulation successfully.
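If you want to replay the whole walkthrough yourself, here is a small in-memory sketch of it. It makes several simplifying assumptions: a plain Set of URL strings stands in for the cache, the install and activate functions are hypothetical stand-ins for the event handlers, and no real network requests are made.

```javascript
// Install: add every expected cache key that is not in the cache yet.
function install(cache, urlsToCacheKeys) {
  const fetched = [];
  for (const cacheKey of urlsToCacheKeys.values()) {
    if (!cache.has(cacheKey)) {
      cache.add(cacheKey); // a real service worker would fetch, then cache
      fetched.push(cacheKey);
    }
  }
  return fetched;
}

// Activate: delete every cached entry the current build does not expect.
function activate(cache, urlsToCacheKeys) {
  const setOfExpectedUrls = new Set(urlsToCacheKeys.values());
  const deleted = [];
  for (const url of Array.from(cache)) {
    if (!setOfExpectedUrls.has(url)) {
      cache.delete(url);
      deleted.push(url);
    }
  }
  return deleted;
}

const firstBuild = new Map([
  ["/js/a.js", "/js/a.js?_sw-precache=3cb4f0"],
  ["/css/b.css", "/css/b.css?_sw-precache=c5a951"],
]);
const secondBuild = new Map([
  ["/js/a.js", "/js/a.js?_sw-precache=d6420f"],
  ["/css/b.css", "/css/b.css?_sw-precache=c5a951"],
]);

const cache = new Set();
const log = [
  install(cache, firstBuild),   // fetches both files
  activate(cache, firstBuild),  // nothing to delete
  install(cache, secondBuild),  // fetches only the new a.js
  activate(cache, secondBuild), // deletes only the old a.js
];
console.log(log);
console.log(Array.from(cache)); // b.css (c5a951) and a.js (d6420f)
```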

Conclusions

As its name implies, sw-precache is designed specifically for precaching critical static resources. It only does one thing, but it does it well. I’d love to give you some opinionated suggestions, but you decide whether it suits your requirements or not.

Precaching is NOT free

So don’t precache everything. Sw-precache uses an “On install - as a dependency” strategy for your precache configs. A huge list of requests delays the moment the service worker finishes installing and, in addition, wastes users’ bandwidth and disk space.

For instance, if you wanna build an offline-capable blog, you had better not include things like the posts themselves in the precache configuration. It would be a huge disaster for data-sensitive people if you have hundreds of posts. Use runtime caching instead.

“App Shell”

A helpful analogy is to think of your App Shell as the code and resources that would be published to an app store for a native iOS or Android application.

Though I have always considered the term “App Shell” too narrow to cover its actual usage now, it is widely used and commonly known. I personally prefer calling it a “Web Installation Package”, because these resources can truly be installed onto users’ disks and our web app can boot up directly from them in any network environment. The only difference between a “Web Installation Package” and an iOS/Android app is that we need to strive to keep it within a reasonable size.

Precaching is perfect for this kind of resource, such as the entry HTML, visual placeholders, offline pages, etc., because they can be static within one version, small, and most importantly, part of the critical rendering path. We wanna deliver the first meaningful paint to our users ASAP, thus we precache them to eliminate HTTP round-trip time.

BTW, if you have been using HTML5 Application Cache, sw-precache is really a perfect replacement because it can cover nearly all the use cases App Cache provides.

This is not the end

Sw-precache is just one of the awesome tools that can help you build a service worker. If you are planning to add some service worker power to your website, don’t hesitate to check out sw-toolbox, sw-helper (a new tool Google is working on) and many more from the community.

That’s all. Hope you enjoyed it!
