Service Workers — Gotchas

Boopathi Rajaa
10 min read · May 9, 2016


This article requires some understanding of ServiceWorkers. If you’re not familiar with ServiceWorkers, here are some links to help you get started —

Also, here is a good read —

Let’s discuss some of the gotchas when developing a web application with offline capabilities using service workers.

For the sake of brevity, sw = service workers.

3xx — Almost Never

Whenever a URL in your application can return a 3xx response, make sure you know what you’re doing when you handle that request via a Service Worker. A 3xx is a useful way to apply a side effect to persistent state and then let the user move on. On the server, that state is a session entry or a token written to the db for later use; on the client, it is cookies.

An example is when you have a separate login page for your application.

  1. When the user visits /login, s/he expects to see the login page
  2. When the user visits /login when s/he is already logged in, s/he should automatically navigate to the Homepage or Dashboard or similar.

Do you see the problem already? If so, High Five!!!

The 3xx itself is not the problem. The side effect it causes is the problem we should be aware of and handle.

What happens when you cache the login page in sw?

  • User enters /login for the first time. Login page is cached by sw
  • User completes login, and s/he is in some other page of the app (let’s not talk about the other 3xxs that you usually do here)
  • Some random link takes the user to /login while offline or on a flaky connection
  • CacheFirst (ReadThrough), NetworkFirst or any approach that involves response from Cache will result in her/him seeing the Login page when s/he is already logged in.

This is one of the corner cases. But it’s an important gotcha that you have to plan for and handle right from the beginning, so that your sw code doesn’t get too complicated or convoluted. One simple solution is to NOT cache any page that can possibly return a 3xx. But such pages are not that straightforward to identify. Inside a sw, the only way to make a request is to use the fetch API. Chrome was the first browser to implement service workers and the fetch API. Between Chrome 42 and Chrome 46, the sw spec and the fetch standard kept changing: new features were added and the defaults were changing. So if your users are on a version in this range, take note of this inconsistency.

Simply don’t cache

For the pages that can possibly return a 3xx (i.e. a URL that returns a 3xx for some requests and a 2xx for others), simply don’t cache them.

Though it is possible to respond with fetch(request), like —

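self.onfetch = function(event) {
  // event.request.url is always absolute, so compare the pathname instead
  if (new URL(event.request.url).pathname === '/login') {
    // Always go to the network for /login; never answer from the cache
    event.respondWith(fetch(event.request));
  }
};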

or with sw-toolbox,

toolbox.router.get('/3xx', toolbox.networkOnly);

this approach had its issues in the first half of 2015. Fetching a 3xx in a service worker resulted in this:

  1. Browser URL = /301. Service worker handles /301.
  2. SW: fetch('/301'). fetch follows the redirect to /200. The .then gets the content for /200.
  3. event.respondWith(contentFor200)

So, what’s the problem here?

The browser URL still remains as /301.

Oops! This is not the expected behaviour. The browser URL should have changed to /200. This happens in Chrome 42, 43 and one or two more versions. So, if your users use this range, take a note of this before you implement service workers for 3xxs.

Can we fix it with some code and dirty checking?

Apparently, it’s not so simple. In these versions, the fetch standard said that if the redirect mode was set to “manual”, it was to be changed to “follow” by default. That behaviour was removed only in July 2015 by this commit. So, in these versions, there is no way to find out whether a request returned a 3xx by tapping into the request-response lifecycle before the redirection is made. With the latest spec, it is possible to use redirect=manual and know whether the request returned a 3xx response (discussed in later parts of this article).

Coming back to the same problem, there are some clues to know whether it was a redirect or not.

1. request.url !== response.url

If the request.url and the response.url do not match, then there was a redirection. But then you have to pass this information back to the Document from the service worker and have the browser request the resulting URL. This results in requesting the same resource twice. You can be clever about it, but it ends up as a code smell.
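A minimal sketch of this first clue, assuming the response exposes the final URL after the redirect. The helper name wasRedirected and the shape of the message sent to the page are made up for illustration, and the page still needs its own listener to act on the message. Current browsers also expose response.redirected, which answers the same question directly.

```javascript
// Hypothetical helper: true when the response came from a different URL
// than the one requested, i.e. fetch followed at least one redirect.
function wasRedirected(requestUrl, responseUrl) {
  // Some responses (for example opaque ones) report an empty URL;
  // in that case there is nothing to compare.
  if (!responseUrl) return false;
  // Ignore the fragment, which is never sent to the server.
  const strip = u => u.split('#')[0];
  return strip(requestUrl) !== strip(responseUrl);
}

// Wiring inside a service worker (skipped when run anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    event.respondWith(
      fetch(event.request).then(response => {
        if (wasRedirected(event.request.url, response.url)) {
          // Assumed message shape; the page decides what to do with it.
          self.clients.matchAll().then(clients => {
            clients.forEach(client =>
              client.postMessage({ redirectedTo: response.url }));
          });
        }
        return response;
      })
    );
  });
}
```

Keeping the comparison in a plain function makes it easy to exercise outside a service worker.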

2. Replace all 3xx with 200

One other option is to remove the 3xxs from the server completely, replacing them with 200s that carry a custom header. You then parse that header in your sw code.

// Instead of a real 3xx, answer 200 and name the target in a custom header
function redirect(res, url) {
  res.header('X-Redirect-To', url);
  res.end();
}
app.get('/301', function(req, res) {
  redirect(res, '/200');
});
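On the sw side, reading that header might look like the sketch below. The helper name redirectTarget and the cache name 'pages-v1' are assumptions for illustration. The sketch only uses the header to decide what not to cache, and what else you do with the target is up to your application.

```javascript
// Hypothetical helper: returns the redirect target announced by the server
// through the custom X-Redirect-To header, or null when there is none.
function redirectTarget(response) {
  const target = response.headers.get('X-Redirect-To');
  return target ? target : null;
}

// Wiring inside a service worker (skipped when run anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    if (event.request.method !== 'GET') return;
    event.respondWith(
      fetch(event.request).then(response => {
        // Only cache responses that are not redirects in disguise.
        if (redirectTarget(response) === null) {
          const copy = response.clone();
          // 'pages-v1' is an assumed cache name; the write is fire and forget.
          caches.open('pages-v1').then(cache => cache.put(event.request, copy));
        }
        return response;
      })
    );
  });
}
```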

3. 3xx via 200 and a script tag

Same as the previous approach, but adding a script tag to do the redirection. Send a 200 response with the response body that will rewrite the url.

// Answer 200 with a body that performs the redirect in the page itself
function redirect(res, url) {
  res.header('X-Redirect-To', url);
  res.end(`<script>location.href=${JSON.stringify(url)}</script>`);
}
app.get('/301', function(req, res) {
  redirect(res, '/200');
});

There is no straightforward way to get this to work on the browsers that were released with sw support in the first three quarters of 2015. So in my opinion, these approaches get the job done with a hack. While this is not the best thing to do, it’s an option you can consider. If you find some other way to make it work, kindly leave a comment. Thanks!

Note: For the latest versions of all browsers that follow the spec as of mid 2015 or later, you don’t have to worry about the above three approaches. It just works.

The second part of the problem with fetch and sw is detecting a redirect. Let’s say /301 returns a, well, 301 to /200

fetch('/301')
  .then(response => {
    // fetch has already followed the redirect to /200
    console.assert(response.status === 200);
  });

Note that the response status is not 301. This is the expected behaviour, because fetch follows redirects by default.

RequestRedirect

To change the above behaviour and get more control, we have RequestRedirect. redirect is a RequestInit option that sets the redirect mode to “follow” (the default), “error”, or “manual”. The manual redirect mode allows you to detect the 3xxs.

fetch('/301', {
  redirect: 'manual'
}).then(resp => {
  console.assert(resp.type === 'opaqueredirect');
});

One important thing to note here is that while tapping into the 3xxs gives you more control, it opens up a potential security problem, so you cannot simply get all the data you want from the 3xx. You only get one clue, type = 'opaqueredirect', and all other values are nullified: status is 0, the header list is empty, the body is null, and so on. A brief description of the security issue is in the spec.

So with this type = 'opaqueredirect', you can now know whether a request returned a redirect, and you can decide whether to cache or not based on some logic. But there is one gotcha: this is not available in the version range discussed above. For those versions you’re better off with a hack than relying on the manual redirect mode. I’m discussing this here to let you know that there is a neat way to tap into redirects. You can make use of it in case you have no other option.
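A hedged sketch of such a caching decision follows. The policy function isCacheable and the cache name 'pages-v1' are assumptions for illustration. The handler is limited to navigations because only requests whose redirect mode is already “manual” can safely be answered with an opaqueredirect response.

```javascript
// Hypothetical policy: cache only plain successful responses.
function isCacheable(response) {
  // A redirect fetched with redirect: 'manual' shows up as 'opaqueredirect'.
  if (response.type === 'opaqueredirect') return false;
  // response.ok is true only for statuses in the 200-299 range.
  return response.ok === true;
}

// Wiring inside a service worker (skipped when run anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    // Leave everything except page navigations to the browser.
    if (event.request.mode !== 'navigate') return;
    event.respondWith(
      fetch(event.request, { redirect: 'manual' }).then(response => {
        if (isCacheable(response)) {
          const copy = response.clone();
          // 'pages-v1' is an assumed cache name; the write is fire and forget.
          caches.open('pages-v1').then(cache => cache.put(event.request, copy));
        }
        // For a navigation, the browser is expected to act on the redirect itself.
        return response;
      })
    );
  });
}
```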

Caching a 3xx response doesn’t make sense

Seriously! Think of it as highly opinionated, but trust me, when you make a web app, you have to know which URLs can return a 3xx and make sure you don’t ever cache the 3xx response. It simply doesn’t make sense to cache an opaqueredirect response. Also, an opaqueredirect response is treated like an error in fetch.

More Discussions

Hopefully, this gives you a basic idea. Here are a few more discussions on the same topic, and you’ll probably find a lot more information here —

Include credentials

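When you make a request for which the server expects some cookies, by default, cookies are NOT sent. And when you make a request for which the server sends some cookies, by default, the response’s Set-Cookie header doesn’t reach the Browser’s Document. This can be fixed by setting the credentials option in RequestInit. (This describes fetch’s default at the time of writing, credentials: 'omit'. The Fetch standard later changed the default to 'same-origin', so this mainly concerns older browsers and cross-origin requests.)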

For example,

fetch('/200')

the server does NOT see the cookies. And when the server sends a Set-Cookie Header, the fetch here in the SW kinda swallows these cookies

self.onfetch = function(event) {
  event.respondWith(fetch(event.request));
};

and they don’t reach the Document context.

For things that require cookies in their transactions, you need to explicitly tell in the Request to include the credentials.

fetch(url, { credentials: 'include' });

Response Race

While the CacheFirst and NetworkFirst strategies work on the basis that when one source fails the other kicks in, there is one more important strategy for certain things. In sw-toolbox it’s called ‘fastest’, where cache and network are requested simultaneously using Promise.race.

When the request goes out for the first time — race(cache, network),

First time — Cache fails. Network hits and updates cache

and for all other requests,

All other requests — Cache succeeds first, Network updates cache.

The documentation says it all —

Request the resource from both the cache and the network in parallel. Respond with whichever returns first. Usually this will be the cached version, if there is one. On the one hand this strategy will always make a network request, even if the resource is cached. On the other hand, if/when the network request completes the cache is updated, so that future cache reads will be more up-to-date.
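A rough sketch of this strategy, not sw-toolbox’s actual implementation. One detail worth noting: caches.match resolves with undefined on a miss instead of rejecting, so a bare Promise.race could answer with undefined on a cache miss. The helper firstUsable and the cache name 'fastest-v1' are assumptions for illustration.

```javascript
// Hypothetical helper: resolves with the first truthy value produced by any
// of the promises, and rejects only when none of them produces one.
function firstUsable(promises) {
  return new Promise((resolve, reject) => {
    let pending = promises.length;
    if (pending === 0) {
      reject(new Error('no sources given'));
      return;
    }
    const giveUp = () => {
      pending -= 1;
      if (pending === 0) reject(new Error('no usable response'));
    };
    promises.forEach(promise => {
      Promise.resolve(promise).then(value => {
        if (value) resolve(value);
        else giveUp(); // e.g. a cache miss, which resolves with undefined
      }, giveUp);
    });
  });
}

// Wiring inside a service worker (skipped when run anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  const CACHE_NAME = 'fastest-v1'; // assumed cache name
  self.addEventListener('fetch', event => {
    if (event.request.method !== 'GET') return;
    const cached = caches.match(event.request);
    const network = fetch(event.request);
    // Always refresh the cache from the network, whoever wins the race.
    const update = network.then(response => {
      const copy = response.clone(); // clone before the body is consumed
      return caches.open(CACHE_NAME).then(cache => cache.put(event.request, copy));
    });
    event.waitUntil(update.catch(() => {})); // ignore failures while offline
    event.respondWith(firstUsable([cached, network]));
  });
}
```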

Just A Silly Mistake

Here are some common silly mistakes that you can catch early and possibly avoid all the pain.

url

When you bring in one more layer in your request-response cycle, you might miss out on some of the parameters. A reminder that the url is not the only key in a request: the method, headers, body and mode travel with it too. It’s not uncommon to make this silly mistake:

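var url = event.request.url;
// indexOf returns -1 (which is truthy) when there is no match, so compare explicitly
if (url.indexOf('something') !== -1)
  return event.respondWith(fetch(url)); // the mistake: only the URL is passed on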

Note the `fetch(url)`. It should simply be `fetch(event.request)`.

Streams!!!

The bodies of the request and response are streams. So whenever you consume a body, make sure that you clone the request or response before consuming it, because a body can be consumed only once. Cloning a Request or a Response is as simple as:

var reqClone = event.request.clone();
var respClone = response.clone();
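To make the “only once” rule concrete, here is a small sketch. The helper name readKeepingCopy is made up for illustration. It reads a response body as text while keeping an untouched copy that can still be cached or handed to respondWith.

```javascript
// Hypothetical helper: read the body as text and keep an unread copy.
function readKeepingCopy(response) {
  // The clone has to be made before the body is read; cloning a response
  // whose body was already consumed throws a TypeError.
  const copy = response.clone();
  return response.text().then(text => ({ text, copy }));
}
```

After the returned promise resolves, the original response has bodyUsed set to true, while the copy can still be read once.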

SW Kill Switch

When you get your ServiceWorker wrong, it’s hard to make it right. What does that even mean? Say you’ve made a mistake such that every page breaks when it passes through the sw fetch handler and a stale cache always kicks in. This is bad, and one way to fix it on your own system is to kill the service worker from devtools. But once you ship it to users, you can’t ask all of them to kill the service worker or force refresh your page a few times whenever you correct some mistake.

A Service Worker updates itself by checking the byte diff with the new service worker when available. It tries this update process upon navigation respecting the freshness headers and the sw explainer gives a detailed description of the same. For simplicity, let’s say the automatic update happens once in ~24 hours. And to update a service worker all you need to do is to change the contents of the file so that there is a diff. A simple —

const VERSION = 42;

will be enough. Incrementing this number will update the service worker on the next load. This is mainly useful for cache invalidation:

const CACHE_NAME = 'my-awesome-site-cache-' + VERSION;

When you use this ‘VERSION’ as a part of the cache name, the previous cache is no longer read and a new cache is populated. The old cache still occupies storage until you delete it, which is typically done in the activate handler. This undocumented feature is called the Service Worker Kill Switch.
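A minimal sketch of the cleanup that usually goes with a versioned cache name, assuming every cache of the site shares one prefix. The names CACHE_PREFIX and staleCacheNames are made up for illustration.

```javascript
const VERSION = 42;
const CACHE_PREFIX = 'my-awesome-site-cache-'; // assumed shared prefix
const CACHE_NAME = CACHE_PREFIX + VERSION;

// Hypothetical helper: the caches of this site that belong to older versions.
function staleCacheNames(allNames, prefix, current) {
  return allNames.filter(name => name.startsWith(prefix) && name !== current);
}

// Wiring inside a service worker (skipped when run anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('activate', event => {
    // Keep the worker alive until every old cache has been deleted.
    event.waitUntil(
      caches.keys()
        .then(names => staleCacheNames(names, CACHE_PREFIX, CACHE_NAME))
        .then(stale => Promise.all(stale.map(name => caches.delete(name))))
    );
  });
}
```

Filtering by prefix leaves caches created by other scripts on the same origin alone.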

In addition to the VERSION parameter, you can also call the update method on the ServiceWorkerRegistration object:

navigator
  .serviceWorker
  .register('/sw.js')
  .then(registration => {
    // DOM event handler properties are lowercase: onclick, not onClick
    button.onclick = () => registration.update();
    window.onSomethingHappened = () => registration.update();
  });

registration.update() will try for a new update of service worker bypassing the ~24 hour update check. Note that the byte diff requirement still applies.

Kill Switch 1

If you see some users getting cross site scripting attacks and you want to clear all the site’s information on the user’s device, this Spec comes in handy and it’s called the Kill Switch —

Sending the header ‘Clear-Site-Data’ is a useful way to mitigate these attacks. Also, from an example in the spec —

Installing a Service Worker guarantees that a request will go out to a server every ~24 hours. That update ping would be a wonderful time to send a header like this one in case of catastrophe.
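A small sketch of building that header value on the server, in the same Express style as the earlier snippets. The helper name clearSiteDataValue is made up for illustration, and the commented usage assumes an app object like the one above.

```javascript
// Directive names defined by the Clear Site Data specification.
const KNOWN_TYPES = ['cache', 'cookies', 'storage', 'executionContexts'];

// Hypothetical helper: builds the value of the Clear-Site-Data header.
function clearSiteDataValue(types) {
  types.forEach(type => {
    if (KNOWN_TYPES.indexOf(type) === -1) {
      throw new Error('unknown Clear-Site-Data type: ' + type);
    }
  });
  // Each directive is a quoted string; directives are comma separated.
  return types.map(type => '"' + type + '"').join(', ');
}

// Usage in an Express-style handler (app is assumed to exist):
// app.get('/sw.js', function(req, res) {
//   res.header('Clear-Site-Data', clearSiteDataValue(['cache', 'storage']));
//   // ...then send the service worker script as usual
// });
```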

Don’t generate dynamic sw code

If you’re thinking of generating sw code dynamically, say https://example.com/sw.js?foo=bar, don’t do it. If ‘bar’ ends up in the script, it opens the door to XSS attacks.

That’s all Folks!

There are a few other gotchas that I probably missed. Will write a Part-2 for the same. The epilogue is that there are some hidden things that are not obvious for the first timers, and it might require reading some parts of the Specification a few times. It helps both ways, you can improve your code as well as help improve the specification. Here is the spec —


Boopathi Rajaa

A web developer with unhealthy interest in JavaScript and Go