Building an offline Web App that works in very low internet conditions using ServiceWorkers

Hari krishna
Progressive Web Apps
Jun 16, 2017 · 5 min read


Building offline functionality for web applications has become much easier these days, especially with the introduction of service workers and open-source libraries. The tricky part is making these web applications work well in very poor internet conditions. In this article, I am going to share what I learned and the things to consider while developing a reliable offline web application that works over a slow, intermittent connection.

Challenges

  • Offline functionality — The app should be able to work completely offline once installed.
  • Syncing local data to the server — Merge strategies to sync local data and remote data without any loss.
  • Intermittent internet connection — The app should reliably sync data even over a bad internet connection.

Storage

The difference between a normal web application and an offline web application is the way they get data. An offline web app depends on offline storage APIs, while a normal web app depends on the server. IndexedDB provides a better API than localStorage for storing data in the browser. Your web application should depend on IndexedDB (or a similar alternative) for its data, and it shouldn't make HTTP calls for it, as those might fail in offline scenarios. Keep in mind that under low-storage situations, browsers might clear IndexedDB. To avoid this, you might need the Persistent Storage API, which was introduced in Chrome 55.
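For example, a minimal sketch of what this could look like, assuming a placeholder database and object store name:

// Open (or create) a local database; names here are placeholders for illustration.
const openRequest = indexedDB.open('offline-app', 1);

openRequest.onupgradeneeded = (event) => {
  const db = event.target.result;
  if (!db.objectStoreNames.contains('articles')) {
    db.createObjectStore('articles', { keyPath: 'id' });
  }
};

openRequest.onsuccess = (event) => {
  const db = event.target.result;
  // The app reads and writes its data through this database
  // instead of making HTTP calls for every screen.
};

// Ask the browser not to evict this data under storage pressure
// (Persistent Storage API, available from Chrome 55).
if (navigator.storage && navigator.storage.persist) {
  navigator.storage.persist().then((persisted) => {
    console.log(persisted ? 'Storage will not be cleared automatically.'
                          : 'Storage may still be cleared under pressure.');
  });
}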

Offline functionality

Service workers come to the rescue for serving your web application in offline mode. A service worker acts like a client-side proxy: it can intercept requests coming from its clients (e.g. the window, web workers and shared workers), and it has a well-defined life cycle.
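As a rough sketch (file names and paths are placeholders), registering a service worker and intercepting requests looks like this:

// main.js: register the service worker from the page.
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js');
}

// sw.js: the service worker acts as a client-side proxy,
// answering from the cache first and falling back to the network.
self.addEventListener('fetch', (event) => {
  event.respondWith(
    caches.match(event.request).then((cached) => cached || fetch(event.request))
  );
});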

Google built two awesome libraries named sw-precache and sw-toolbox. You can use sw-precache to cache static content and sw-toolbox to cache dynamic content. These two libraries are helpful for implementing your own offline strategies based on your application's needs. By default, sw-precache downloads files in parallel; we modified the sw-precache template to download static assets one by one so that it works reliably in intermittent internet conditions.
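A typical setup, simplified and with placeholder globs and routes, looks roughly like this: sw-precache generates the service worker from a build-time config, and sw-toolbox adds runtime caching routes for dynamic content.

// build.js: generate the service worker with sw-precache (runs in Node at build time).
const swPrecache = require('sw-precache');

swPrecache.write('service-worker.js', {
  staticFileGlobs: ['app/**/*.{js,html,css,png,svg}'],  // placeholder glob
  stripPrefix: 'app/',
  importScripts: ['sw-toolbox.js', 'runtime-caching.js'],
});

// runtime-caching.js: imported into the generated service worker.
// Serve API calls from the network first, falling back to the cache.
toolbox.router.get('/api/(.*)', toolbox.networkFirst, {
  networkTimeoutSeconds: 10,
});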

Persistent Message Passing

Once the user starts using your app offline, there will be a lot of user actions and data that you need to sync with the server. This sync can happen at any time, depending on network connectivity. We can store these user actions in IndexedDB as jobs and, whenever the network is available, start processing these jobs one by one. We used hustle to achieve this. We forked hustle, converted it to an Angular module (since our app is built on Angular) and implemented a retry mechanism inside it.
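Independent of the library, the idea is roughly the following. This is a simplified sketch over a plain IndexedDB object store (assumed to use keyPath: 'id'); it is not hustle's actual API.

// Promisify an IndexedDB request.
function requestToPromise(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Store a user action as a job to be synced later.
function enqueueJob(db, job) {
  const tx = db.transaction('sync-jobs', 'readwrite');
  return requestToPromise(tx.objectStore('sync-jobs').add(job));
}

// When the network is available, upload pending jobs one by one.
async function processJobs(db) {
  const jobs = await requestToPromise(
    db.transaction('sync-jobs').objectStore('sync-jobs').getAll()
  );
  for (const job of jobs) {
    // If this throws (offline again), stop and retry on the next 'online' event.
    await fetch(job.url, {
      method: job.method,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(job.payload),
    });
    // Remove the job only after a successful upload.
    await requestToPromise(
      db.transaction('sync-jobs', 'readwrite').objectStore('sync-jobs').delete(job.id)
    );
  }
}

// Trigger processing whenever connectivity comes back (db is the opened database).
window.addEventListener('online', () => processJobs(db));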

Data Sync

Once we store offline data in the browser, we need to look for an opportunity to upload local data to the server whenever the network is available. During the upload, we need to take care of data consistency. We used two merge strategies to sync data.

  1. Merge by lastUpdated time - To merge entities
  2. Merge by union - To merge a collection of entities

For example, consider two users trying to edit an article offline. The article entity in JSON looks something like this:

{
  id: "randomId",
  title: "offline web apps",
  content: "large text",
  lastUpdated: "2017-06-08T18:29:11.142Z",
  external_links: [{
    id: "anchor-text-id1",
    link: "http://google.com"
  }, {
    id: "anchor-text-id2",
    link: "http://facebook.com"
  }]
}

When both users come back online, there can be multiple scenarios. Here are some of the scenarios you are likely to encounter.

  • When two users add different links - In this case, we can use merge by union. Once the sync completes, the external_links collection inside the article should have all the links.
  • When two users edit the content of an article - In this case, we can use merge by lastUpdated. We check which edit is the latest using the lastUpdated attribute inside the article object and merge the article accordingly (a simplified sketch of both strategies follows this list).
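Assuming the article shape above, such a merge could look roughly like this (not our exact implementation):

// Merge the local and remote versions of an article.
function mergeArticles(local, remote) {
  // Merge by lastUpdated: the more recently edited copy wins for scalar fields.
  const base =
    new Date(local.lastUpdated) > new Date(remote.lastUpdated) ? local : remote;

  // Merge by union: keep links from both copies, de-duplicated by id.
  const linksById = new Map();
  [...remote.external_links, ...local.external_links].forEach((link) =>
    linksById.set(link.id, link)
  );

  return { ...base, external_links: [...linksById.values()] };
}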

Every Byte counts

While building an offline web app, you will most probably end up using APIs to communicate with the server. HTTP calls with large payloads have a high chance of failing in low/intermittent internet conditions. Our ultimate goal is to reduce the payload size as much as possible so that it doesn't consume too much bandwidth and the success rate is higher. Try to avoid duplicate downloads and use lastUpdated timestamps to download only the changed data (the delta) instead of the full payload.
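For example, the client can remember the timestamp of its last successful sync and ask the server only for records changed since then. The query parameter below is hypothetical; your API may expose this filter differently.

// lastSync is persisted locally (e.g. in IndexedDB) after each successful sync.
async function downloadDelta(lastSync) {
  // Hypothetical filter parameter: fetch only articles changed after lastSync.
  const response = await fetch(
    '/articles.json?lastUpdatedSince=' + encodeURIComponent(lastSync)
  );
  return response.json();
}

downloadDelta('2017-06-08T18:29:11.142Z').then((changedArticles) => {
  // Merge only the changed articles into local storage.
});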

Look at the payloads you are downloading from and uploading to the server. Are you using every piece of information in the payload in your application? It might be just an extra attribute in the case of a JSON payload. Try to cut out the information your application doesn't need.

For example, I am trying to get the user details using the following URL: http://localhost:8080/users/me.json

It might return a payload something like this:

{
  user: {
    name: "Hari",
    username: "harikris",
    lastUpdated: "2017-06-08T18:29:11.142Z",
    access: {
      read: true,
      write: false
    }
    ....
  }
}

You might not be using the access field in the payload, or you might only care about the username and name fields. In that case, you can change your URL so that it returns only username and name instead of the default payload.

GraphQL is one example of how you can build APIs that are both easier to consume and powerful, since the client asks only for the fields it needs.
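For instance, with a GraphQL endpoint you can request only the fields the app actually uses. The endpoint and schema below are assumptions for illustration.

// Request only the fields the app needs instead of the full user payload.
const query = '{ me { name username } }';

fetch('/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query }),
})
  .then((response) => response.json())
  .then(({ data }) => {
    // data.me contains just { name, username }; nothing else travels over the wire.
  });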

Parallel vs Sequential

Browsers normally send requests in parallel to download the assets in a web page. This behavior might not work well in very low internet conditions. If you send requests in parallel over a low-bandwidth connection, they compete for the limited bandwidth and there is a high chance that most of the HTTP calls will fail. If you send requests one by one, most of the calls are likely to go through. Parallel HTTP calls finish in less time but are not reliable, as there is a high chance of failure; sequential HTTP calls take more time, but the success rate is higher. You can make this decision based on your application's needs.
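To illustrate the difference (a sketch; urls is a placeholder list of asset or API URLs):

// Parallel: faster on a good connection, but on a poor one the requests
// compete for the limited bandwidth and many of them may fail together.
function downloadParallel(urls) {
  return Promise.all(urls.map((url) => fetch(url)));
}

// Sequential: slower overall, but each request gets the full (small) bandwidth,
// so the chance of each individual call succeeding is higher.
async function downloadSequential(urls) {
  const responses = [];
  for (const url of urls) {
    responses.push(await fetch(url));
  }
  return responses;
}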

Retry mechanism

No matter how well you optimize the API calls in your app, there is always a chance of HTTP call failures due to network or power problems. You need a rescue mechanism to retry these calls when they fail. We implemented a retry mechanism inside hustle: if any HTTP call fails, it retries a specific number of times, with a delay between attempts.
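The general shape of such a retry mechanism (a sketch, not hustle's actual implementation) is a bounded loop with a delay between attempts:

// Retry an HTTP call a fixed number of times, waiting between attempts.
async function fetchWithRetry(url, options, retries = 5, delayMs = 2000) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const response = await fetch(url, options);
      if (response.ok) return response;
      throw new Error('HTTP ' + response.status);
    } catch (err) {
      if (attempt === retries) throw err;  // give up after the last attempt
      // Wait a little longer after each failed attempt before retrying.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}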

I have tried to keep the concepts as generic as possible so that they can be applied in most cases. Hope this helps.

Follow me on Twitter.
