The Many Offline Options for iOS Apps

Offline mode is no longer just an extra feature you could choose to add to your app — it’s something many users expect. I’ve often seen developers force their favorite offline solution on a problem that could be solved in a better way.

There are many different ways of making an app work offline, each with advantages and drawbacks. There’s no magic bullet that will work every time, so it’s up to you to carefully weigh your options. Sometimes, the best solution may be a combination of several different technologies!

[Figure: the main solutions available for making an app work offline]

Whichever approach you lean toward, you should always try to select the right tool for your app. But first, why should you make your app work offline?

Why is offline important?

First of all, speed matters. 40% of users abandon pages that take more than 3 seconds to load. People expect apps to be responsive and snappy. If they see a loading spinner for too long, they’ll just leave your app and open up one of the many other attention grabbers on their phone. But if your app works offline, it will be fast even with poor network connectivity.

Second, Business2Community estimates that 15% of app usage at any time in the US is offline, and in international markets, this is likely to be even higher. Your users may be on a plane, in a cafe with a flaky connection, or on the subway, with their internet coming in and out. If your app doesn’t work in these conditions, they will simply stop using it.

Caches vs. Databases

When implementing an offline solution, the first step is to decide between a cache and a database. Databases can be powerful, but developers overuse them. Caches are simpler and are a better fit for many applications.

The most common database solutions are Core Data and Realm, though many people prefer to use SQLite directly. You define models, load data from the network, and insert it into your database. Then, your view controllers listen to queries on the database and update whenever the data changes.

When implementing a cache, you load data from the cache and the network in parallel. The cached data doesn’t need to be structured and can simply be a serialized version of the data from the network. This means that loading data from the network and the cache look identical from a code perspective. There are many open source implementations of caches; PINCache and NSURLCache are two common options.

Databases are good if:

  • You can download all of the data for a user and store it locally without using too much disk space.
  • You can easily write logic that limits the data used on the device.
  • You need to be able to do simple, local searches over thousands of records.

Caches are good if:

  • You cannot download all the data for a user and need a simple eviction strategy.
  • You’ve already written logic to download data from the network and want to add in an offline mode without rewriting your application.
  • You want a simpler, lightweight, and flexible solution.

The problems with databases:

  • They are notoriously difficult to get right. You should expect to spend a decent amount of developer time dealing with crashes in this part of your application.
  • You need to write models and migrate them whenever something changes.
  • Deleting models to free up space is really difficult as it’s impossible to tell which models are currently in use by screens.

Despite the downsides and overuse of databases, for some applications, they’re a great option. For example, a database would work well for a podcast player. You can download a reasonable number of upcoming episodes, and it’s easy to know when to delete models (once the user finishes an episode). Another example is saving a game, as the user will never want to have a saved game deleted (unless they delete it manually).

On the other hand, a social media application would be much better suited to a cache. It’s impossible to download all the content for a user, and as they browse, a database would get larger and larger. In a similar vein, a news reader application would be much better off with a cache. As the user browses new articles, a database wouldn’t stop growing. A cache easily stores everything you’ve recently viewed and automatically purges older articles. Additionally, you never need to migrate models or worry about data faults.

Normalized vs. Denormalized Data

There are two main ways to store data in a cache — using normalized or denormalized data. Most application data can be visualized as a tree-like structure. It has a root model and then several child models (which then in turn could have more child models). When caching this data, you could simply store the entire tree as one entry in the cache. This will look something like this:

[Figure: A denormalized or ‘tree-like’ cache]

This is the simplest way to use a cache. You can implement it with just a few lines of code and it’s easy to pick unique ids for objects (for instance, you could store each object by the URL you used to fetch it).
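To make this concrete, here’s a minimal sketch of a denormalized cache in Swift. It’s an illustration under a few assumptions, not a production implementation: `DenormalizedCache`, `Article`, and `Author` are hypothetical names, the store is an in-memory dictionary rather than files on disk, and eviction is a basic least-recently-used policy.

```swift
import Foundation

// Hypothetical models: the author is nested inside each article.
struct Author: Codable, Equatable {
    let id: String
    var name: String
}

struct Article: Codable, Equatable {
    let id: String
    var title: String
    var author: Author
}

/// In-memory stand-in for a denormalized cache. Each entry is the whole
/// serialized tree, keyed by the URL that was used to fetch it.
final class DenormalizedCache {
    private struct Entry {
        let blob: Data
        var lastUsed: Int
    }

    private var entries: [String: Entry] = [:]
    private var clock = 0 // logical clock that orders entries for eviction
    private let capacity: Int

    init(capacity: Int) {
        self.capacity = capacity
    }

    func store<T: Encodable>(_ value: T, forURL url: String) throws {
        let blob = try JSONEncoder().encode(value)
        clock += 1
        entries[url] = Entry(blob: blob, lastUsed: clock)
        // Simple eviction: drop least recently used entries while over capacity.
        while entries.count > capacity,
              let oldest = entries.min(by: { $0.value.lastUsed < $1.value.lastUsed }) {
            entries.removeValue(forKey: oldest.key)
        }
    }

    func load<T: Decodable>(_ type: T.Type, forURL url: String) throws -> T? {
        guard var entry = entries[url] else { return nil }
        clock += 1
        entry.lastUsed = clock // reading an entry keeps it fresh
        entries[url] = entry
        return try JSONDecoder().decode(type, from: entry.blob)
    }
}
```

Because every entry is an opaque blob, changing the shape of `Article` never requires a migration. At worst, an old blob fails to decode and you treat it as a cache miss.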

However, you’ll notice that the author object is stored twice in the cache as a nested object. This means that if you update the article with id=3 with an updated author object, the next time you read the article with id=42, you’ll get the old author. If this type of consistency is important, you could consider ‘normalizing’ your data before you cache it. This means ripping apart the tree into submodels and caching each model under a unique id. For example:

[Figure: A normalized cache]

Note that this is not a database! Each entry is still a blob of unstructured data keyed by an id. This means you still won’t need migrations when models change and eviction is simple. However, implementing this system is not simple as you also need to write logic to reconstruct models and have unique ids for each submodel.

Note: because of the overhead of storing the dotted line ‘pointers’ above, this solution actually doesn’t use less space than a denormalized cache. See the ‘Additional Tips’ section for more info.
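As a rough sketch of what normalizing involves, the Swift below splits an article into two blobs (one for the article, one for its author) and stitches them back together on read. The type names, the `article:` and `author:` key prefixes, and the in-memory dictionary are all assumptions made for illustration. This is not the LinkedIn implementation.

```swift
import Foundation

struct Author: Codable, Equatable {
    let id: String
    var name: String
}

struct Article: Codable, Equatable {
    let id: String
    var title: String
    var author: Author
}

// Flattened form of an article: the nested author is replaced by its id.
struct ArticleRecord: Codable {
    let id: String
    let title: String
    let authorID: String
}

/// Hypothetical normalized cache. Every entry is still an unstructured blob,
/// but each submodel is stored once under its own key.
final class NormalizedCache {
    private var blobs: [String: Data] = [:]
    private let encoder = JSONEncoder()
    private let decoder = JSONDecoder()

    func store(_ article: Article) throws {
        // Rip the tree apart: the author goes in under its own id...
        blobs["author:\(article.author.id)"] = try encoder.encode(article.author)
        // ...and the article keeps only a pointer to it.
        let record = ArticleRecord(id: article.id,
                                   title: article.title,
                                   authorID: article.author.id)
        blobs["article:\(article.id)"] = try encoder.encode(record)
    }

    func article(id: String) throws -> Article? {
        guard let recordBlob = blobs["article:\(id)"] else { return nil }
        let record = try decoder.decode(ArticleRecord.self, from: recordBlob)
        // Reconstruct the tree. A missing submodel is treated as a cache miss.
        guard let authorBlob = blobs["author:\(record.authorID)"] else { return nil }
        let author = try decoder.decode(Author.self, from: authorBlob)
        return Article(id: record.id, title: record.title, author: author)
    }
}
```

With this layout, storing article 3 with an updated author also updates what you get back for article 42, because both point at the same author entry. The cost is the extra reconstruction logic and the need for a unique id on every submodel.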

I’ve previously written about using this strategy for the LinkedIn App, and the code that powers consistency amongst models is open-source. Using this solution, adding offline behavior was easy for new pages, and we never had to migrate models or clean up large databases.

Downloading Data from a Server

There are a few options for retrieving data from the network and the cache or database.

The simplest is to always retrieve data from the cache and network in parallel. If the cache returns first, call your completion block with the cached data. Then, once the network finishes, call the completion block with the new data. This allows you to contain most of the caching logic in the network layer and all the view controllers need to do is display the data.
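Here’s one way this could look in Swift. It’s only a sketch: `ParallelLoader` and its three closures are hypothetical stand-ins for your own cache and networking code, and it assumes both callbacks are delivered on the same queue (for example, the main queue), so it does no locking.

```swift
import Foundation

enum DataSource { case cache, network }

/// Hypothetical loader that asks the cache and the network at the same time.
final class ParallelLoader {
    typealias Fetch = (_ key: String, _ done: @escaping (Data?) -> Void) -> Void

    private let readCache: Fetch
    private let fetchNetwork: Fetch
    private let writeCache: (String, Data) -> Void

    init(readCache: @escaping Fetch,
         fetchNetwork: @escaping Fetch,
         writeCache: @escaping (String, Data) -> Void) {
        self.readCache = readCache
        self.fetchNetwork = fetchNetwork
        self.writeCache = writeCache
    }

    /// Calls `completion` once or twice: with cached data if it arrives before
    /// the network, then with fresh data when the network succeeds.
    /// A real loader would also report the case where both sources fail.
    func load(_ key: String, completion: @escaping (Data, DataSource) -> Void) {
        var networkDelivered = false
        let writeCache = self.writeCache

        readCache(key) { cached in
            // Ignore a slow cache if fresher network data was already shown.
            guard let cached = cached, !networkDelivered else { return }
            completion(cached, .cache)
        }

        fetchNetwork(key) { fresh in
            guard let fresh = fresh else { return }
            networkDelivered = true
            writeCache(key, fresh) // keep the cache warm for the next launch
            completion(fresh, .network)
        }
    }
}
```

The view controller only ever sees “here is some data to display” and doesn’t need to know which source it came from, though the `DataSource` value is there if you want to show a ‘possibly stale’ indicator.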

However, this approach will sometimes call the completion block twice. For some apps, this may be problematic, so you could consider only using the cached data if the network is slow. In that case, you only display the cached data if the network fails or takes too long to load. I’d recommend setting the timeout for this to be short. Remember that when offline, requests will often time out after a minute instead of erroring immediately.

Another solution is to place your cache or database between your view controllers and network. In this ‘reactive’ model, you immediately retrieve models from the database and listen for changes. Once the network request finishes, it edits the database and the view controller automatically updates with the changes. This approach has been well-documented. It’s a simple mental model, but listening to model changes in all view controllers is sometimes not trivial and leads to a lot of boilerplate code.

Finally, you can use a database sync solution such as Realm Platform or Firebase. With these platforms, you host a version of the database on your server and changes are automatically synced to clients in a data-efficient way. These solutions implement a lot of complexity, but this complexity is hidden in a black box and has nontrivial requirements on how you structure your server.

Uploading Data

When users change data in an application, it’s difficult to decide how this should work in low network connectivity. The simplest way is to show a spinner and wait for the request to finish. You can block the UI and, if necessary, tell the user that something failed with a banner or alert. However, this isn’t the best user experience. I still cannot believe that iMessage won’t allow me to send messages when offline.

[Figure: iMessage doesn’t work offline? What is this? 2014?!]

A common solution to this problem is to create a queue for offline tasks. If a request fails because of low connectivity, add it to a disk-backed queue and try again when you are connected to the internet. While this sounds simple in theory, in practice, there are a lot of edge cases you need to consider:

  1. How many times should you retry a request?
  2. Is the ordering of requests important?
  3. If a request finally fails after a number of retries, how do you inform the user?
  4. How do you let the user know that a request is pending and hasn’t yet synced?
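To show how a few of these questions might be answered in code, here’s a small sketch of a disk-backed queue in Swift. The names (`OfflineQueue`, `PendingRequest`), the JSON file format, and the policy choices (a fixed retry limit, strict ordering) are all assumptions for illustration. The `send` closure is a synchronous stand-in for a real network call.

```swift
import Foundation

// Hypothetical description of a request that still needs to be uploaded.
struct PendingRequest: Codable, Equatable {
    let path: String
    let body: Data
    var attempts: Int = 0
}

final class OfflineQueue {
    private(set) var pending: [PendingRequest] = []
    private let fileURL: URL
    private let maxAttempts: Int

    init(fileURL: URL, maxAttempts: Int = 3) {
        self.fileURL = fileURL
        self.maxAttempts = maxAttempts
        // Restore requests that were still waiting when the app last quit.
        if let data = try? Data(contentsOf: fileURL),
           let saved = try? JSONDecoder().decode([PendingRequest].self, from: data) {
            pending = saved
        }
    }

    func enqueue(_ request: PendingRequest) {
        pending.append(request)
        persist()
    }

    /// Tries queued requests oldest first; `send` returns true on success.
    /// Returns the requests that were given up on so the caller can tell the user.
    func flush(send: (PendingRequest) -> Bool) -> [PendingRequest] {
        var abandoned: [PendingRequest] = []
        defer { persist() }
        while var next = pending.first {
            if send(next) {
                pending.removeFirst()
                continue
            }
            next.attempts += 1
            if next.attempts >= maxAttempts {
                pending.removeFirst() // out of retries: surface it to the user
                abandoned.append(next)
            } else {
                pending[0] = next
                break // preserve ordering: stop here and retry on the next flush
            }
        }
        return abandoned
    }

    private func persist() {
        if let data = try? JSONEncoder().encode(pending) {
            try? data.write(to: fileURL, options: .atomic)
        }
    }
}
```

The `pending` array doubles as the answer to the fourth question: anything still in it hasn’t synced yet, so the UI can mark those items accordingly. If ordering doesn’t matter for your requests, you could keep going after a failure instead of breaking out of the loop.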

On top of all this, you need to revert the data if an upload fails. Consider the following flow:

  1. The user likes something on their phone which is offline.
  2. The user likes the same thing on their laptop which is online.
  3. The phone connects to the internet, and downloads new data for this item (which says it is liked).
  4. Next, uploading this like on the phone fails for some reason.
  5. The phone reverts the item to no longer be liked.

Notice how the phone ends up in the wrong state! Even worse, since the phone doesn’t know it’s not in the right state, it may not refresh this from the network. When reverting data, you can’t simply take the opposite of an action. You need to revert to the last thing the server told you was correct. This means you now need to store both your current state and the last seen server state.
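One way to picture “store both your current state and the last seen server state” is the small Swift type below. It’s a simplified sketch with hypothetical names (`SyncedValue` and its methods), and it’s not the Superhuman design. It tracks a single pending edit per value.

```swift
/// Keeps an optimistic local edit separate from the last value the server reported.
struct SyncedValue<Value> {
    private(set) var serverValue: Value   // last state the server told us was correct
    private(set) var pendingValue: Value? // local edit that hasn't been confirmed yet

    init(serverValue: Value) {
        self.serverValue = serverValue
        self.pendingValue = nil
    }

    /// What the UI should display right now.
    var current: Value { pendingValue ?? serverValue }

    /// The user changed something while the upload is still outstanding.
    mutating func editLocally(_ value: Value) { pendingValue = value }

    /// Fresh data arrived from the server (for example, after reconnecting).
    mutating func didDownload(_ value: Value) { serverValue = value }

    /// The upload went through, so the pending edit becomes the server state.
    mutating func uploadSucceeded() {
        if let value = pendingValue { serverValue = value }
        pendingValue = nil
    }

    /// The upload failed: fall back to the last server state,
    /// not to the opposite of the user's action.
    mutating func uploadFailed() { pendingValue = nil }
}
```

Running the flow above through this type: the phone starts with a server value of `false`, the local like sets the pending value to `true`, the download sets the server value to `true`, and the failed upload clears the pending value. The item stays liked, which matches the server.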

These problems are still solvable. When I was at Superhuman, we built an architecture we called a ‘modifier queue’ that solved all of these edge cases. You can read more details on this on the Superhuman blog.

Your app likely doesn’t need to go as far as implementing a modifier queue. Uploading data is difficult, so you should think through which use cases you care most about and concentrate on making those great experiences.

Consistency

Once you’ve edited your cache or database locally, you may need to update multiple screens in the app. This is a deep topic that is way beyond the scope of this post, but here are a few techniques to consider:

  • Listening to Database Changes
    Most databases offer the ability to listen to models changing. If you’re using a database instead of a cache, this is likely the simplest solution.
  • Delegation
    If a change only affects a couple of view controllers in your app that have a relationship, you can simply use a delegate to propagate the change. Though you can only use this if the change doesn’t appear elsewhere, this is a very simple solution and you shouldn’t overlook it if it just works.
  • An Event Bus
    One of the simplest ways to implement changes to a cache is to use an event bus system to notify all listeners of changes. You simply listen to changes to a certain id and if it changes, reload data from the cache and refresh the view. NSNotificationCenter is the first-party solution, but I’d recommend looking at some open source projects that offer a better API as well as type-safe notifications. SwiftNotificationCenter and SwiftEventBus are two popular examples.
  • Rocket Data
    Rocket Data manages the consistency for immutable models and is specifically designed to work well with a cache. It simply listens to changes on certain ids and notifies you when a model has changed. It’s a more powerful solution than the event bus and allows you to use immutable models, but also requires more work to set it up. For more information, see my talk on Rocket Data or the docs.
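To illustrate the event bus technique from the list above, here’s a bare-bones sketch in Swift. `ChangeBus` is a hypothetical type written for this example. It doesn’t reflect the API of NSNotificationCenter, SwiftNotificationCenter, SwiftEventBus, or Rocket Data, and it assumes all calls happen on one thread.

```swift
import Foundation

/// Minimal event bus keyed by model id. Listeners are told that an id changed
/// and are expected to reload that model from the cache themselves.
final class ChangeBus {
    private var handlers: [String: [UUID: () -> Void]] = [:]

    /// Registers a handler for one id and returns a token used to unsubscribe.
    func observe(id: String, handler: @escaping () -> Void) -> UUID {
        let token = UUID()
        handlers[id, default: [:]][token] = handler
        return token
    }

    func stopObserving(id: String, token: UUID) {
        handlers[id]?[token] = nil
    }

    /// Call this after writing a new version of the model to the cache.
    func post(changedID id: String) {
        // Copy the handlers first so a handler can unsubscribe while we iterate.
        let current = Array((handlers[id] ?? [:]).values)
        for handler in current {
            handler()
        }
    }
}
```

A view controller showing article 42 would call `observe(id:handler:)` when it appears, reload from the cache inside the handler, and call `stopObserving(id:token:)` when it goes away.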

Combining Techniques

I recently added an offline mode to my Chess Tactics App. I had three use cases for offline usage, each with slightly different requirements:

  • Users should be able to view all pages in the app offline if they’ve visited them recently.
  • Users should be able to explicitly save puzzles to view offline. These should never be evicted.
  • The main feature of the app presents a random puzzle to the user curated for their skill level. This should work offline.

Instead of picking one architecture and trying to fit all of these use cases to it, I decided to pick three different offline techniques!

For viewing arbitrary pages, I simply added a key-value store cache keyed off the URL for the request. I chose a denormalized cache since I didn’t have many submodels in my requests, and submodel consistency isn’t important to my application. Even though I had already coded the entire app, adding this was trivial and only took a couple of hours.

[Figure: Playing puzzles works offline! The rating change is added to a queue and the requests are sent when the user comes back online.]

For saving puzzles, I used a structured database. I never wanted these evicted and there was no chance that the user could save too much data since an individual puzzle is only a few KB. A structured database also allows the flexibility to sort or filter these puzzles in the future.

For presenting a random puzzle to the user, I also used a database, but instead of using a structured database, I simply used an NSData blob to store the JSON. In the background, the app downloads about 200 puzzles and stores them. If the user is offline, the app selects one randomly (an operation that is difficult with a cache), presents it to the user, and then deletes it. Using a database fit my needs as I could easily control its size. Storing the data as a blob instead of using separate fields allowed me to simplify my code and avoid any additional parsing logic. Contrary to popular belief, storing data as blobs is very fast with a database as long as the blobs are small.

When selecting a random puzzle offline, my view controller never used database models directly. Instead, I parsed the data blob into an in memory model before passing it to the view controller. This way, my view controller was completely isolated from any disk storage logic, and I could safely delete these models at any time.
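The shape of this is roughly the following Swift sketch. It’s not the app’s actual code: `PuzzlePool` and the fields of `Puzzle` are hypothetical, and a plain array of blobs stands in for the database table so the example stays self-contained.

```swift
import Foundation

// Hypothetical in-memory model handed to the view controller.
struct Puzzle: Codable, Equatable {
    let id: Int
    let position: String // assumed field, e.g. a board position string
}

/// Holds downloaded puzzles as raw JSON blobs. An array stands in for the
/// database table here; the view controller never sees the stored form.
final class PuzzlePool {
    private var blobs: [Data] = []

    var count: Int { blobs.count }

    /// Called by the background download with the JSON exactly as received.
    func add(json: Data) {
        blobs.append(json)
    }

    /// Picks a random stored puzzle, deletes it from the pool, and parses it
    /// into an in-memory model. Blobs that fail to parse are simply dropped.
    func takeRandom() -> Puzzle? {
        while let index = blobs.indices.randomElement() {
            let blob = blobs.remove(at: index)
            if let puzzle = try? JSONDecoder().decode(Puzzle.self, from: blob) {
                return puzzle
            }
        }
        return nil
    }
}
```

Since `takeRandom()` returns a plain value, nothing on screen holds a reference to stored data, which is what makes deleting the stored copy safe.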

For uploading data, I implemented a simple queue to upload user rating changes. Since the app is read-heavy, I decided it wasn’t worth it to implement other POST requests. The app simply shows a message saying: ‘your rating will update when you are online.’

The key point here: you shouldn’t force every feature to use the same offline solution. By combining different techniques, you can provide a better user experience and reduce development time.

Conclusion

If nothing else, I hope you remember that there are many different solutions for making an app work offline. Don’t just start banging your head against a Core Data implementation immediately because it’s what you’re familiar with; think hard about the problem you’re trying to solve before selecting the best solution.

For many apps, a cache is a better choice, because it’s a simpler solution. It offers an easy method for eviction, it doesn’t require model migrations, and doesn’t crash all the time.

Regardless of your chosen technology, you also need to consider how you are going to upload changes and keep your app consistent as part of your offline strategy. And finally, you shouldn’t box yourself into just one technology. Often, a combination of different techniques will be the best solution.


Additional Tips

Where should you save data?

iOS actually provides a cache directory for storing cache data. However, this directory is cleared if the device is running low on space. This is a great thing if it’s actually a cache, but not if you’re relying on this data for basic functionality. You can also put files in the tmp folder for even shorter lived files.
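For reference, here’s how you can look up these locations with `FileManager`. The Application Support directory isn’t mentioned above; I’ve included it as one common place for data your app depends on, since the system doesn’t purge it the way it purges the caches directory.

```swift
import Foundation

let fileManager = FileManager.default

// Caches: the system may clear this directory when the device is low on space.
let cachesURL = fileManager.urls(for: .cachesDirectory, in: .userDomainMask).first

// Application Support: for data the app relies on. You may need to create
// the directory yourself before writing into it.
let supportURL = fileManager.urls(for: .applicationSupportDirectory, in: .userDomainMask).first

// tmp: for even shorter-lived scratch files.
let tmpURL = fileManager.temporaryDirectory
```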

Offline Detection on iOS Is Very Flaky

Apple offers a reachability API that notifies you when your connection to the internet changes in some way. However, this API is flaky and you shouldn’t rely on it for infrastructure logic. Internally, it never actually makes network requests. It just tells you when the hardware could connect to the internet. You can use this API to drive UI that you show to the user, but you should never avoid sending a request because of it. The only way to know if you’re actually able to reach your server is to try the request. Jared Sinclair recently wrote a good article on this.

Normalized Caches Don’t Use Less Data

As I mentioned earlier, normalized caches don’t necessarily use less data than denormalized caches. Though they don’t duplicate models, there is overhead to store data separately. Consider the following trivial example:

// Denormalized data
{ id: 'parent', child: { id: 'child' }}
// Normalized data
{ id: 'parent', child: 'child' }
{ id: 'child' }

Notice how in the second example, the id of the child model is duplicated, meaning that the total number of bytes used by the normalized data in this instance is actually higher. This seems small, but if you have a lot of small, nested models, it adds up quickly. When working on the LinkedIn app, we found that normalized data took up about 5–10% more space than denormalized data. Because of this, we decided to send denormalized data over the network, where size matters.

Of course, this is dependent on your specific dataset. There’s a tradeoff here, so measure before you decide!

The Fastest Cache is SQLite!?

Ironically, the fastest cache I’ve found that’s readily available for iOS development is actually SQLite. You can simply create a model with three fields: id, data, and timestamp. After creating an index on id, lookup and storage are faster than the NSFileManager solution used by many open source solutions. As far as I know, there’s no open source library that implements this API and eviction logic, but if you’re interested in creating one, let me know. I’d be interested in contributing.
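As a starting point, the three-field table could look something like this sketch, which uses the SQLite C API available on Apple platforms through `import SQLite3`. `SQLiteCache` and its method names are hypothetical, error handling is minimal, and the count-based eviction rule is just one possible policy. Declaring `id` as the primary key gives you the index on id automatically.

```swift
import Foundation
import SQLite3

final class SQLiteCache {
    private var db: OpaquePointer?
    // Swift spelling of C's SQLITE_TRANSIENT: tells SQLite to copy bound buffers.
    private let transient = unsafeBitCast(-1, to: sqlite3_destructor_type.self)

    init?(path: String) {
        guard sqlite3_open(path, &db) == SQLITE_OK else { return nil }
        // Three fields: id, data, timestamp. PRIMARY KEY creates the index on id.
        let schema = "CREATE TABLE IF NOT EXISTS cache (id TEXT PRIMARY KEY, data BLOB, timestamp REAL NOT NULL);"
        guard sqlite3_exec(db, schema, nil, nil, nil) == SQLITE_OK else { return nil }
    }

    deinit { sqlite3_close(db) }

    func set(_ data: Data, forID id: String) {
        var stmt: OpaquePointer?
        let sql = "INSERT OR REPLACE INTO cache (id, data, timestamp) VALUES (?, ?, ?);"
        guard sqlite3_prepare_v2(db, sql, -1, &stmt, nil) == SQLITE_OK else { return }
        defer { sqlite3_finalize(stmt) }
        sqlite3_bind_text(stmt, 1, id, -1, transient)
        data.withUnsafeBytes { bytes in
            _ = sqlite3_bind_blob(stmt, 2, bytes.baseAddress, Int32(bytes.count), transient)
        }
        sqlite3_bind_double(stmt, 3, Date().timeIntervalSince1970)
        sqlite3_step(stmt)
    }

    func data(forID id: String) -> Data? {
        var stmt: OpaquePointer?
        let sql = "SELECT data FROM cache WHERE id = ?;"
        guard sqlite3_prepare_v2(db, sql, -1, &stmt, nil) == SQLITE_OK else { return nil }
        defer { sqlite3_finalize(stmt) }
        sqlite3_bind_text(stmt, 1, id, -1, transient)
        guard sqlite3_step(stmt) == SQLITE_ROW else { return nil }
        // An empty blob comes back as a null pointer.
        guard let bytes = sqlite3_column_blob(stmt, 0) else { return Data() }
        return Data(bytes: bytes, count: Int(sqlite3_column_bytes(stmt, 0)))
    }

    /// Eviction: keep only the `maxEntries` most recently written rows.
    func trim(toCount maxEntries: Int) {
        let sql = "DELETE FROM cache WHERE id NOT IN (SELECT id FROM cache ORDER BY timestamp DESC LIMIT \(maxEntries));"
        sqlite3_exec(db, sql, nil, nil, nil)
    }
}
```

Whether this beats a file-based cache for your data is something to measure rather than assume, since it depends on blob sizes and access patterns.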