Introduction to Jetpack DataStore

Published in Android Developers, Jan 18, 2022

DataStore is a Jetpack data storage library that provides a safe and consistent way to store small amounts of data, such as preferences or application state. It’s based on Kotlin coroutines and Flow, which enable asynchronous data storage. It aims to replace SharedPreferences, as it is thread-safe and non-blocking. It provides two implementations: Proto DataStore, which stores typed objects (backed by protocol buffers), and Preferences DataStore, which stores key-value pairs. From here on, “DataStore” on its own refers to both implementations unless specified otherwise.

In this blog post, we will take a closer look at DataStore — how it works, what implementations it provides and their individual use cases. We’ll also look at what benefits and improvements it brings over SharedPreferences and why these make DataStore worth your while.

DataStore vs SharedPreferences

Most likely you’ve used SharedPreferences in your apps. It is also likely that you’ve experienced issues with SharedPreferences that are hard to reproduce: odd crashes in your analytics due to uncaught exceptions, a blocked UI thread when making calls, or inconsistent persisted data throughout your app. DataStore was built to address all of these issues.

Let’s take a look at a direct comparison between SharedPreferences and DataStore:

[Figure: Comparing DataStore implementations with SharedPreferences]

Async API

With most data storage APIs, you often need to be notified asynchronously when data has been modified. SharedPreferences offers some async support, but only for observing changed values via OnSharedPreferenceChangeListener, and that callback is still invoked on the main thread. Similarly, if you want to offload file saving to the background, you could use SharedPreferences apply(), but keep in mind that pending writes can still block the UI thread on fsync(), potentially causing jank and ANRs. This can happen any time a service starts or stops, or an activity pauses or stops. In comparison, DataStore provides a fully asynchronous API for retrieving and saving data, built on Kotlin coroutines and Flow, which reduces the risk of blocking your UI thread. If you are unfamiliar with Kotlin Flow, it is a stream of values that can be computed asynchronously.
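To make the Flow-based API more concrete, here is a minimal sketch that reads and writes a single boolean with Preferences DataStore. The store name "settings", the key "dark_mode", and the helper function names are assumptions made for this example. It also assumes the androidx.datastore:datastore-preferences dependency is available.

```kotlin
import android.content.Context
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.core.booleanPreferencesKey
import androidx.datastore.preferences.core.edit
import androidx.datastore.preferences.preferencesDataStore
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map

// Hypothetical store name; the delegate creates a single instance per process.
val Context.settingsDataStore: DataStore<Preferences> by preferencesDataStore(name = "settings")

// Hypothetical key for this example.
val DARK_MODE = booleanPreferencesKey("dark_mode")

// Reading: a Flow that emits the current value and every later change.
fun darkModeFlow(context: Context): Flow<Boolean> =
    context.settingsDataStore.data.map { prefs -> prefs[DARK_MODE] ?: false }

// Writing: a suspend function, so the caller decides which coroutine runs it.
suspend fun setDarkMode(context: Context, enabled: Boolean) {
    context.settingsDataStore.edit { prefs -> prefs[DARK_MODE] = enabled }
}
```

Neither function touches the disk on the calling thread: the read is collected from a Flow and the write suspends until it has been persisted.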

Synchronous work

SharedPreferences supports synchronous work out of the box. Its synchronous commit() for modifying persisted data may appear safe to call on the UI thread, but it performs disk I/O on the calling thread. This is a risky scenario that could, and often does, lead to ANRs and UI jank. To prevent this, DataStore does not offer ready-to-use synchronous support. DataStore saves the preferences in a file and, under the hood, performs all data operations on Dispatchers.IO unless specified otherwise, keeping your UI thread unblocked.

However, it is possible to combine DataStore and synchronous work with a bit of help from coroutine builders, as we’ll see later.
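As a rough preview of what that can look like, the sketch below wraps a DataStore read in runBlocking. The function name readPreferencesBlocking is made up for this example. Note that blocking this way on the UI thread brings back the same jank and ANR risk described above, so treat it as a last resort.

```kotlin
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.runBlocking

// Hypothetical helper: blocks the calling thread until the first value is read.
// Avoid calling this on the UI thread for anything but small, rare reads.
fun readPreferencesBlocking(dataStore: DataStore<Preferences>): Preferences =
    runBlocking { dataStore.data.first() }
```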

Error handling

SharedPreferences can throw parsing errors as runtime exceptions, leaving your app vulnerable to crashes. For example, ClassCastException is commonly thrown by the API when the wrong data type is requested. DataStore lets you catch exceptions raised while reading or writing data by relying on Flow’s error signalling mechanism.
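One possible way to use Flow’s error signalling when reading is sketched below. The extension name safeData is an assumption for this example, and falling back to empty preferences on an IOException is one policy among several you might choose.

```kotlin
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.core.emptyPreferences
import java.io.IOException
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.catch

// Hypothetical helper: recover from read failures instead of crashing.
fun DataStore<Preferences>.safeData(): Flow<Preferences> =
    data.catch { error ->
        if (error is IOException) {
            // The file could not be read: fall back to empty preferences.
            emit(emptyPreferences())
        } else {
            // Anything else is unexpected, so let it propagate.
            throw error
        }
    }
```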

Type safety

Saving and retrieving data as Map-style key-value pairs offers no type safety: nothing stops you from reading a value back as the wrong type. With Proto DataStore, however, you can predefine a schema for your data model and get the additional benefit of full type safety.
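The difference can be illustrated in plain Kotlin, without any Android APIs. In this sketch the map stands in for an untyped key-value store, and the UserSettings data class stands in for a schema-backed type. Both are assumptions made for illustration, not DataStore classes.

```kotlin
// Untyped storage: every value is Any?, so the compiler cannot check reads.
fun readFontSize(store: Map<String, Any?>): Int =
    store["font_size"] as Int // compiles, but fails at runtime if the value is not an Int

// Typed model (hypothetical): a wrong type is rejected at compile time.
data class UserSettings(val fontSize: Int = 12, val darkMode: Boolean = false)

fun main() {
    // The value was written as a String by mistake.
    val untyped: Map<String, Any?> = mapOf("font_size" to "14")
    val failure = runCatching { readFontSize(untyped) }.exceptionOrNull()
    println("Untyped read failed with: $failure")

    // UserSettings(fontSize = "14") would not compile.
    val typed = UserSettings(fontSize = 14)
    println("Typed read: ${typed.fontSize}")
}
```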

Data consistency

SharedPreferences’ lack of atomicity guarantees means that you cannot rely on your data modifications always being reflected everywhere. This can be dangerous, especially since the whole point of this API is persisted data storage. In comparison, DataStore’s fully transactional API provides strong ACID guarantees, as the data is updated in an atomic read-modify-write operation. It also provides “read after write” consistency, meaning that every update that has completed is reflected in subsequent reads.
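A read-modify-write update could look like the following sketch, which increments a counter inside a single edit transaction. The key name "launch_count" and the function name are assumptions for this example.

```kotlin
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.core.edit
import androidx.datastore.preferences.core.intPreferencesKey

// Hypothetical key for this example.
val LAUNCH_COUNT = intPreferencesKey("launch_count")

// The whole block runs as one transaction: the current value is read,
// modified, and written back without other updates interleaving.
suspend fun incrementLaunchCount(dataStore: DataStore<Preferences>) {
    dataStore.edit { prefs ->
        val current = prefs[LAUNCH_COUNT] ?: 0
        prefs[LAUNCH_COUNT] = current + 1
    }
}
```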

Migration support

SharedPreferences doesn’t have a built-in migration mechanism: it is up to you to do the tedious, error-prone remapping of values from your old storage to the new one and then clean up. All of this increases the chance of runtime exceptions, as you could easily run into data type mismatches. DataStore, however, provides an easy way to migrate data into it, and ships with an implementation for SharedPreferences-to-DataStore migration.
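As a minimal sketch of the provided migration, the snippet below creates a Preferences DataStore that pulls its initial values from an existing SharedPreferences file. The names "legacy_prefs" and "settings" are assumptions for this example. Use the name of your own SharedPreferences file.

```kotlin
import android.content.Context
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.SharedPreferencesMigration
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.preferencesDataStore

// Hypothetical names: "legacy_prefs" is the old SharedPreferences file,
// "settings" is the new DataStore file.
val Context.migratedDataStore: DataStore<Preferences> by preferencesDataStore(
    name = "settings",
    produceMigrations = { context ->
        // Runs before the first read or write of the DataStore.
        listOf(SharedPreferencesMigration(context, "legacy_prefs"))
    }
)
```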

Preferences vs Proto DataStore

Now that we’ve seen what benefits DataStore offers over SharedPreferences, let’s talk about how to make a choice between its two implementations — Preferences and Proto DataStore.

Preferences DataStore reads and writes data based on key-value pairs, without defining the schema upfront. While this might sound similar to SharedPreferences, keep in mind all the improvements mentioned above that DataStore brings. Don’t be fooled by their joint use of “Preferences” in naming — these have nothing in common and come from two completely separate APIs.

Proto DataStore stores typed objects, backed by Protocol Buffers, providing type safety and removing the need for keys. Protobufs are faster, smaller, simpler, and less ambiguous than XML and other similar data formats. If you haven’t used them before, fear not! These are very simple to learn. While Proto DataStore does require you to learn a new serialization mechanism, we believe that its advantages, especially type safety, are worth it.
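To give a feel for the extra setup, here is a sketch of the serializer that Proto DataStore needs. UserSettings is a hypothetical class, assumed to be generated by the protobuf plugin from a schema you define and to live in the same package as this code. The file name "user_settings.pb" is likewise an assumption.

```kotlin
import android.content.Context
import androidx.datastore.core.CorruptionException
import androidx.datastore.core.DataStore
import androidx.datastore.core.Serializer
import androidx.datastore.dataStore
import com.google.protobuf.InvalidProtocolBufferException
import java.io.InputStream
import java.io.OutputStream

// UserSettings is a hypothetical protobuf-generated class, not part of DataStore.
object UserSettingsSerializer : Serializer<UserSettings> {
    // Value returned when no file exists yet.
    override val defaultValue: UserSettings = UserSettings.getDefaultInstance()

    override suspend fun readFrom(input: InputStream): UserSettings =
        try {
            UserSettings.parseFrom(input)
        } catch (e: InvalidProtocolBufferException) {
            // Signal unreadable data to DataStore.
            throw CorruptionException("Cannot read proto.", e)
        }

    override suspend fun writeTo(t: UserSettings, output: OutputStream) = t.writeTo(output)
}

// Hypothetical file name for this example.
val Context.userSettingsStore: DataStore<UserSettings> by dataStore(
    fileName = "user_settings.pb",
    serializer = UserSettingsSerializer
)
```

With this in place, reads and writes use the generated type directly, so no string keys are involved.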

When choosing between the two, you should take into account the following:

If you’re working with key-value pairs for reading and writing data, want to migrate quickly from SharedPreferences with minimal changes while still taking advantage of DataStore’s improvements, and feel confident enough without type safety checks, you can go with Preferences DataStore.
If you’re willing to learn protocol buffers for the added benefit of improved readability, your data requires working with more complex types, like enums or lists, and you want full type safety support while doing so, you can try out Proto DataStore.

DataStore vs Room

You might ask — “Well, why not just use Room to store my data?”. And that’s a fair question! So, let’s see where Room fits in all this.

If you need to work with complex datasets larger than a few tens of kilobytes, you will likely need partial updates or referential integrity between different data tables. In that case, you should consider using Room.

However, if you’re working with smaller and simpler datasets, like preferences or app state, and therefore do not need partial updates or referential integrity, you should choose DataStore.

[Figure: How to choose between DataStore and Room]

To be continued

We’ve gone into more detail on DataStore — how it works, the changes and improvements it brings and how to decide between its two implementations. In the next two blog posts, we will further discuss Proto and Preferences DataStore — how to create, read, and write data, handle any errors, as well as how to migrate from SharedPreferences. Stay tuned!

You can find all posts from our Jetpack DataStore series here:
Introduction to Jetpack DataStore
All about Preferences DataStore
All about Proto DataStore
DataStore and dependency injection
DataStore and Kotlin serialization
DataStore and synchronous work
DataStore and data migration
DataStore and testing