Asynchronously loading data using Google’s Paging Library

The recently released Paging Library from Google gives you an easy way to page data into memory off of the main thread. If you want to use it with Room, then the built-in support makes it trivial. However, if you’d like to page data that exists elsewhere — from the network or disk, for example — then you have to do a little extra work.

Here I demonstrate how to take existing code that loads a list of screenshots from disk and convert it to load asynchronously using the Paging Library. This example could easily be adapted for network calls.

Throughout this post we’ll be using a Screenshot class defined as follows:
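A minimal sketch of such a class, assuming the Uri is kept as a plain String so the snippet runs off-device (on Android it would more likely be an android.net.Uri). The property names are illustrative.

```kotlin
// Illustrative stand-in: uri is a String here purely so this compiles
// without the Android SDK.
data class Screenshot(
    val uri: String,
    val width: Int,
    val height: Int
)
```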

You will also need to import the library:

Existing code

When the app launches, it shows the user a list of images from a directory they supply during setup. My existing code was a simple repository that loaded all screenshots (as a Uri and width/height values) from this directory via Android’s storage access framework. The code was similar to the following:
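A rough sketch of that eager shape, with hypothetical names: EagerScreenshotRepository and the readSize lambda stand in for the real repository and the storage access framework query, and the Uri is a String so the snippet runs off-device.

```kotlin
data class Screenshot(val uri: String, val width: Int, val height: Int)

// Hypothetical eager repository: one expensive read per file, all up front.
class EagerScreenshotRepository(
    private val uris: List<String>,
    // Stand-in for the disk read that returns (width, height) for a Uri.
    private val readSize: (String) -> Pair<Int, Int>
) {
    fun loadAll(): List<Screenshot> = uris.map { uri ->
        val (width, height) = readSize(uri)
        Screenshot(uri, width, height)
    }
}
```

The cost grows linearly: loadAll() calls readSize once for every Uri before anything can be shown.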

As you can see, we have to go to disk to get information about every screenshot. This is expensive! On my test device I’ve only got around 100 screenshots — on a device that’s been around longer a user could have many hundreds or thousands of screenshots in this folder. Rather than loading these all up front, I need to load these asynchronously. In addition, since these are in a list there’s a high likelihood that the items further down the list won’t even be needed. The Paging Library helps with these problems.

Paging Library Fundamentals

The Paging Library has 3 different “levels” that build on top of each other.

At the base, we have a kind of DataSource. This can be either “keyed” (such that you need to know about the item at index N-1 in order to know about the item at index N — like a linked list), or “tiled” (such that you can access elements at arbitrary indices — like an array list).
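To make the distinction concrete, here is a small sketch over plain Kotlin lists. Neither function is part of the Paging Library; loadTile and loadAfterKey are hypothetical names that only mirror the two access patterns.

```kotlin
// Tiled: a window is addressed directly by position, like an array list.
// Assumes start and start + count are within bounds.
fun <T> loadTile(items: List<T>, start: Int, count: Int): List<T> =
    items.subList(start, start + count)

// Keyed: the next page is located relative to the key of the last item
// already loaded (null means "start from the beginning").
fun loadAfterKey(sortedIds: List<Int>, lastKey: Int?, pageSize: Int): List<Int> =
    sortedIds.filter { lastKey == null || it > lastKey }.take(pageSize)
```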

Above that, we have a PagedList, which as its name suggests, is a list that pages its data in from a DataSource.

Finally we have the PagedListAdapter, which is a RecyclerView.Adapter that neatly wraps a PagedList, calling the correct notifyItem… methods for you as your data changes (when you call setList with a new PagedList) or loads in. This is fairly standard RecyclerView boilerplate. If you need some custom behaviour, you can duplicate its functionality — it’s just a handy wrapper around the PagedListAdapterHelper.

Constructing a Data Source

A DataSource is reasonably simple. For a list like this, we implement a TiledDataSource, doing the expensive disk IO in the loadRange method. The PagedList will call this from a background thread when it is time to load a new page of data.

Note that it does not clamp ranges for you: if you have 10 items and a page size of 6, your second page will be requested with startPosition = 6 and count = 6, that is, indices 6 to 11 when only indices 0 to 9 exist. Used unchecked, that range will give you an IndexOutOfBoundsException. Make sure to clamp your inputs as I do here.
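The clamping can be sketched in plain Kotlin as follows. ScreenshotSource is a stand-in that mirrors the countItems and loadRange methods discussed here but does not extend the real TiledDataSource, so it runs off-device; readSize is a hypothetical placeholder for the disk read.

```kotlin
data class Screenshot(val uri: String, val width: Int, val height: Int)

// Stand-in for a tiled source: same method names, no Android dependency.
class ScreenshotSource(
    private val uris: List<String>,
    // Hypothetical disk read returning (width, height) for a Uri.
    private val readSize: (String) -> Pair<Int, Int>
) {
    fun countItems(): Int = uris.size

    fun loadRange(startPosition: Int, count: Int): List<Screenshot> {
        // Clamp both ends so a request past the end yields a short (or empty) page.
        val from = startPosition.coerceIn(0, uris.size)
        val to = (startPosition + count).coerceIn(from, uris.size)
        return uris.subList(from, to).map { uri ->
            val (width, height) = readSize(uri)
            Screenshot(uri, width, height)
        }
    }
}
```

With 10 items, loadRange(6, 6) returns the last 4 items instead of throwing.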

Note here the DataSource<Int, Screenshot>: those types are <Key, Value>. Because we’re using a TiledDataSource, our key is already defined to be an Int. Later, when we’re creating a PagedList, the builder assumes that a TiledDataSource uses an Int key, which makes sense, as we can access any element in this source by index.

Building a Paged List

For my use-case, 8 elements on a ‘page’ was sufficient. I’m showing images in a 2-column list; most of the elements are around half the window height. You’ll need to decide what works well for you.

Note that here we use the default prefetch distance of pageSize — that is, as soon as the first item in a given page of data is requested, the next page will begin loading. Depending on your data, you may want this to be smaller or larger.

We also enable placeholders (this is actually enabled by default). Since we know exactly how many elements are going to be in our list (we have Uris for every screenshot, even if we don’t have any details about that screenshot yet) we can use null placeholders while the images load — helping avoid weird scrollbars. Our onBindViewHolder has to deal with these null values later. I’d recommend reading the PagedList docs — they go into more detail on placeholders.

You need to supply two Executors — one for posting back to the main thread, and another for background work. In this example we create a main thread Handler and post events to it directly, but our disk IO Executor is injected from elsewhere.
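One way to express an Executor that simply forwards work to a posting function, as a sketch. PostingExecutor and its post parameter are hypothetical names; the snippet uses only java.util.concurrent so it runs without the Android SDK.

```kotlin
import java.util.concurrent.Executor

// Wraps any "post this Runnable somewhere" function as an Executor.
class PostingExecutor(private val post: (Runnable) -> Unit) : Executor {
    override fun execute(command: Runnable) {
        post(command)
    }
}
```

On Android, post would typically be { handler.post(it) } with a Handler bound to the main Looper, so every command runs on the main thread.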

Finally, this code has one subtle gotcha — the first two pages will be loaded immediately on whatever thread the final build() is called from! From the docs:

Creating a PagedList loads data from the DataSource immediately, and should for this reason be done on a background thread. The constructed PagedList may then be passed to and used on the UI thread. This is done to prevent passing a list with no loaded content to the UI thread, which should generally not be presented to the user.

In this case, we’ll be going to disk 16 times (8 for the first page, and then another 8 as the second page is pre-fetched). I address this later when I wire everything together.
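The arithmetic behind that figure, as a tiny sketch. It assumes one disk read per item and takes the two-pages-up-front behaviour described above as given; initialReads is a hypothetical helper, not a library call.

```kotlin
// Reads performed during the initial load: one per item in the pages loaded
// up front, capped by the number of items that actually exist.
fun initialReads(pageSize: Int, totalItems: Int, pagesUpFront: Int = 2): Int =
    minOf(pageSize * pagesUpFront, totalItems)
```

For a page size of 8 and around 100 screenshots, initialReads(8, 100) gives 16.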

You can also specify an initial key to load around — here we don’t specify one, and hence default to 0 (i.e. the first item in the list). If your list starts with an offset you can start your loading around that location — this works for both TiledDataSource and KeyedDataSource.

Adding the Adapter

The simplest part of all. We just extend PagedListAdapter, supplying a simple DiffCallback for comparing Screenshot objects, and implement onBindViewHolder and onCreateViewHolder like normal.

Note that if you have custom logic (such as a custom BindingAdapter, not shown here), you need to be aware that the object returned from getItem can be null. These are the placeholders we enabled earlier, and they stay null while the data at that index loads. If your page sizes are appropriate, receiving a null object will be rare, but you must handle it.
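The two adapter concerns above, comparing items and tolerating null placeholders, can be sketched in plain Kotlin. ScreenshotDiff and labelFor are hypothetical names; the object mirrors the two questions a diff callback answers without extending the library class, and the Uri is a String so the snippet runs off-device.

```kotlin
data class Screenshot(val uri: String, val width: Int, val height: Int)

object ScreenshotDiff {
    // Do both objects refer to the same underlying file?
    fun areItemsTheSame(old: Screenshot, new: Screenshot): Boolean =
        old.uri == new.uri

    // Is the displayed data unchanged? Data-class equality compares every property.
    fun areContentsTheSame(old: Screenshot, new: Screenshot): Boolean =
        old == new
}

// What a bind step might compute: a placeholder label while the item is still null.
fun labelFor(item: Screenshot?): String =
    if (item == null) "Loading..." else "${item.width}x${item.height}"
```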

Wiring it all up

Now that we have all the pieces, we can put them together. Note that I supply an async callback for loading the screenshots: the ScreenshotLoader uses the disk executor we supply in the constructor to do the actual building (since building loads the first page or more, as mentioned above). We then set the result on our list once that load has finished.
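The shape of that hand-off, as a sketch under assumptions: the class name ScreenshotLoader comes from the text, but this loadAsync signature and the mainExecutor parameter are hypothetical, and build stands in for whatever constructs the PagedList.

```kotlin
import java.util.concurrent.Executor

class ScreenshotLoader(
    private val diskExecutor: Executor,
    private val mainExecutor: Executor
) {
    // Runs the expensive build on the disk executor, then delivers the
    // result to the callback on the main executor.
    fun <T> loadAsync(build: () -> T, onLoaded: (T) -> Unit) {
        diskExecutor.execute {
            val result = build()
            mainExecutor.execute { onLoaded(result) }
        }
    }
}
```

The callback is where the finished list would be handed to the adapter, so the initial load never runs on the main thread.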

Gotchas

  • You may have to specify a minHeight on your list items, or otherwise specify the height when a placeholder is used. If you don’t, the adapter will query for (and the list will try to load) your whole list of 0-height items.
  • Don’t forget that the initial load (the first two pages in this example) happens on whatever thread calls PagedList.Builder.build(), so do this off the main thread!
  • Experiment with page sizes and prefetch windows. This is entirely dependent on your data.
  • Already loaded objects are not unloaded. If the user scrolls to the bottom of the list, the first items stay in memory. This is potentially a problem if your objects are large, or your list is long. Here I let Picasso load (and unload) the actual memory-intensive bitmaps for me; storing a list of Uris is cheap. Hopefully unloading will be supported before the final release.
  • You must handle null objects returned from getItem if you enable placeholders, even if you never encounter them in your testing.
  • The loadRange call is not bounded to the size of the list; you need to do this yourself. It will happily handle results smaller than the requested count, however (i.e. when you’re at the end of the list).
  • If you’re using LiveData, look into LivePagedListProvider, as it will handle most of this wiring for you.
  • The library is still in alpha at the time of this writing; the APIs described here could still change before release.

Additional Resources

I initially learned about the Paging Library on the Fragmented Podcast where Florina Muntenescu talked about it with Donn and Kaushik. In addition to talking about the Paging Library, she touches on some other issues with regular Cursor-based queries. It’s well worth a listen. She also covered the library in her talk at GDD Europe.

If you’re looking for more detail on the Paging Library, or your use-case doesn’t quite fit my example here, then the official documentation is always a great place to start.

Finally, I’d recommend looking at the source code directly. I spent a lot of time diving into the source while figuring this out — it’s clear and easy to read, and you can look at it right inside Android Studio.


Originally published at speakman.net.nz on October 9, 2017.