Get Threading Right with DiffUtil

Use DiffUtil for rich animations in RecyclerView with a tiny amount of code

Now this is a story all about how
My dataset changed, turned upside down
And it might take a second
Just sit right there
I’ll tell you how to DiffUtil on a background thread and apply the results on the main thread correctly

Yeah. That fell apart on me. Seriously though, in my post about making your ViewHolders smarter, I advised that you always use DiffUtil to update your RecyclerView rather than calling notifyDataSetChanged(). For the sake of simplicity, I processed the diff on the main thread. That worked fine because my list had two items in it. But what about 100 items? What about 1,000 items? What about 100,000 items? What if processing the diff takes so long that I drop frames? What if it takes so long that I ANR? How can I keep these bad things from affecting my users?

The answer is to move the diff calculation to a background thread. However, there are some serious gotchas.

The DiffUtil gotchas are:

  • When you change your backing data, you must notify your adapter immediately to maintain data consistency.
  • You must handle concurrent changes to your data.

How to maintain data consistency

The first item took me an embarrassingly long time to understand. My instinct was to update the backing data for my adapter, then process the diff on a background thread, then notify the adapter back on the main thread. That. Does. Not. Work. What ends up happening is that the user can very easily scroll to a point where the adapter thinks data exists because it hasn’t been notified of the change. So it asks the backing data collection for the 10th item, but the collection only has five items. So you end up crashing with an IndexOutOfBoundsException. Not what you wanted at all.
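
You can see the same failure without Android at all. This is only a plain-Java stand-in, not RecyclerView code: cachedCount is a made-up name for the item count the adapter last reported, and the list plays the role of the backing data.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleCountDemo {
    // Returns true if reading with the stale item count throws.
    static boolean crashesWithStaleCount() {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add("item " + i);
        }
        // Hypothetical stand-in for what the adapter last told the list: 10 items.
        int cachedCount = items.size();

        // The backing data shrinks to five items before anyone is notified.
        List<String> newItems = new ArrayList<>(items.subList(0, 5));
        items.clear();
        items.addAll(newItems);

        try {
            // The list still believes a 10th item exists and asks for it.
            items.get(cachedCount - 1);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("crashes: " + crashesWithStaleCount());
    }
}
```

The gap between swapping the data and telling anyone about it is the whole problem. The longer that gap, the more likely a scroll lands inside it.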

The important thing to understand is that you must do things in this order:

  1. Process the diff on a background thread
  2. Return to the main thread
  3. Update the backing data
  4. Notify the adapter of the changes.
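
Here is a rough model of those four steps in plain Java, under a few assumptions. The two single-thread executors are stand-ins I made up for the worker thread and the main thread (neither is an Android API), and working out which old items disappeared stands in for the real diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UpdateOrderSketch {
    // Runs one update and returns a log of what happened on the "main" thread.
    static List<String> run(List<String> current, List<String> incoming) {
        ExecutorService background = Executors.newSingleThreadExecutor();
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        List<String> items = new ArrayList<>(current); // backing data
        List<String> log = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        List<String> oldSnapshot = new ArrayList<>(items);

        background.execute(() -> {
            // 1. Process the "diff" off the main thread, against a snapshot.
            List<String> removed = new ArrayList<>(oldSnapshot);
            removed.removeAll(incoming);
            // 2. Return to the "main" thread.
            mainThread.execute(() -> {
                // 3. Update the backing data...
                items.clear();
                items.addAll(incoming);
                log.add("data swapped, size=" + items.size());
                // 4. ...and notify in the same task, so nothing reads in between.
                log.add("notified, removed=" + removed);
                done.countDown();
            });
        });

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            background.shutdown();
            mainThread.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c"), Arrays.asList("a", "c")));
    }
}
```

The point to notice is that steps 3 and 4 live in one task on one thread. The backing data is never touched while the background work is running.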

How to handle concurrent changes

Handling concurrent changes is less straightforward and depends on how much each update matters to the user. In any case, you can’t run concurrent updates on the adapter. You must choose between three options:

  1. Process the current update and ignore all others.
  2. Stack the updates, and apply only the latest once the current one is finished.
  3. Queue the updates, and apply them in the order received.

So let’s get into the code

Let’s take the DiffUtil from the DumbViewHolder example, and improve it. First, let’s see what happens when we make the diff take a long time to run. We’ll take a really simple approach to that and just introduce a Thread.sleep(random.nextInt(3000)) in one of the methods. That will make the diff get stuck for some random amount of time up to three seconds.

Alter a method of DiffUtil to be long running

We’ll create a DiffUtil.Callback subclass for our use called DiffCb, pick one of the methods arbitrarily, and make it slow. So now every time we use this DiffUtil callback, it will take up to three seconds to finish.

class DiffCb extends DiffUtil.Callback {
    ...
    @Override
    public int getNewListSize() {
        // Simulate a really long running diff calculation.
        try {
            Thread.sleep(new Random().nextInt(3000));
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return newItems.size();
    }
    ...
}

So let’s use this new callback in an adapter

public class MainThreadAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();

    public void updateItems(List<Item> newItems) {
        List<Item> oldItems = new ArrayList<>(items);
        DiffUtil.DiffResult diffResult =
                DiffUtil.calculateDiff(new DiffCb(oldItems, newItems));
        items.clear();
        items.addAll(newItems);
        diffResult.dispatchUpdatesTo(this);
    }
    ...
}

The results are not pretty. The main thread is blocked. The user cannot interact with the app at all until the diff processing is finished. The user will get impatient and stop using your app. So we have to use a background thread, but here is where data consistency gets tricky.

Keeping our data consistent

On our quest to avoid blocking the main thread, let’s introduce three new methods to our adapter: updateItemsInternal, applyDiffResult, and dispatchUpdates.

In the following class, we use the updateItemsInternal method to kick off a background thread where we can run the diff and get a DiffResult object back. Then we use a Handler to call applyDiffResult back on the main thread. In applyDiffResult we call through to dispatchUpdates. It might seem a little contrived in this example because these methods just call straight through. But in the following examples the abstraction becomes important.

public class ConcurrencyFailAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();

    // The Fragment or Activity will call this method
    // when new data becomes available
    public void updateItems(final List<Item> newItems) {
        updateItemsInternal(newItems);
    }

    // This method does the heavy lifting of
    // pushing the work to the background thread
    void updateItemsInternal(final List<Item> newItems) {
        final List<Item> oldItems = new ArrayList<>(this.items);
        // Bind the Handler to the main looper explicitly so the result
        // is always posted back to the main thread
        final Handler handler = new Handler(Looper.getMainLooper());
        new Thread(new Runnable() {
            @Override
            public void run() {
                final DiffUtil.DiffResult diffResult =
                        DiffUtil.calculateDiff(new DiffCb(oldItems, newItems));
                handler.post(new Runnable() {
                    @Override
                    public void run() {
                        applyDiffResult(newItems, diffResult);
                    }
                });
            }
        }).start();
    }

    // This method is called when the background work is done
    protected void applyDiffResult(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        dispatchUpdates(newItems, diffResult);
    }

    // This method does the work of actually updating
    // the backing data and notifying the adapter
    protected void dispatchUpdates(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        // Update the backing data first, then notify the adapter
        items.clear();
        items.addAll(newItems);
        diffResult.dispatchUpdatesTo(this);
    }
    ...
}

Now we process the diff on the background thread, and we both update our backing data and notify our adapter on the main thread after everything is done. Our list animates nicely once the diff is finished. We have no crashes. BUT! We have another problem now. What if the list changes again while we’re busy processing the current diff? Yeah… That’s going to be a problem.

Handling concurrent updates

For the next three demonstrations, the important work is done in two methods: updateItems and applyDiffResult. updateItems takes a List of new items to use. It gives us a chance to do any prep work, then it kicks off the background thread by calling updateItemsInternal. When the background thread completes, it calls applyDiffResult, passing the new items and the DiffResult to apply. applyDiffResult lets us do some work, then it dispatches to the adapter by calling dispatchUpdates. dispatchUpdates and updateItemsInternal are identical to the ConcurrencyFailAdapter’s implementation in every example. So for the sake of brevity, I’ll omit them, like so:

public class ConcurrencyFailAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();

    public void updateItems(final List<Item> newItems) {
        updateItemsInternal(newItems);
    }
    ...
    protected void applyDiffResult(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        dispatchUpdates(newItems, diffResult);
    }
    ...
}

Let’s start with the simplest approach: I’m busy, leave me alone

We’ll keep track of a boolean variable that lets us know that we’re currently processing an update. If that variable is true, then we’ll just ignore all other updates. This prevents crashes and data inconsistencies, but it probably doesn’t represent what the user actually needs.

public class FirstWinsAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();
    boolean operationPending;

    public void updateItems(final List<Item> newItems) {
        if (operationPending) {
            return;
        }
        operationPending = true;
        updateItemsInternal(newItems);
    }
    ...
    protected void applyDiffResult(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        dispatchUpdates(newItems, diffResult);
        operationPending = false;
    }
    ...
}

Next approach: Latest update wins

Now this one is probably more useful if the data is entirely server driven. This will finish processing the current update, storing up pending updates that come in. Then once it finishes the current one, it discards all but the most recent update, and processes that one.

public class LatestWinsAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();
    private Deque<List<Item>> pendingUpdates =
            new ArrayDeque<>();

    public void updateItems(final List<Item> newItems) {
        pendingUpdates.push(newItems);
        if (pendingUpdates.size() > 1) {
            return;
        }
        updateItemsInternal(newItems);
    }
    ...
    protected void applyDiffResult(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        pendingUpdates.remove(newItems);
        dispatchUpdates(newItems, diffResult);
        if (pendingUpdates.size() > 0) {
            List<Item> latest = pendingUpdates.pop();
            pendingUpdates.clear();
            // Keep the in-flight update in the deque so that updateItems
            // still sees a pending operation and does not start a second diff
            pendingUpdates.push(latest);
            updateItemsInternal(latest);
        }
    }
    ...
}

Final approach: Queue them up, apply them in order

This one would work well for pagination, or any processing where each update matters, and none should be discarded.

public class QueueAdapter extends RecyclerView.Adapter {
    protected List<Item> items = new ArrayList<>();
    private Queue<List<Item>> pendingUpdates =
            new ArrayDeque<>();

    public void updateItems(final List<Item> newItems) {
        pendingUpdates.add(newItems);
        if (pendingUpdates.size() > 1) {
            return;
        }
        updateItemsInternal(newItems);
    }
    ...
    protected void applyDiffResult(List<Item> newItems,
                                   DiffUtil.DiffResult diffResult) {
        pendingUpdates.remove();
        dispatchUpdates(newItems, diffResult);
        if (pendingUpdates.size() > 0) {
            updateItemsInternal(pendingUpdates.peek());
        }
    }
    ...
}
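
If you want to compare the three policies side by side without a device, a small plain-Java model might help. This is only a sketch of the policies as described, not adapter code: UpdatePolicySketch, submit, and finishCurrent are names I made up, with submit playing the role of updateItems and finishCurrent playing the role of applyDiffResult. Updates A, B, and C all arrive before the first diff finishes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class UpdatePolicySketch {
    enum Policy { FIRST_WINS, LATEST_WINS, QUEUE }

    private final Policy policy;
    // Head of the deque is the update currently being diffed.
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> applied = new ArrayList<>();

    UpdatePolicySketch(Policy policy) {
        this.policy = policy;
    }

    // New data arrives.
    void submit(String update) {
        if (policy == Policy.FIRST_WINS && !pending.isEmpty()) {
            return; // busy: ignore the newcomer
        }
        pending.addLast(update);
    }

    // The in-flight diff has finished and is applied.
    void finishCurrent() {
        if (pending.isEmpty()) {
            return;
        }
        applied.add(pending.removeFirst());
        if (policy == Policy.LATEST_WINS && pending.size() > 1) {
            // Discard everything except the most recent update.
            String latest = pending.removeLast();
            pending.clear();
            pending.addLast(latest);
        }
    }

    static List<String> simulate(Policy policy) {
        UpdatePolicySketch sketch = new UpdatePolicySketch(policy);
        sketch.submit("A"); // starts right away
        sketch.submit("B"); // arrives while A is in flight
        sketch.submit("C"); // arrives while A is in flight
        while (!sketch.pending.isEmpty()) {
            sketch.finishCurrent();
        }
        return sketch.applied;
    }

    public static void main(String[] args) {
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " -> " + simulate(policy));
        }
    }
}
```

With those three arrivals, first-wins applies only A, latest-wins applies A then C, and the queue applies A, B, and C in order.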

Now from here, I’m sure we could devise other ideas. We could also use RxJava for thread handling. We could even let Rx do the queueing for us. What other improvements can you think of?