Practical repositories in Android development

Make yourself agnostic: don’t know, don’t care about the data source.

What is a repository?

A repository is an abstraction over a data source. In other words, a class can depend on a repository and stop worrying about whether the data comes from the network, from a cache, or from a database. The class then becomes data source “agnostic” — it neither knows nor cares about the source of the data.

This leads to less coupling as the class doesn’t need to change if the source of the data changes. We then get a better application of the open/closed principle from SOLID — the class is closed for modification with respect to the data source.

Why have a repository?

That’s all very well, but how does it help me? Well, for a simple app it could easily be overkill, and aggressively applying “you ain’t gonna need it” (YAGNI) will let you go without repositories.

However, for a large app, it’s one way to manage the complexity of having multiple data sources. And speaking of large and complex Android apps, the Trade Me Android App has a long history going back to the days of Retrofit 1 and even the cringeworthy AsyncTask. Of course, some progress was made, and in recent iterations the API calls have been direct calls to Retrofit services. But we’ve suffered from not having repositories. Take the following sample code in a ViewModel:

Unfortunately, the return type of the legacy retrieveActivityFeedRx is Observable<Response<ActivityFeed>>. Not only are the nested type parameters ugly, they’re a pain to mock in a test since we have to wrap a payload inside a successful Retrofit Response:
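To make the problem concrete, here is a minimal sketch in plain Java of a ViewModel that talks to the data source directly, together with the test ceremony it forces on us. It is an illustration under assumptions, not the app's actual code: HttpResponse is a hand-written stand-in for Retrofit's Response (only isSuccessful() and body() are modelled), CompletableFuture stands in for Observable so the sketch needs nothing beyond the JDK, and LegacyActivityFeedService, LegacyActivityFeedViewModel and the fields of ActivityFeed are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

// Stand-in for Retrofit's Response<T>; only isSuccessful() and body() are modelled.
final class HttpResponse<T> {
    private final int code;
    private final T body; // may be null even when the call succeeded
    HttpResponse(int code, T body) { this.code = code; this.body = body; }
    boolean isSuccessful() { return code >= 200 && code < 300; }
    T body() { return body; }
}

// Stand-in for the legacy Retrofit service (CompletableFuture replaces Observable).
interface LegacyActivityFeedService {
    CompletableFuture<HttpResponse<ActivityFeed>> retrieveActivityFeedRx();
}

// The ViewModel is forced to understand HTTP success, failure and null bodies.
final class LegacyActivityFeedViewModel {
    private final LegacyActivityFeedService service;
    String state = "loading";

    LegacyActivityFeedViewModel(LegacyActivityFeedService service) { this.service = service; }

    CompletableFuture<Void> load() {
        return service.retrieveActivityFeedRx().thenAccept(response -> {
            if (!response.isSuccessful()) {
                state = "error";
            } else if (response.body() == null) {
                state = "empty";
            } else {
                state = "loaded " + response.body().items.size() + " items";
            }
        });
    }
}

final class LegacyDemo {
    public static void main(String[] args) {
        // Test ceremony: the payload has to be wrapped in a successful response.
        ActivityFeed feed = new ActivityFeed(Arrays.asList("bid", "watchlist"));
        LegacyActivityFeedService fake = () -> CompletableFuture.completedFuture(
                new HttpResponse<ActivityFeed>(200, feed));
        LegacyActivityFeedViewModel viewModel = new LegacyActivityFeedViewModel(fake);
        viewModel.load().join();
        System.out.println(viewModel.state); // loaded 2 items
    }
}
```

Even in this small sketch, three of the ViewModel's branches exist purely because of the transport, and the fake in the test cannot return a feed without also inventing a status code.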

Note also that the direct dependency on the data source has polluted the ViewModel with logic for dealing with data source artefacts. For instance, the ViewModel has to know about successful and unsuccessful HTTP responses. It also has to know how to extract the body from a successful response. This is further complicated by the fact that the body can be null.

Moreover, the ViewModel now references a class that we do not own, a Retrofit Response. If we had to change from Retrofit to Firebase, or from Retrofit to an as yet unknown system, we’d need to make big changes to this class.

And since each consumer of the activity feed in the app is independently making direct calls to the data source, there is no easy way to address an app-wide concern for that data source like caching or filtering.

These criticisms would still apply if we were using a presenter instead of a ViewModel.

The repository pattern to the rescue…

The repository pattern

Applying the pattern to the above, we get a repository like this:

The data source is behind an interface:

And the default implementation looks like this:
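The three pieces just described (the repository, the data source interface, and the default implementation) can be sketched together in plain Java. This is a sketch under assumptions rather than the app's real code: HttpResponse again stands in for Retrofit's Response, CompletableFuture stands in for an RxJava Single, and the names ActivityFeedDataSource, DefaultActivityFeedDataSource, ActivityFeedRepository and LegacyActivityFeedService are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

// Stand-in for Retrofit's Response<T>.
final class HttpResponse<T> {
    private final int code;
    private final T body;
    HttpResponse(int code, T body) { this.code = code; this.body = body; }
    boolean isSuccessful() { return code >= 200 && code < 300; }
    T body() { return body; }
}

// Stand-in for the existing Retrofit service.
interface LegacyActivityFeedService {
    CompletableFuture<HttpResponse<ActivityFeed>> retrieveActivityFeedRx();
}

// The data source interface speaks only in domain types.
interface ActivityFeedDataSource {
    CompletableFuture<ActivityFeed> retrieveActivityFeed();
}

// Default implementation: HTTP details are dealt with here and go no further.
final class DefaultActivityFeedDataSource implements ActivityFeedDataSource {
    private final LegacyActivityFeedService service;

    DefaultActivityFeedDataSource(LegacyActivityFeedService service) { this.service = service; }

    @Override
    public CompletableFuture<ActivityFeed> retrieveActivityFeed() {
        return service.retrieveActivityFeedRx().thenApply(response -> {
            if (!response.isSuccessful() || response.body() == null) {
                // Surfaces to the caller as a failed future.
                throw new IllegalStateException("Activity feed unavailable");
            }
            return response.body();
        });
    }
}

// The repository is the only type that consumers depend on.
final class ActivityFeedRepository {
    private final ActivityFeedDataSource dataSource;

    ActivityFeedRepository(ActivityFeedDataSource dataSource) { this.dataSource = dataSource; }

    CompletableFuture<ActivityFeed> retrieveActivityFeed() {
        return dataSource.retrieveActivityFeed();
    }
}

final class RepositoryDemo {
    public static void main(String[] args) {
        LegacyActivityFeedService service = () -> CompletableFuture.completedFuture(
                new HttpResponse<ActivityFeed>(200, new ActivityFeed(Arrays.asList("bid", "watchlist"))));
        ActivityFeedRepository repository =
                new ActivityFeedRepository(new DefaultActivityFeedDataSource(service));
        System.out.println(repository.retrieveActivityFeed().join().items); // [bid, watchlist]
    }
}
```

The response wrapper now appears in exactly one class, so swapping the transport would only touch DefaultActivityFeedDataSource.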

Now we make our ViewModel depend on the repository instead of the raw Retrofit data source:

Notice how it’s now easier to write a test for the ViewModel without all of the ceremony of newing up a Retrofit response:
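A rough sketch of the repository-backed ViewModel and its test, under the same assumptions as before: plain Java, CompletableFuture standing in for an RxJava Single, and hypothetical names (ActivityFeedViewModel, ActivityFeedRepository, ActivityFeedDataSource).

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

interface ActivityFeedDataSource {
    CompletableFuture<ActivityFeed> retrieveActivityFeed();
}

final class ActivityFeedRepository {
    private final ActivityFeedDataSource dataSource;

    ActivityFeedRepository(ActivityFeedDataSource dataSource) { this.dataSource = dataSource; }

    CompletableFuture<ActivityFeed> retrieveActivityFeed() {
        return dataSource.retrieveActivityFeed();
    }
}

// The ViewModel only distinguishes "got a feed" from "something went wrong".
final class ActivityFeedViewModel {
    private final ActivityFeedRepository repository;
    String state = "loading";

    ActivityFeedViewModel(ActivityFeedRepository repository) { this.repository = repository; }

    CompletableFuture<Void> load() {
        return repository.retrieveActivityFeed().<Void>handle((feed, error) -> {
            state = (error != null) ? "error" : "loaded " + feed.items.size() + " items";
            return null;
        });
    }
}

final class ViewModelTestDemo {
    public static void main(String[] args) {
        // The fake returns the payload directly: no response wrapper, no status code.
        ActivityFeed feed = new ActivityFeed(Arrays.asList("bid", "watchlist"));
        ActivityFeedRepository repository =
                new ActivityFeedRepository(() -> CompletableFuture.completedFuture(feed));
        ActivityFeedViewModel viewModel = new ActivityFeedViewModel(repository);
        viewModel.load().join();
        System.out.println(viewModel.state); // loaded 2 items
    }
}
```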

Since we now have a well-defined interface for the data source, we can use the decorator pattern to add extra functionality. A simple example would be to add caching to the data source:
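One way such a caching decorator might look, as a deliberately naive sketch: CachingActivityFeedDataSource is a hypothetical name, the cache never expires, and two callers racing on an empty cache may both reach the delegate. CompletableFuture again stands in for an RxJava Single.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

interface ActivityFeedDataSource {
    CompletableFuture<ActivityFeed> retrieveActivityFeed();
}

// Decorator: same interface as the thing it wraps, plus an in-memory cache.
final class CachingActivityFeedDataSource implements ActivityFeedDataSource {
    private final ActivityFeedDataSource delegate;
    private final AtomicReference<ActivityFeed> cache = new AtomicReference<ActivityFeed>();

    CachingActivityFeedDataSource(ActivityFeedDataSource delegate) { this.delegate = delegate; }

    @Override
    public CompletableFuture<ActivityFeed> retrieveActivityFeed() {
        ActivityFeed cached = cache.get();
        if (cached != null) {
            return CompletableFuture.completedFuture(cached);
        }
        // Cache miss: ask the wrapped data source and remember a successful answer.
        return delegate.retrieveActivityFeed().thenApply(feed -> {
            cache.set(feed);
            return feed;
        });
    }
}

final class CachingDemo {
    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        ActivityFeedDataSource remote = () -> {
            remoteCalls.incrementAndGet();
            return CompletableFuture.completedFuture(new ActivityFeed(Arrays.asList("bid")));
        };
        ActivityFeedDataSource cached = new CachingActivityFeedDataSource(remote);
        cached.retrieveActivityFeed().join();
        cached.retrieveActivityFeed().join();
        System.out.println(remoteCalls.get()); // 1
    }
}
```

Because the decorator implements the same interface as the data source it wraps, neither the repository nor its consumers need to change when it is introduced.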

You can see how this pattern can be extended to mix local and remote data, or to filter undesirable results from a hostile API. Because we are using composition, this kind of code reuse doesn’t suffer from the fragile base class syndrome that often results from reuse by inheritance.

UML diagram for a simple decorated repository

How to write a repository

Now that we know what a repository is and why we may need one, let’s make sure we don’t repeat the mistakes of the past when we write them. Here are some basic hints for designing your own repositories:

Don’t leak the source of the data in the return type of the methods of the repository

This method leaks the source of the data (a Retrofit Response) in the return type. Consumers should neither know nor care about the source of the data.
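For contrast, a leaky signature and an agnostic one might look like the following sketch. The interface names are hypothetical, HttpResponse is a hand-written stand-in for Retrofit's Response, and CompletableFuture stands in for an RxJava Single.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

// Stand-in for Retrofit's Response<T>.
final class HttpResponse<T> {
    private final int code;
    private final T body;
    HttpResponse(int code, T body) { this.code = code; this.body = body; }
    boolean isSuccessful() { return code >= 200 && code < 300; }
    T body() { return body; }
}

// Leaky: every consumer can see that the data arrives over HTTP.
interface LeakyActivityFeedRepository {
    CompletableFuture<HttpResponse<ActivityFeed>> retrieveActivityFeed();
}

// Agnostic: the return type mentions only the domain entity.
interface ActivityFeedRepository {
    CompletableFuture<ActivityFeed> retrieveActivityFeed();
}
```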

Don’t use callbacks

Adding a callback as a parameter is a step backwards: as soon as one call depends on the result of another, the callbacks have to be nested, and nested callbacks quickly degenerate into callback hell.
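A small sketch of how the nesting creeps in, using three hypothetical callback-based repositories (all names here are invented for illustration). Error handling is left out; in practice each level would need its own failure callback, which makes the pyramid deeper still.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical callback-style repositories.
interface CallbackMemberRepository {
    void retrieveMemberId(Consumer<String> onResult);
}

interface CallbackActivityFeedRepository {
    void retrieveListingIds(String memberId, Consumer<List<String>> onResult);
}

interface CallbackListingRepository {
    void retrieveTitle(String listingId, Consumer<String> onResult);
}

final class CallbackConsumer {
    // Each dependent call adds another level of indentation.
    static void loadFirstTitle(CallbackMemberRepository members,
                               CallbackActivityFeedRepository feeds,
                               CallbackListingRepository listings,
                               Consumer<String> onResult) {
        members.retrieveMemberId(memberId ->
                feeds.retrieveListingIds(memberId, listingIds ->
                        listings.retrieveTitle(listingIds.get(0), title ->
                                onResult.accept(title))));
    }
}

final class CallbackDemo {
    public static void main(String[] args) {
        CallbackConsumer.loadFirstTitle(
                onResult -> onResult.accept("m42"),
                (memberId, onResult) -> onResult.accept(
                        Arrays.asList(memberId + "-listing-1", memberId + "-listing-2")),
                (listingId, onResult) -> onResult.accept("Title of " + listingId),
                title -> System.out.println(title)); // Title of m42-listing-1
    }
}
```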

Return concurrency abstractions

Instead of callbacks, have the repositories return concurrency abstractions like Future:
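As a minimal sketch, here is a repository method that returns the JDK's CompletableFuture, with a consumer that transforms the result and supplies a fallback without any nesting. The repository and consumer names are hypothetical, and an RxJava Single would play the same role in an Rx codebase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Domain entity (its fields are hypothetical).
final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

// The result is returned to the caller rather than pushed into a callback parameter.
interface ActivityFeedRepository {
    CompletableFuture<ActivityFeed> retrieveActivityFeed();
}

final class FutureConsumer {
    // Transformation and error handling read as a flat chain.
    static CompletableFuture<Integer> countItems(ActivityFeedRepository repository) {
        return repository.retrieveActivityFeed()
                .thenApply(feed -> feed.items.size())
                .exceptionally(error -> 0);
    }
}

final class FutureDemo {
    public static void main(String[] args) {
        ActivityFeedRepository repository = () -> CompletableFuture.completedFuture(
                new ActivityFeed(Arrays.asList("bid", "watchlist", "question")));
        System.out.println(FutureConsumer.countItems(repository).join()); // 3
    }
}
```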

Even if your data source makes it possible to return the data immediately, e.g. if you are storing it in SharedPreferences, the data may still come from somewhere else in the future. In this case, it’s better to err on the side of caution and use a concurrency abstraction in the return types:
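A sketch of what this buys us, with a plain Map standing in for SharedPreferences so that the example runs on a bare JDK. ThemeRepository and both implementations are hypothetical names, and the "remote" version only simulates a network call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical repository: the return type is asynchronous from day one.
interface ThemeRepository {
    CompletableFuture<String> retrieveTheme();
}

// Today: the value is available immediately, so it is wrapped in a completed future.
final class LocalThemeRepository implements ThemeRepository {
    private final Map<String, String> preferences; // stand-in for SharedPreferences

    LocalThemeRepository(Map<String, String> preferences) { this.preferences = preferences; }

    @Override
    public CompletableFuture<String> retrieveTheme() {
        return CompletableFuture.completedFuture(preferences.getOrDefault("theme", "light"));
    }
}

// Later: the value comes from somewhere slower, and no consumer has to change.
final class RemoteThemeRepository implements ThemeRepository {
    @Override
    public CompletableFuture<String> retrieveTheme() {
        return CompletableFuture.supplyAsync(() -> "dark"); // simulated network call
    }
}

final class ThemeDemo {
    public static void main(String[] args) {
        Map<String, String> preferences = new HashMap<String, String>();
        preferences.put("theme", "sepia");
        ThemeRepository repository = new LocalThemeRepository(preferences);
        System.out.println(repository.retrieveTheme().join()); // sepia
    }
}
```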

Use the most appropriate concurrency abstraction

Most API calls are going to return a single payload, so there’s no point in using Observable unless you have multiple emissions from a data source. Single is the better choice here.

Don’t do processing into a model for the view inside a repository

ViewModels do not manipulate a view like a Fragment directly. Instead, they manipulate a model that the view reacts to. So part of the work for the ViewModel is to take domain entities that might be returned from the API and process them into a model.

Given that ViewModels often become overwhelmingly large, one might be tempted to offload some of the work to the repository and have it expose the model directly. However, if we do this we end up violating separation of concerns, since repositories belong to the domain layer while objects surfaced to the view should be part of the presentation layer.

This advice also applies to repositories used by presenters. Presenters normally don’t surface raw domain entities to a view, but instead convert them into something lightweight first. This is a responsibility of the presenter and shouldn’t be delegated to the repository.

Don’t bundle two entities together

You might have to make two disparate API calls to get all of the data for your ViewModel or presenter. In this scenario, it’s better to use flatMap or a similar functional combinator to chain the results of the two calls together in the consumer. If instead you offload this work to the repository, the component becomes less flexible since in the future you may have to change the order of the calls or even add a new entity.
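A sketch of chaining two repositories in the consumer. CompletableFuture's thenCompose plays the role that flatMap plays on an RxJava Single; the entities, repositories and DashboardViewModel are hypothetical names chosen for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Two separate domain entities (their fields are hypothetical).
final class Member {
    final String id;
    final String name;
    Member(String id, String name) { this.id = id; this.name = name; }
}

final class ActivityFeed {
    final List<String> items;
    ActivityFeed(List<String> items) { this.items = items; }
}

// Each repository returns exactly one kind of entity.
interface MemberRepository {
    CompletableFuture<Member> retrieveMember();
}

interface ActivityFeedRepository {
    CompletableFuture<ActivityFeed> retrieveActivityFeed(String memberId);
}

// The consumer decides the order of the calls and how the results are combined.
final class DashboardViewModel {
    private final MemberRepository members;
    private final ActivityFeedRepository feeds;

    DashboardViewModel(MemberRepository members, ActivityFeedRepository feeds) {
        this.members = members;
        this.feeds = feeds;
    }

    CompletableFuture<String> loadHeadline() {
        return members.retrieveMember().thenCompose(member ->
                feeds.retrieveActivityFeed(member.id).thenApply(feed ->
                        member.name + " has " + feed.items.size() + " new items"));
    }
}

final class DashboardDemo {
    public static void main(String[] args) {
        MemberRepository members =
                () -> CompletableFuture.completedFuture(new Member("m42", "Ana"));
        ActivityFeedRepository feeds = memberId ->
                CompletableFuture.completedFuture(new ActivityFeed(Arrays.asList("bid", "watchlist")));
        System.out.println(new DashboardViewModel(members, feeds).loadHeadline().join());
        // Ana has 2 new items
    }
}
```

If the order of the calls changes, or a third entity is needed, only loadHeadline changes and both repositories stay as they are.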

Conclusion

We’ve now covered the what, why, and how of repositories. Someone very smart once said “I for one must be content to remain an agnostic.” For myself, I’ll be content when our presenters and ViewModels become agnostic through proper use of the repository pattern!

About the author

David Rawson is an Android developer at Trade Me. You can find him on Stack Overflow here.