Android, MVVM, and Repositories in the Real World

Mike DeMaso
DraftKings Engineering
11 min read · Nov 2, 2020

How do you achieve a responsive user interface while keeping your code clean, precise, readable, and maintainable? This is a question that has been bouncing around the brains of Android developers since 2008.

After many years of letting Android developers figure it out on their own, Google has recently given some guidance on the topic with their Guide to app architecture, which promotes a variant of MVVM. While this is a great start, it leaves a lot of questions lingering for teams to answer on their own. Google also provides libraries like LiveData, Room, and DataBinding to help us reduce boilerplate and separate our code’s concerns. Even third parties have lent a helping hand with RxJava and Retrofit, which give us an easy way to handle asynchronous work and fetch things over the network. With all of these things floating around solving small pieces of the larger question, how can we bring them all together to provide a fluid UI experience that is simple to implement and maintain?

In this article, we will share what we have learned from our earlier attempts at creating an MVVM architecture and show a simple example of how to bring all of these pieces of the puzzle together.

A Quick Description of MVVM

To provide you with an understanding of what a repository is going to solve, let’s first dive into the basic MVVM pattern.

Figure: The basic MVVM pattern

The View

This is the representation of the Graphical User Interface for your code. It can be defined in XML (layout files) or in code (Jetpack Compose). Usually, there is some form of data binding that links the View to the ViewModel. Depending on where you draw the line between a view and a binder, Activity and Fragment objects can be considered either or both.

The ViewModel

This is in charge of transforming data from the Model into a format that makes sense for the View to display. Don’t let that double arrow-ended connection between the View and the ViewModel fool you. The major difference between a ViewModel and the Presenter in MVP is that a ViewModel doesn't contain a reference to the view. It is a one-way relationship, meaning something else needs to manage the ViewModel object.

The Model

This refers to the data (or domain model) that is being used as the source of the information the ViewModel is providing to the View. This layer is where things get a little blurry as some articles refer to it as the data itself while others refer to it as a data access layer. This is where Google’s repository pattern steps in to clear things up.

Enter the Repository

Google has been mentioning this pattern for some time now. Their examples are a great guide to understanding the basic principle of using MVVM with a repository, but we find they are missing some small (and important) guidance to help people translate these snippets into a larger, more complex project.

Figure: Google’s MVVM architecture diagram

The repository pattern is designed to “provide a clean API so that the rest of the app can retrieve this data easily.” Unfortunately, just adding a repository to your architecture doesn't force your code to be clean. You can still create a tangled mess without properly separating and providing a clear contract between the layers.

At DraftKings we have focused on a few additional guidelines to help our developers produce clean code consistently.

Decouple Layers with Interfaces

Figure: Google’s diagram with interfaces added between the layers, prompting engineers to think about what they publicly expose

We established that using interfaces between these layers will help engineers of all levels stick to good coding principles. This helps make sure our unit tests are truly only testing one layer at a time, reducing the overhead of writing and maintaining a large testing suite. Also, it helps to clearly define external facing APIs and obfuscates the implementation details of the different layers. It prompts us to evaluate what we are telling the outside world about this object’s functionality and gives us a built-in opportunity to ensure our code is clean.

It also affords us the ability to refactor different layers in our codebase more effectively. As long as the interface doesn't change, we can leave the areas of our codebase that use them untouched. For example, if we wanted to migrate our networking library from Volley to Retrofit, we could simply change the methods in our Dagger classes that produce and provide the interface above the Remote Data Source and not have to make changes in every repository that uses that endpoint. This vastly reduces the scope of such changes and decreases the chance we would introduce bugs in the final product.

Here we have an example of how letting a ViewModel hold on to the concrete class of a repository can lead to unintended behaviors. The example is a bit contrived and can be rectified by simply marking the fooPublishSubject in FooRepository as private, but that solution is more brittle. FooRepository might need to be used in a different scope that requires access to that member, and opening up access for those cases muddles when it is appropriate to use the member variable directly.
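The situation described above can be sketched roughly as follows. This is a minimal illustration rather than our production code: SimpleSubject is a hypothetical stand-in for RxJava's PublishSubject so the snippet needs no libraries, and FooRepositoryContract, LeakyViewModel, and CleanViewModel are names invented for this sketch.

```kotlin
// Hypothetical stand-in for an RxJava PublishSubject, so the sketch has no dependencies.
class SimpleSubject<T> {
    private val listeners = mutableListOf<(T) -> Unit>()
    fun subscribe(listener: (T) -> Unit) { listeners.add(listener) }
    fun onNext(value: T) { listeners.forEach { it(value) } }
}

// The contract a ViewModel should depend on: observe and refresh, nothing else.
interface FooRepositoryContract {
    fun observeFoo(listener: (String) -> Unit)
    fun refresh()
}

class FooRepository : FooRepositoryContract {
    // Public on the concrete class: anyone holding a FooRepository can push values.
    val fooPublishSubject = SimpleSubject<String>()

    override fun observeFoo(listener: (String) -> Unit) = fooPublishSubject.subscribe(listener)

    override fun refresh() = fooPublishSubject.onNext("foo from the data source")
}

// Holding the concrete class compiles, and lets the ViewModel emit data it made up.
class LeakyViewModel(private val repository: FooRepository) {
    fun cheat() = repository.fooPublishSubject.onNext("made up by the ViewModel")
}

// Holding the interface: the subject is not visible from here at all.
class CleanViewModel(private val repository: FooRepositoryContract) {
    val seen = mutableListOf<String>()

    init {
        repository.observeFoo { seen.add(it) }
    }

    fun load() = repository.refresh()
}
```

Because CleanViewModel only sees FooRepositoryContract, writing to the subject from the ViewModel layer becomes a compile error instead of a code-review discussion.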

Delivering these Dependencies

As your project grows in complexity, its dependency relationships become more complicated. This means people generally turn to some sort of Dependency Injection library (like Dagger or Koin).

Not only does a DI library provide a clean and easy way to retrieve your required dependencies, it also allows you to think about how many of these objects you’ll need in the application.

This thought process led us to establish a best practice for which objects belong in the Dagger graph. Anything we want only a single instance of should live in the root (global, application-scoped) graph. Anything that could have many instances should be created on demand and held onto appropriately.

This means that our new repository objects belong in the Dagger graph, as we want multiple ViewModels to be able to access Models via a single instance of the underlying Room or Retrofit sources. ViewModels, on the other hand, need to be created anew for each View that needs them. Think of a stack of Activities as a stack of playing cards, where the ViewModel drives the suit and value. We wouldn't want the act of adding a 3 of clubs to the top to change all of the cards below to a 3 of clubs as well, so each View needs its own instance of a ViewModel to preserve and isolate its data.

Figure: The objects we expect the Dagger graph to hold

We decided to keep our ViewModels out of our Dagger graph. Historically, we had been less explicit about this choice but we felt this is the right direction given the ViewModelProvider pattern that comes in androidx.lifecycle and how it helps us solidify the relationship between the Activity/Fragment/XML and the ViewModel as “one to one”. Meanwhile, the ViewModel to repository relationship can be “many to many”. In practice, this means that for every Activity/Fragment/XML we have a single ViewModel that handles that view’s logic, but it can reach out to many repositories to source the data required. As data is generally reused and displayed across the application, many different ViewModels can use the same instance of the repository from the Dagger Graph easily and efficiently.
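The scoping rule can be illustrated with a small, library-free sketch. AppGraph is a hypothetical stand-in for the application-scoped Dagger graph, and ProfileViewModel, userName(), and newProfileViewModel() are names assumed for this example; a real project would use Dagger modules and the ViewModelProvider instead.

```kotlin
interface UserRepository {
    fun userName(userId: Int): String
}

class UserRepositoryImpl : UserRepository {
    override fun userName(userId: Int) = "user-$userId"
}

// Hypothetical stand-in for the application-scoped Dagger graph.
object AppGraph {
    // Exposed by interface type, created once, and shared by every consumer.
    val userRepository: UserRepository by lazy { UserRepositoryImpl() }
}

// Not part of the graph: every view gets its own instance with its own state.
class ProfileViewModel(private val repository: UserRepository) {
    var title: String = ""
        private set

    fun show(userId: Int) {
        title = repository.userName(userId)
    }
}

// Roughly the job a ViewModel factory would do for each Activity or Fragment.
fun newProfileViewModel() = ProfileViewModel(AppGraph.userRepository)
```

Two calls to newProfileViewModel() return independent ViewModels that share the same repository instance, which is the "one to one" and "many to many" split described above.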

Containing the API

In any company at scale, it takes many engineers, teams, and even divisions to get a project from the whiteboard to the customer’s hands. Here at DraftKings, that is no different. To get data into the Android application, we need to work with a few different teams on the backend to get the data from the database to the API to the Android client. Given that this code is often owned by another team, it means that the backend will generally “move” at a different pace than the client.

This is especially true at the start of a project that doesn’t have an API in a state that we can use for our development. We’d like to make design and implementation decisions about the data objects being passed around internally to the client without worrying too much about conflicting decisions that the engineers working on the backend make.

Beyond that, we have a few services that return the same business data to the client but because they are owned by different teams and are trying to solve different problems, they have drifted apart from one another in the structure and types of data returned via the APIs. Internal to the client, these actually represent the same thing, so being able to translate and combine these responses into something universal to the client makes a lot of sense.

Here, you can see that there are two endpoints on the API that return variations of a “user”: the Login and Profile endpoints. To merge them into one data type, we simply create a constructor for each variation we want to handle, and now the Application can limit the knowledge of the two different types of users that the API delivers to a single place. This makes sharing data (via the Model layer) between features much easier while still allowing for changes in the API’s data structure and types to be limited to the one endpoint.
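As a rough sketch of that idea, the two response shapes below are folded into one business model through secondary constructors. The class names LoginUserResponse and ProfileUserResponse and every field name are assumptions made for illustration; they do not describe the real API.

```kotlin
// Hypothetical payloads from two endpoints that describe the same person differently.
data class LoginUserResponse(val userId: Int, val userName: String)
data class ProfileUserResponse(val id: Int, val displayName: String, val avatarUrl: String?)

// The single business model the rest of the client works with.
data class User(val id: Int, val name: String, val avatarUrl: String? = null) {

    // One constructor per API variation keeps the translation in a single place.
    constructor(response: LoginUserResponse) :
        this(response.userId, response.userName)

    constructor(response: ProfileUserResponse) :
        this(response.id, response.displayName, response.avatarUrl)
}
```

If the Profile endpoint later renames a field, only ProfileUserResponse and its constructor need to change; every feature that consumes User stays untouched.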

We are making a distinction between Network Response objects and Business Model objects in our architecture. This also helps define the role of the Remote Data Source to take Network Responses and transform them into Business Models for the rest of our application.

Figure: The data types each layer produces, which also clarifies the role of the Remote Data Source and enables more reuse

An Example in Code

Now we are going to dive into the Google UserRepository example and create our own version sticking to the changes we have made above.

First, let’s take a look at Google’s final version of the UserRepository.

You can see that they are using Dagger (or Hilt), Kotlin Coroutines, Room, and most likely, Retrofit, providing the Retrofit service, Coroutine executor, and Dao (Data Access Object) to the repository.

The basic flow of this is to make a network request for the data and return the data (or something that is watching for data) from Room immediately. Once the network request completes, do anything you need to do to the data and insert the new object into the table. The insertion automatically notifies the previously returned data that it has changed, prompting a new retrieval, and finally an update of the view.

Some Setup

Before we get to creating the UserRepository, we should first address some things we are going to need, like a helpful way of injecting which threads we want to run on.

This is to help us with testing later on. Set this up in the Dagger graph and now you can inject the correct threads easily across the entire codebase while still being able to swap them out for a TestScheduler in your unit tests (you are writing unit tests… right?).
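The idea of injecting threads can be sketched without RxJava by using plain Executors. Everything named here (ExecutorProvider, QueuedExecutor, TestExecutorProvider, GreetingLoader) is hypothetical; an RxJava version would expose Scheduler values such as Schedulers.io() and AndroidSchedulers.mainThread() in production and a TestScheduler in tests.

```kotlin
import java.util.concurrent.Executor

// Hypothetical contract: callers ask for a kind of thread, never a concrete pool.
interface ExecutorProvider {
    val io: Executor
    val ui: Executor
}

// Test double: queues work and runs it only when the test says so,
// loosely mirroring how a TestScheduler hands control to the test.
class QueuedExecutor : Executor {
    private val tasks = ArrayDeque<Runnable>()

    override fun execute(command: Runnable) {
        tasks.addLast(command)
    }

    fun runAll() {
        while (tasks.isNotEmpty()) tasks.removeFirst().run()
    }
}

class TestExecutorProvider(private val queue: QueuedExecutor = QueuedExecutor()) : ExecutorProvider {
    override val io: Executor get() = queue
    override val ui: Executor get() = queue

    fun runAll() = queue.runAll()
}

// Example consumer: does the work on io and delivers the result on ui.
class GreetingLoader(private val executors: ExecutorProvider) {
    fun load(name: String, onResult: (String) -> Unit) {
        executors.io.execute {
            val greeting = "Hello, $name"
            executors.ui.execute { onResult(greeting) }
        }
    }
}
```

In a unit test, nothing happens until runAll() is called, so the test decides exactly when the asynchronous work completes.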

Here are our user classes: UserResponse is the one returned by our API via Retrofit, and User is the business model we pass around internally. You can see that we can make the network object simple and even use a code generator to create it, while our business objects can be more in line with what we need.

Here we are defining our Retrofit service. We could have had these return an Observable instead of a Single, which would have made downstream logic a bit simpler, but we liked the parallels between how network requests and Single work: both are asynchronous and either succeed or fail. We carry that logic up through the Remote Data Source layer as well.

Next up is our Room Dao. Since Room already works off of interfaces and annotations, we didn’t feel the need to create another interface, class, and object to obfuscate how we handle persistent data.

We are using an Observable to react to emissions of User objects and having our insert action return a Completable to help us handle any Room errors that may occur.

Finally, here is our last interface for the UserRepository itself. It's very simple and clean. The only additional part beyond what is required is the onCleared() method that will help us clear up any existing disposables in our lower layers as the ViewModel gets cleared as well.

Implementations

You’ll notice the constructor’s signature is very similar to Google’s example above. We are providing an object that can retrieve data over the network, a Dao, and an object that tells us what threads to run on.

Even the getUser(userId: Int) method is similar in how it works in the example above. Create an asynchronous network request and immediately return an object that is watching for data from the database. When that network request completes, insert that data into the table, which will trigger a UI update.

You will notice that we are not doing anything too fancy with RxJava here. We create a new PublishSubject that we funnel our data through. We merge it with the persistent layer’s select, which returns an Observable, and we send errors from the network layer there as well. We went back and forth on how to handle getting only the errors we want from each layer to the layer above, and this was the simplest and most customizable solution we came to. Yes, it creates extra observables, but it gives us finer control over error handling.
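The flow described in this section can be approximated in plain Kotlin. This is a simplified sketch, not the RxJava implementation: callbacks replace Observables, the network call is synchronous, InMemoryUserDao is a hypothetical stand-in for the Room Dao, and the callback-based getUser signature is an assumption made to keep the snippet dependency-free.

```kotlin
data class User(val id: Int, val displayName: String)

// Hypothetical remote layer: returns a business model or throws on failure.
fun interface UserRemoteDataSource {
    fun fetchUser(userId: Int): User
}

// In-memory stand-in for the Room Dao: every insert notifies observers of that id.
class InMemoryUserDao {
    private val users = mutableMapOf<Int, User>()
    private val observers = mutableMapOf<Int, MutableList<(User) -> Unit>>()

    fun observe(userId: Int, observer: (User) -> Unit) {
        observers.getOrPut(userId) { mutableListOf() }.add(observer)
        users[userId]?.let(observer) // emit the cached row right away, if any
    }

    fun insert(user: User) {
        users[user.id] = user
        observers[user.id]?.forEach { it(user) }
    }

    fun clearObservers() = observers.clear()
}

interface UserRepository {
    fun getUser(userId: Int, onUser: (User) -> Unit, onError: (Throwable) -> Unit)
    fun onCleared()
}

class UserRepositoryImpl(
    private val remote: UserRemoteDataSource,
    private val dao: InMemoryUserDao
) : UserRepository {

    override fun getUser(userId: Int, onUser: (User) -> Unit, onError: (Throwable) -> Unit) {
        // 1. Start watching persisted data so the caller gets cached values immediately.
        dao.observe(userId, onUser)
        // 2. Refresh from the network; a successful insert re-notifies the observer,
        //    while a network failure is routed to the caller's error channel.
        runCatching { remote.fetchUser(userId) }
            .onSuccess { dao.insert(it) }
            .onFailure { onError(it) }
    }

    // Drops all registered observers, playing the role of disposing subscriptions.
    override fun onCleared() = dao.clearObservers()
}
```

The shape is the same as in the RxJava version: the persistent layer is the single source of data for the caller, and the network layer only contributes inserts and errors.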

The last piece of the puzzle is the Remote Data Source. Here you can see how we are converting the UserResponse to a User, allowing the rest of the codebase to move at a different pace than the data coming in from the API.
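A minimal sketch of that conversion is shown below, assuming a synchronous UserService in place of the Retrofit interface (the real one would return an RxJava Single). The field names on UserResponse and the toUser() helper are hypothetical.

```kotlin
// Network shape: mirrors the payload (field names are hypothetical).
data class UserResponse(val id: Int, val firstName: String?, val lastName: String?)

// Business model used by the rest of the app.
data class User(val id: Int, val displayName: String)

// Stand-in for the Retrofit service interface.
fun interface UserService {
    fun getUser(userId: Int): UserResponse
}

interface UserRemoteDataSource {
    fun getUser(userId: Int): User
}

class UserRemoteDataSourceImpl(private val service: UserService) : UserRemoteDataSource {

    override fun getUser(userId: Int): User = service.getUser(userId).toUser()

    // The only place that knows how the API spells a user.
    private fun UserResponse.toUser() = User(
        id = id,
        displayName = listOfNotNull(firstName, lastName).joinToString(" ")
    )
}
```

Layers above the Remote Data Source never see UserResponse, so a change to the payload stays contained in this one class.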

Wrapping It All Up

At this point, you should add these objects from above to the Dagger graph. Remember that the provider methods should return the interface type and not the implementations. This way you can build every object defined above in the Dagger graph, which makes it very easy to get data from a UserRepository anywhere in the codebase.

Finally, because we are using RxJava, we should make sure to dispose of any Disposable objects in our ViewModel (for example, by calling clear() on a CompositeDisposable) and also call onCleared() on our UserRepository instance to clean up any internal Disposable objects.

Now we have a fairly straightforward MVVM implementation with a repository that is provided to us via our Dagger graph. The layers we have created each serve a clear purpose and are held onto by interfaces that help us filter what we expose to the outside world. This also enables us to test these layers independently from each other in isolation using an RxJava TestScheduler that lets us have complete control over the test’s execution.
