Data Layer Using the Repository Pattern

Introduction

Published in The Startup, Jul 4, 2020

This article describes how the data layer of an Android application can be implemented using the repository pattern. The application is MoviesPreview, which lets a user browse the data exposed by The Movie DB API: information about movies, actors and actresses, and more. The full code of the project can be found in this GitHub repository.

Repository Pattern

Countless articles explain what the repository pattern is and what the benefits of using it are. Many others describe different implementations that respect the pattern and deliver its biggest benefit: abstracting access to the data layer so that the business layer can ignore where the data comes from.

Three key concepts can help us understand what the repository pattern is and what it does in an architecture:

A repository abstracts the data access: if several data sources provide data to the application, the repository has the core responsibility of hiding this from the upper layers, so that they can access the data regardless of where it comes from.
If the application implements a caching strategy to make the data available to different requests, the repository is the layer where that caching mechanism lives.
A repository handles only one type of data (entity). This means that if a given use case needs data from three different entities, the use case depends on three different repositories: one for each entity (business object) that it needs to query.
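The third point can be illustrated with a minimal sketch. The entities, repositories, and use case below are hypothetical names chosen for illustration, not code from MoviesPreview, and the suspend modifier is left out so the example runs without a coroutine library.

```kotlin
// Hypothetical domain entities, used only for illustration.
data class Movie(val id: Int, val title: String)
data class Genre(val id: Int, val name: String)

// One repository per entity; callers never learn where the data comes from.
interface MovieRepository {
    fun getMovie(id: Int): Movie?
}

interface GenreRepository {
    fun getGenresForMovie(movieId: Int): List<Genre>
}

// A use case that needs two entities depends on two repositories.
class GetMovieWithGenresUseCase(
    private val movies: MovieRepository,
    private val genres: GenreRepository
) {
    fun execute(movieId: Int): Pair<Movie, List<Genre>>? {
        val movie = movies.getMovie(movieId) ?: return null
        return movie to genres.getGenresForMovie(movieId)
    }
}
```

Because the use case only sees the two interfaces, either repository can change its data sources without the use case noticing.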

Repository implementation

The data layer in MoviesPreview is implemented using two basic types of data sources:

A remote API that can be used to retrieve the data requested.
The device’s local storage (SQLite, the file system, and/or SharedPreferences).

So, each repository in the application depends on at most two data sources. For the local storage, I use the ‘Db’ suffix and for the remote API, I use the ‘Api’ suffix. For instance, one of the key entities in the application is ‘MoviePage’ (an abstraction that represents a page of movies). To access that entity, the data layer defines an interface called MoviePageRepository, and the implementation of that interface (MoviePageRepositoryImp) depends on two different data sources: MoviePageApi and MoviePageDb.

The application has a caching mechanism implemented in which certain data is stored in the device’s local storage to make it available without the need to perform an API call every time the data is needed. That caching logic is implemented in the repository: when the data is requested, the repository attempts to fetch it from the local storage. If it is not available there, then it attempts to fetch it from the remote data source. When that operation is successful, the repository updates the local storage with the new data obtained.
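The lookup order described above can be sketched as follows. The type names come from this article, but the method names and the simplified MoviePage fields are assumptions, and suspend is omitted to keep the example runnable on its own.

```kotlin
// Simplified, hypothetical shape of the entity: a page of movie titles.
data class MoviePage(val page: Int, val titles: List<String>)

// Data source contracts; the method names are assumptions.
interface MoviePageDb {
    fun getMoviePage(page: Int): MoviePage?
    fun saveMoviePage(moviePage: MoviePage)
}

interface MoviePageApi {
    fun getMoviePage(page: Int): MoviePage?
}

interface MoviePageRepository {
    fun getMoviePage(page: Int): MoviePage?
}

class MoviePageRepositoryImp(
    private val db: MoviePageDb,
    private val api: MoviePageApi
) : MoviePageRepository {
    override fun getMoviePage(page: Int): MoviePage? {
        // 1. Cache hit: return the locally stored page.
        db.getMoviePage(page)?.let { return it }
        // 2. Cache miss: ask the remote API and store a successful result.
        return api.getMoviePage(page)?.also { db.saveMoviePage(it) }
    }
}
```

With this shape, a second request for the same page is served from local storage and never reaches the API.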

Local Data Source (caching implementation)

The local data source is implemented in SQLite, using Room as a framework to interact with the database. The local data source is used as a cache in the sense that it is the first data source that is checked to see if the data is available locally, saving time and network requests.

To make sure that the data stored locally is up-to-date, each entity is stored in the database with a timestamp that represents the due date of that particular entry. This means that every time the local data source queries the database to fetch an entry, it compares the timestamp stored with the current time. If it is still valid, the entry is returned to the upper layers. Otherwise, it is discarded.
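As a rough sketch of that expiry check, the class below uses an in-memory map as a stand-in for the Room-backed table. All names are hypothetical, and removing the stale entry on read is an assumption about what "discarded" means here.

```kotlin
// Hypothetical cached row: the payload plus the time at which it expires.
data class CachedEntry<T>(val value: T, val dueDateMillis: Long)

// In-memory stand-in for the database table. The clock is injected so that
// expiry can be exercised without waiting for real time to pass.
class ExpiringCache<K, V>(
    private val timeToLiveMillis: Long,
    private val now: () -> Long = { System.currentTimeMillis() }
) {
    private val entries = mutableMapOf<K, CachedEntry<V>>()

    // Stores the value together with its due date.
    fun put(key: K, value: V) {
        entries[key] = CachedEntry(value, now() + timeToLiveMillis)
    }

    // Returns the value while it is still valid; otherwise discards it.
    fun get(key: K): V? {
        val entry = entries[key] ?: return null
        if (now() < entry.dueDateMillis) return entry.value
        entries.remove(key)
        return null
    }
}
```

Injecting the clock is only a convenience for testing; a production version would rely on the default system time.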

When an entry is discarded because its timestamp has expired (or for any other reason), the repository in charge of that data falls back to the remote API and updates the local cache as soon as new data is available.

A middle layer provides mapping between the domain entities and the database entities: each entity in the domain of the application has a counterpart in the database layer that carries the specific properties the database needs. The architecture provides two basic mappers:

A mapper to map domain entities into database entities.
A mapper that does the reverse, mapping database entities to domain entities.
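A minimal sketch of such a pair of mappers, written as Kotlin extension functions. The Movie and DbMovie classes and the dueDateMillis column are hypothetical; the real database entities would also carry Room annotations.

```kotlin
// Hypothetical domain entity.
data class Movie(val id: Int, val title: String)

// Hypothetical database counterpart: same data plus a storage-only column.
data class DbMovie(val id: Int, val title: String, val dueDateMillis: Long)

// Domain -> database: the caller supplies the storage-only value.
fun Movie.toDbMovie(dueDateMillis: Long): DbMovie = DbMovie(id, title, dueDateMillis)

// Database -> domain: the storage-only column is dropped.
fun DbMovie.toMovie(): Movie = Movie(id, title)
```

Keeping the storage-only column out of the domain class means the upper layers never need to know that entries expire.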

Remote Data Source

The remote data source is implemented using the widely used Retrofit library. The one thing worth pointing out here is that, in order to simplify error handling, the remote data source layer does not propagate errors: it simply returns null whenever an error occurs. The upper layers are responsible for deciding which error type should be shown in the UI.
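The null-on-error convention might look like the sketch below. MoviePageService is a hypothetical stand-in for the Retrofit-generated service, so the example runs without the library, and catching only IOException is an assumption about which failures count as errors.

```kotlin
import java.io.IOException

// Simplified, hypothetical shape of the entity.
data class MoviePage(val page: Int, val titles: List<String>)

// Hypothetical stand-in for the Retrofit service; it may throw on failure.
fun interface MoviePageService {
    fun fetchMoviePage(page: Int): MoviePage
}

interface MoviePageApi {
    fun getMoviePage(page: Int): MoviePage?
}

// Hypothetical implementation name; not taken from the project.
class RemoteMoviePageApi(private val service: MoviePageService) : MoviePageApi {
    // Any I/O failure collapses to null; upper layers decide what to show.
    override fun getMoviePage(page: Int): MoviePage? =
        try {
            service.fetchMoviePage(page)
        } catch (e: IOException) {
            null
        }
}
```

The trade-off of this design is that the cause of the failure is lost at this boundary, so callers can only distinguish "data" from "no data".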

A note about threading

The architecture uses Kotlin coroutines to handle the background threads that are needed to access the data layer. That is why all methods in the repository layer are marked with the suspend keyword: they are potentially long-running operations that could block the UI thread if run there, so they can only be called from a coroutine (or another suspending function) and should be executed on a background thread.

This is a compromise I have made to simplify the threading technique. I believe that threading should be the responsibility of a single layer, but by marking the methods as suspend I’m leaking the fact that a multi-threading mechanism is needed to access this layer.
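To make the leak concrete, here is a minimal sketch that assumes the kotlinx-coroutines-core library is on the classpath. The loadFirstPage function is hypothetical, and having the caller pick the dispatcher is an assumption for illustration, not necessarily how MoviesPreview does it.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Simplified, hypothetical shape of the entity.
data class MoviePage(val page: Int, val titles: List<String>)

interface MoviePageRepository {
    // `suspend` means this can only be called from a coroutine or from
    // another suspending function, which is the detail that leaks upward.
    suspend fun getMoviePage(page: Int): MoviePage?
}

// Hypothetical caller: the layer that owns threading chooses the dispatcher,
// here the IO dispatcher, so the lookup runs off the UI thread.
suspend fun loadFirstPage(repository: MoviePageRepository): MoviePage? =
    withContext(Dispatchers.IO) { repository.getMoviePage(1) }
```

The repository interface says nothing about which thread is used, yet every caller is forced to be suspending too.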