RxRepository: Building a testable, reactive, network data repository using RxSwift (part 1)

Jul 10, 2020

In this series we will tackle the problem of optimizing network access when fetching data, a common theme in networked applications. While fetching data from a server is trivial in any modern framework or OS, optimizing how often the network is accessed, in order to save bandwidth and battery and to spare users frustration, is complex. More so if you want to reduce code duplication, ensure testability, and leave something useful (and comprehensible) for the next engineer to use.

When we decided to implement this RxRepository we had a few goals in mind:

has to offer reactive contracts
no dependencies other than RxSwift and Foundation
at least one layer of cache (be it memory or disk based)
testable
simple to understand and debug

In this first installment we will tackle the definition of our cache system.

This series is made of the following articles:

The Cache protocol and two implementations
The Repository protocol and a reference implementation

This series was inspired by this article from Roman Iatcyna. Thanks, Roman, for inspiring and sharing, and congratulations on an awesome job on the Android Revolut app.

A Cache Protocol

Simply put, a cache is a system that enables quick access to data, is typically usage driven, and trades memory (disk, volatile, or other) for speed: cached data can be accessed faster than it could be if the cache didn't exist.

In this particular case, our cache will enable faster access to data that has been previously retrieved from the network (usage driven) by persisting it using local resources (compromise of memory in favor of speed).

So a cache, in its most bare-bones version, is essentially a stateful object that has two methods: a load method and a save method. The load method returns an instance of a previously saved piece of data, or nothing if no data has been saved previously.

Something like:

CacheProtocol bare-bones
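
A minimal sketch of what such a bare-bones protocol could look like. The associated type name `T`, the method labels, and the `SingleValueCache` helper are assumptions made for illustration. The sketch is also synchronous to stay dependency-free, whereas a reactive contract would presumably wrap these results in RxSwift types.

```swift
// Hypothetical sketch: names and signatures are illustrative assumptions.
protocol CacheProtocol {
    associatedtype T

    /// Returns the previously saved value, or nil if nothing has been saved.
    func load() -> T?

    /// Stores a value, replacing any previously saved one.
    func save(_ value: T)
}

// A trivial conforming type, only to show the contract in use.
final class SingleValueCache<T>: CacheProtocol {
    private var stored: T?

    func load() -> T? { stored }
    func save(_ value: T) { stored = value }
}
```

Loading from a fresh `SingleValueCache` yields nil; after a `save`, `load` returns the saved value.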

Bear in mind that we want to use this cache to streamline our application's use of resources, and improve the user's experience, by avoiding network calls (we will cover this topic in Part 3 of this series). Network calls typically have parameters, and different combinations of parameter values yield different results from the server. For example, GET /post/1 returns a different result than GET /post/2.

Our cache needs to be able to cope with this distinction and retrieve the previously saved data according to those parameter values, in this case the id of the post.

A natural evolution of our CacheProtocol would then be:

CacheProtocol definition
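
One way the evolved contract could look, as a sketch. The associated type names `R` (request) and `T` (cached value), the argument labels, and the `PostCache` example type are assumptions, and the sketch stays synchronous rather than reactive.

```swift
// Hypothetical sketch: values are saved and loaded per request.
protocol CacheProtocol {
    associatedtype R   // the request that identifies a cached value
    associatedtype T   // the cached value itself

    /// Returns the value saved for `request`, or nil if there is none.
    func load(for request: R) -> T?

    /// Stores `value` tied to `request`.
    func save(_ value: T, for request: R)
}

// Illustrative conformance: post bodies keyed by post id (as in GET /post/1).
final class PostCache: CacheProtocol {
    private var bodies: [Int: String] = [:]

    func load(for request: Int) -> String? { bodies[request] }
    func save(_ value: String, for request: Int) { bodies[request] = value }
}
```

With this shape, saving the response for post 1 has no effect on what is loaded for post 2.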

OK, now we have something better. Any implementation of this protocol has the contract it needs to do exactly what we intend it to do: save a piece of data tied to a particular request value, whatever it might be, and retrieve it on demand, if it exists.

We will revisit the functionality offered by this protocol later. For now it suffices.

A Cache Implementation

Let us make things concrete.

A protocol by itself is nothing more than a contract. We need to somehow fulfill that contract. Let's fulfill this particular contract in a quick-and-dirty way. Let's save things in memory and write:

MemoryCache implementation bare-bones
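
A quick-and-dirty in-memory version might be sketched as follows. The protocol shape is repeated so the block is self-contained; its labels and the `storage` property name are assumptions.

```swift
// Hypothetical sketch of the contract, repeated for self-containment.
protocol CacheProtocol {
    associatedtype R
    associatedtype T
    func load(for request: R) -> T?
    func save(_ value: T, for request: R)
}

// Keeps values in a Dictionary keyed by request, hence R: Hashable.
final class MemoryCache<R: Hashable, T>: CacheProtocol {
    private var storage: [R: T] = [:]

    func load(for request: R) -> T? {
        storage[request]
    }

    func save(_ value: T, for request: R) {
        storage[request] = value
    }
}
```

Note that this sketch does no locking, so it is not safe to share across threads as written.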

Here's something that we can use to store data and retrieve it on demand.
Notice that we have required R to conform to Hashable, since requests are used as Dictionary keys.
With this implementation we can now tackle a concrete network access flow, for instance:

Before sending our request check if there is a cached response for that particular request
Return that cached response if the cache is valid
Send out the request if there is no cached response or if it has expired (more on this later)
Once we get a response from the server, cache that response tied to that particular request

Let's break down this flow into pieces.
The first point is covered. If load() returns nothing, we will need to send the request and wait for a response. If it does return something, we are done.
The second point implies that we are able to answer the question: is this cached response valid? So we need to come up with a way to determine the validity of a cached response.
The third point requires nothing of our cache system; it is out of our scope.
The fourth point is covered by the save() method.

In order to determine the validity of cached responses, one can think of decorating a response with information about when it expires:

CacheEntry definition
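
A sketch of such a decorated response, assuming the expiration is stored as an absolute `Date`. The property names and the `isValid(at:)` helper are illustrative choices.

```swift
import Foundation

// Hypothetical sketch: a cached value plus the moment it stops being valid.
struct CacheEntry<T> {
    let value: T
    let expiresAt: Date

    /// True while `date` is strictly earlier than the expiration date.
    func isValid(at date: Date = Date()) -> Bool {
        date < expiresAt
    }
}
```

Passing the reference date as a parameter, instead of always reading the system clock, keeps the validity check easy to test.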

Good, now we are on to something.
Let's revisit our implementation of MemoryCache and add these cache validity checks to it.

MemoryCache with CacheEntry
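
The validity check could be folded into the memory cache roughly like this. The fixed `timeToLive`, the injectable `now` clock, and the choice to evict expired entries on load are all assumptions of this sketch.

```swift
import Foundation

// Hypothetical sketch: a cached value plus its expiration date.
struct CacheEntry<T> {
    let value: T
    let expiresAt: Date
}

final class MemoryCache<R: Hashable, T> {
    private var storage: [R: CacheEntry<T>] = [:]
    private let timeToLive: TimeInterval
    private let now: () -> Date   // injectable clock, so tests can control time

    init(timeToLive: TimeInterval, now: @escaping () -> Date = { Date() }) {
        self.timeToLive = timeToLive
        self.now = now
    }

    /// Returns the cached value only if an entry exists and has not expired.
    func load(for request: R) -> T? {
        guard let entry = storage[request] else { return nil }
        guard now() < entry.expiresAt else {
            storage[request] = nil   // drop the expired entry
            return nil
        }
        return entry.value
    }

    /// Stores the value with an expiration `timeToLive` seconds from now.
    func save(_ value: T, for request: R) {
        let expiresAt = now().addingTimeInterval(timeToLive)
        storage[request] = CacheEntry(value: value, expiresAt: expiresAt)
    }
}
```

From the caller's point of view, an expired entry is indistinguishable from a missing one: in both cases `load` returns nil and the request has to be sent.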

Moving forward: since what we are creating here is infrastructure code, we cannot overlook the importance of tests.

It is trivial to code these kinds of concepts using a TDD approach. In fact, the whole library was written using that approach. At the end of this article you can find a link to a Playground that contains all the code in this article.
Thank you John Sundell for writing this up.

Here are several tests on top of our simple MemoryCache implementation.

Simple tests around MemoryCache

We have something concrete that stores data tied to a request, asserts its validity, and that respects a known contract. And we have written tests to cover the functionality.

Without trying too hard, one can think of at least one more cache implementation: a DiskCache, which stores the cache entries on disk.

The fact that the entries can be saved to disk brings a special problem to the table. We need to serialize the cache entries, or in other words, we need to write them to and read them from disk. Swift has a protocol just for that, which I assume you are familiar with: Codable, with Foundation providing encoders and decoders such as JSONEncoder and JSONDecoder.

Let's factor this new reality into our system by writing a simple implementation of DiskCache:

DiskCache implementation
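
A simple disk-backed variant might look like the sketch below. It assumes one JSON file per request inside a caller-provided directory, a `filename` requirement on `Request`, and a fixed `timeToLive`. None of these details are confirmed by the text above beyond the one-file-per-request idea.

```swift
import Foundation

// Assumed shape of the Request protocol: it only has to provide a filename.
protocol Request {
    var filename: String { get }
}

// Codable so that entries can be written to and read from disk.
struct CacheEntry<T: Codable>: Codable {
    let value: T
    let expiresAt: Date
}

final class DiskCache<R: Request, T: Codable> {
    private let directory: URL
    private let timeToLive: TimeInterval
    private let encoder = JSONEncoder()
    private let decoder = JSONDecoder()

    init(directory: URL, timeToLive: TimeInterval) {
        self.directory = directory
        self.timeToLive = timeToLive
    }

    private func fileURL(for request: R) -> URL {
        directory.appendingPathComponent(request.filename)
    }

    /// Returns the value if its file exists, decodes, and has not expired.
    func load(for request: R) -> T? {
        guard let data = try? Data(contentsOf: fileURL(for: request)),
              let entry = try? decoder.decode(CacheEntry<T>.self, from: data),
              Date() < entry.expiresAt
        else { return nil }
        return entry.value
    }

    /// Encodes the entry as JSON and writes it to the request's file.
    func save(_ value: T, for request: R) throws {
        let expiresAt = Date().addingTimeInterval(timeToLive)
        let entry = CacheEntry(value: value, expiresAt: expiresAt)
        try FileManager.default.createDirectory(at: directory,
                                                withIntermediateDirectories: true)
        try encoder.encode(entry).write(to: fileURL(for: request), options: .atomic)
    }
}
```

In this sketch `save` throws because disk writes can fail, while `load` treats any read or decode failure as a cache miss.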

If you are familiar with Codable then the implementation of DiskCache should be trivial to understand.
You might have noticed that we have required R to conform to Request. This is because we need different filenames for different requests. Each cache entry is stored in a file whose name is provided by the Request; this way we can locate the correct file to read, if there is one, for a particular Request, and also obtain the name of the file into which we will write the entry.
Request is defined in the following manner:

Request protocol
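
Given that description, the protocol may be as small as a single requirement. The property name `filename` and the `PostRequest` example type are assumptions made for illustration.

```swift
// Hypothetical sketch: a request only needs to name the file that caches it.
protocol Request {
    var filename: String { get }
}

// Illustrative request for GET /post/{id}: each id maps to its own file.
struct PostRequest: Request {
    let id: Int

    var filename: String { "post-\(id).json" }
}
```

Different parameter values must map to different filenames, otherwise one request would overwrite another's cached response.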

Here are a couple of test cases for our DiskCache:

Simple tests around DiskCache

Right now we have the basic building blocks of the cache that will be used in RxRepository.
There are a couple more details we need to be aware of before we use this cache in a real application. We will cover those details as we deal with the details of building the RxRepository.
If you are curious about what those details are:

Consider multiple threads accessing the memory cache: what will happen in the critical sections of our cache?
How well does this implementation fare in terms of memory consumption?
Since we are storing files on disk, what should we consider regarding the serialization of sensitive user data (authorization tokens, private user information, etc.)?
Swift's JSON decoding infrastructure has a serious problem when decoding JSON numbers into Decimal values: precision can be lost. How can we solve that issue here?
What tests are we missing in our example tests?

The full code of this article is available in this playground.

Check out the next article in this series where we will start building the RxRepository on top of this infrastructure.

Written by Tiago Janela