RxRepository: Building a testable, reactive, network data repository using RxSwift (part 2)

8 min read · Jul 17, 2020

In part 1 of this series we started tackling a common problem of networked applications: optimizing resource usage and user experience by optimizing network access. We typically do that by avoiding expensive operations, in this case network calls. This avoidance is really a trade-off in which resource we choose to spare: we trade a network call for memory space by caching network responses. It also relaxes a constraint, since we accept that we do not always need the latest version of a particular resource. Nevertheless, we want cached data to eventually expire, and we want to be able to force a reload of a resource.

In this article we will build on what we have so far and start implementing the RxRepository. This RxRepository can be summarized as a piece of self-contained logic that serves clients with resources that are loaded asynchronously. The repository knows whether a network call is warranted because it caches the responses to previous requests, and its contract lets clients state where a response may come from (cache, network, or either).
This repository shall respect a set of constraints:

it must be testable
it must delegate the actual network call to clients (that is, it must be extensible: the call could be of any kind, but for our purposes it will be a network call)
the caching system should be based on disk and on volatile memory, and it should optimize time consumption (read from DiskCache, if available, and cache in MemoryCache, from there on always serve from MemoryCache)
it should allow clients to decide whether they prefer cached responses, require cached responses, or require that the cache be ignored
it should offer a reactive contract

First and foremost, let’s discuss the reactive contract. What do we mean by that? And why do we want that? Let’s use a concrete, but somewhat complex, example to illustrate the benefits of a reactive approach.

Picture the following. ViewControllerA displays a list of Resource1 items where PropertyX is displayed on the cells. The user is able to select any item of that list and is taken to ViewControllerB where the detail of the selected Resource1 item is shown. The user can edit PropertyX of Resource1 in this view controller. After pressing the save button the new version of Resource1 must propagate to ViewControllerA.

Well, we want to optimize resource usage.
And we want to propagate the latest version of the resources to everybody that has any interest in them. We also want to be able to compose complex logic on top of these resources.

This can be done in numerous ways. One can set up delegation, so that ViewControllerB notifies delegate(s) that Resource1 has changed. Another would be to use NotificationCenter and post a notification with the new version of the item. Another would be to use Key-Value Observing. Or even CoreData and then use CoreData's notifications for the purpose. Or use Realm, exempli gratia, and have it automatically do this for you.
All of these strategies require a certain amount of boilerplate code to be integrated. And, granted, some might be complex, but they also offer powerful functionality. Most of the time, though, some are simply cumbersome and repetitive (delegation, notifications, and KVO) and others might be overkill and a black hole of maintenance effort (CoreData or Realm), like using an airplane to climb up a flight of stairs.

The bottom line is that this type of problem is tough to handle in an elegant way. We have to choose between having boilerplate code spread all over the place and investing heavily in complex frameworks, such as CoreData or Realm, to do part of the heavy lifting for us.

Let's imagine that we live in an ideal place, though.
From my point of view, an ideal solution would have the following shape, no more, no less:

ViewControllerA indicates that it is interested in a set of Resource1 items and in whichever updates happen to those items. It then uses these updates to present the items in a UITableView.
ViewControllerB saves the changes to the edited Resource1 item, and those changes propagate automatically to ViewControllerA. Since ViewControllerA processes the list updates, things will simply work.

Let’s see how one can do that using RxSwift, nothing more and nothing less, and model the RxRepository so that it matches our needs.

The first thing is to model what we would like to offer clients. From the description above, that means loading a list of Resource1 items for ViewControllerA, and saving a Resource1 item for ViewControllerB.

ViewControllerA, in our case, needs to load a list of Resource1 items, right?
That’s func load(request: R) -> T.
Clients need to indicate cache preference.
OK, func load(cachePolicy: CachePolicy, request: R) -> T.
And, it needs to be notified if the list of Resource1 changes.
Fair, func load(cachePolicy: CachePolicy, request: R) -> Observable<T>.

For ViewControllerB we need to save the changes made to the Resource1 item.
OK, that's func save(request: R, item: T).
We want a reactive contract.
Then: func save(request: R, item: T) -> Completable.

Which brings us to the following RxRepositoryProtocol:

RxRepositoryProtocol definition
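
A minimal sketch of such a protocol, following the two signatures derived above. The associated type names R and T come from those signatures; the Hashable constraint on R and all CachePolicy case names other than .cacheElseLoad are assumptions made only so the sketch compiles.

```swift
import RxSwift

// Declared here only so the sketch compiles. `.cacheElseLoad` is the one case
// named in the text; the other two case names are hypothetical.
enum CachePolicy { case cacheElseLoad, cacheOnly, reloadIgnoringCache }

protocol RxRepositoryProtocol {
    // The request type. Hashable is an assumption, so that a request can key a cache.
    associatedtype R: Hashable
    // The resource type served to clients.
    associatedtype T

    // Emits the resource for `request`, and keeps emitting when it changes.
    func load(cachePolicy: CachePolicy, request: R) -> Observable<T>

    // Stores a new version of the resource; completes once it has been stored.
    func save(request: R, item: T) -> Completable
}
```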

Let's unpack a few things:

What is CachePolicy?
What should our func load() do?
What should our func save() do?
How will we cope with load() returning a [T] and save() saving a T?

CachePolicy

One can model the CachePolicy as an enum that contains 3 cases, according to what we modeled above:

CachePolicy definition
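
One possible shape for this enum, as a sketch. Only .cacheElseLoad is named in this article; .cacheOnly and .reloadIgnoringCache are hypothetical names for the "require cached" and "ignore cache" preferences, and the two computed properties are illustrative helpers, not part of the original definition.

```swift
enum CachePolicy {
    case cacheElseLoad        // prefer a cached response, otherwise load
    case cacheOnly            // hypothetical name: require a cached response
    case reloadIgnoringCache  // hypothetical name: always ignore the cache

    // Whether a cached response may be served under this policy.
    var allowsCachedResponse: Bool { return self != .reloadIgnoringCache }

    // Whether a fresh load (for example a network call) may be issued.
    var allowsFreshLoad: Bool { return self != .cacheOnly }
}
```

An implementation of load() could branch on these two flags instead of switching over every case.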

The load() function has this in its signature, so it's up to the implementation to cope with whatever values are passed.

load()

Our load() method is responsible for setting up an Observable<T> and propagating changes to underlying Ts to clients.

save()

The save() method is responsible for updating whatever cached data we have. Furthermore, it must ensure that the Observables returned by load() emit the updated items. This is a tricky method, as we shall see later.

Thoughts

If you compare this protocol to the CacheProtocol we came up with in the previous article, you'll notice that they are very similar. In fact, the only differences are: our RxRepository's load() method contains a CachePolicy parameter, our load() method returns an Observable<T> instead of a T, and our save method returns a Completable. It's almost as if this wraps our stateful *Cache implementations with a reactive contract. Exactly what we needed.

Let's move forward with the first implementation of this RxRepositoryProtocol.
Circling back to our goals: we want to hide the complexity of deciding whether a particular network request should be sent, and of propagating the response to whoever is interested in it.

In a nutshell:

Our ViewControllerA requests the list of Resource1 items from our Resource1Repository
Our Resource1Repository will check if there is a cached response for this request.
If there is one and it has not expired, then it returns it, which means it emits on an Observable<T>
If not then it issues a network request and caches the response, emitting on an Observable<T>
ViewControllerA receives the updates of the Resource1 list, including future saves made by ViewControllerB

I believe we have enough to model something concrete.

RxNetworkRepository definition
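
A minimal sketch of what a first, naive version could look like. The method name fetch(request:), the CachePolicy case names other than .cacheElseLoad, and the GreetingRepository subclass are hypothetical and used for illustration only.

```swift
import RxSwift

// Hypothetical case names, except `.cacheElseLoad`.
enum CachePolicy { case cacheElseLoad, cacheOnly, reloadIgnoringCache }

class RxNetworkRepository<R: Hashable, T> {
    func load(cachePolicy: CachePolicy, request: R) -> Observable<T> {
        // There is no cache yet, so every policy ends up making the call.
        return fetch(request: request)
    }

    func save(request: R, item: T) -> Completable {
        // Nothing is stored yet, so there is nothing to update.
        return .empty()
    }

    // "Abstract" method: subclasses supply the actual (network) call.
    func fetch(request: R) -> Observable<T> {
        fatalError("Subclasses must override fetch(request:)")
    }
}

// Illustrative subclass; a real one would perform a network request here.
final class GreetingRepository: RxNetworkRepository<String, String> {
    override func fetch(request: String) -> Observable<String> {
        return .just("Hello, \(request)")
    }
}
```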

You'll notice that this implementation still has an abstract method. The actual network call that produces fresh results should not be the responsibility of this class, but that of a subclass. There are multiple approaches to this problem: inheritance, composition, and so on. We are using inheritance.

Let's take a moment and criticize this.

What happens if two clients request a load for the same Request in quick succession? Right now two calls will be executed. If more loads are requested concurrently, more calls are made.
What happens to previous subscribers when we call load from another client for the same Request?
This repository is not really caching anything.

Let's tackle each point.

Concurrency

Concurrency typically opens the door to problems that are hard to reproduce, let alone solve. Luckily, RxSwift, and the way it executes things at runtime, has an elegant way of dealing with this: Schedulers.

To make sure that only one thread at a time executes code in load() and save(), we can use a serial scheduler on which we subscribeOn() and observeOn() within the context of our repository.

A solution to this problem is tied to the next point, the previous subscribers.

Previous Subscribers

How can we, then, notify previous subscribers of new versions of the resources? We need a special type of Observable where we can control the events it emits. We need something that’s known in Rx as a Subject.
There are multiple types of Subjects. We will use a ReplaySubject.
A ReplaySubject is simply a type of Observable that replays previous events when a client subscribes. In order to learn more about Subjects and the differences between the types of Subjects have a read here.

Let’s factor this into our RxRepositoryNetwork:

RxNetworkRepository iteration with Subjects and Scheduler
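
A sketch of how this iteration could look, under several assumptions: one ReplaySubject per request with a buffer of one element, a SerialDispatchQueueScheduler guarding all repository state, and the hypothetical fetch(request:) method as the abstract call. It uses the subscribeOn/observeOn spellings from RxSwift 5, which RxSwift 6 still accepts as deprecated aliases of subscribe(on:) and observe(on:). Error handling is left out.

```swift
import RxSwift

// Hypothetical case names, except `.cacheElseLoad`.
enum CachePolicy { case cacheElseLoad, cacheOnly, reloadIgnoringCache }

class RxNetworkRepository<R: Hashable, T> {
    // All access to `subjects` happens on this serial scheduler.
    private let scheduler = SerialDispatchQueueScheduler(qos: .userInitiated)
    private var subjects: [R: ReplaySubject<T>] = [:]
    private let disposeBag = DisposeBag()

    func load(cachePolicy: CachePolicy, request: R) -> Observable<T> {
        return Observable<T>.deferred { [weak self] () -> Observable<T> in
            guard let self = self else { return .empty() }
            // A load for this request already exists: share its subject
            // instead of issuing a second call.
            if let existing = self.subjects[request] {
                return existing.asObservable()
            }
            let subject = ReplaySubject<T>.create(bufferSize: 1)
            self.subjects[request] = subject
            // Forward only `next` events so the subject stays open for saves.
            self.fetch(request: request)
                .observeOn(self.scheduler)
                .subscribe(onNext: { subject.onNext($0) })
                .disposed(by: self.disposeBag)
            return subject.asObservable()
        }
        .subscribeOn(scheduler)
    }

    func save(request: R, item: T) -> Completable {
        return Completable.create { [weak self] completable in
            // Every subscriber of load() for this request sees the new item.
            self?.subjects[request]?.onNext(item)
            completable(.completed)
            return Disposables.create()
        }
        .subscribeOn(scheduler)
    }

    // "Abstract" method: subclasses supply the actual (network) call.
    func fetch(request: R) -> Observable<T> {
        fatalError("Subclasses must override fetch(request:)")
    }
}
```

With this shape, a second load() for the same request reuses the existing subject, and a save() reaches every subscriber of that request. A fetch that fails is silently dropped here; a fuller version would need to decide how to surface that error and when to evict the subject.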

Cache

This repository isn’t caching anything. It simply goes to the network and delivers results to clients.
If only there were a way to have this neatly tied into our existing infrastructure, right?
But wait, what if we compose several instances of RxRepository? What if we create an RxRepositoryMemory that uses our MemoryCache, and then create an RxRepositoryComposite that has logic to cascade invocations of load() on our RxRepositoryMemory and on our RxRepositoryNetwork?

That is very well possible. Let’s give it a try.

We need to come up with a couple of definitions first. We need to add semantics to the return values of load() -> Observable<T>, so that we can decide, based on our CachePolicy, what happened on a particular RxRepository. Picture this: we try to load request1 from RxRepositoryMemory with a .cacheElseLoad policy. If request1 is not cached in the RxRepositoryMemory, this load method should indicate that there is no value. This is different from returning nil, or emitting an error on the Observable<T>. We can define something like:

RxRepositoryLoadResult
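
A minimal sketch of such a type, assuming two cases are enough: one that carries a value and one that says the repository holds nothing for the request. The case names and the lookup helper are hypothetical.

```swift
enum RxRepositoryLoadResult<T> {
    case value(T)  // the repository produced a value for the request
    case noValue   // the repository holds nothing for the request (not an error)
}

// Illustrative helper: translates a dictionary lookup into a load result.
func lookup<K: Hashable, V>(_ key: K, in storage: [K: V]) -> RxRepositoryLoadResult<V> {
    if let cached = storage[key] {
        return .value(cached)
    }
    return .noValue
}
```

Compared with an Optional, a dedicated enum leaves room for further cases later without changing the meaning of nil.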

This ties these semantics to the values emitted by the Observable that load() returns.

We would also like to reuse our RxRepositoryProtocol in our RxRepositoryComposite, so we will create a RxRepositoryBase to accommodate this. First, let's change our RxRepositoryProtocol to include the RxRepositoryLoadResult:

RxRepositoryProtocol with RxRepositoryLoadResult

Now let's model our RxRepositoryBase:

RxRepositoryBase definition

Let's take this and define our RxRepositoryMemory:

RxRepositoryMemory definition
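
A rough sketch of a memory-backed repository. It does not use the MemoryCache from part 1; instead it assumes a plain dictionary of BehaviorSubjects, each seeded with a hypothetical .noValue case, and it leaves out the cachePolicy parameter and expiry for brevity.

```swift
import Foundation
import RxSwift

// Hypothetical case names.
enum RxRepositoryLoadResult<T> { case value(T), noValue }

final class RxRepositoryMemory<R: Hashable, T> {
    private var subjects: [R: BehaviorSubject<RxRepositoryLoadResult<T>>] = [:]
    private let lock = NSLock()

    // Returns the subject for a request, creating it on first use.
    private func subject(for request: R) -> BehaviorSubject<RxRepositoryLoadResult<T>> {
        lock.lock()
        defer { lock.unlock() }
        if let existing = subjects[request] { return existing }
        // A request that was never saved starts out as "no value".
        let created = BehaviorSubject<RxRepositoryLoadResult<T>>(value: .noValue)
        subjects[request] = created
        return created
    }

    // Emits the current state first, then every later save for this request.
    func load(request: R) -> Observable<RxRepositoryLoadResult<T>> {
        return subject(for: request).asObservable()
    }

    func save(request: R, item: T) -> Completable {
        return Completable.create { [weak self] completable in
            self?.subject(for: request).onNext(.value(item))
            completable(.completed)
            return Disposables.create()
        }
    }
}
```

Subscribing to load() before anything was saved yields .noValue first, and a later save() pushes .value(item) to the same subscriber.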

Finally, let's create our RxRepositoryComposite:

RxRepositoryComposite definition
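
The cascading logic could be sketched as follows. To keep the example self-contained, the memory and network repositories are represented by two closures of the same shape as load(); the enum case names other than .cacheElseLoad are hypothetical, and writing a network response back into memory is left out.

```swift
import RxSwift

// Hypothetical case names, except `.cacheElseLoad`.
enum CachePolicy { case cacheElseLoad, cacheOnly, reloadIgnoringCache }
enum RxRepositoryLoadResult<T> { case value(T), noValue }

final class RxRepositoryComposite<R: Hashable, T> {
    typealias Loader = (R) -> Observable<RxRepositoryLoadResult<T>>

    private let loadFromMemory: Loader
    private let loadFromNetwork: Loader

    init(memory: @escaping Loader, network: @escaping Loader) {
        self.loadFromMemory = memory
        self.loadFromNetwork = network
    }

    func load(cachePolicy: CachePolicy, request: R) -> Observable<RxRepositoryLoadResult<T>> {
        switch cachePolicy {
        case .reloadIgnoringCache:
            // The cache is skipped entirely.
            return loadFromNetwork(request)
        case .cacheOnly:
            // Whatever memory says is final, including `.noValue`.
            return loadFromMemory(request)
        case .cacheElseLoad:
            let network = loadFromNetwork
            // Ask memory first; cascade to the network only on `.noValue`.
            return loadFromMemory(request)
                .flatMapLatest { result -> Observable<RxRepositoryLoadResult<T>> in
                    switch result {
                    case .value: return .just(result)
                    case .noValue: return network(request)
                    }
                }
        }
    }
}
```

flatMapLatest is used so that, if the memory source later emits a value (for example after a save), the composite switches to it and drops the pending network result.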

Looking Back

We have now shown that the concept we set out to prove is indeed feasible, and we have a reference implementation to back up that claim.
Let's address our requirement of testability next.

Tests

Let's see how we can test our RxRepositoryComposite. Since we are using RxSwift, I recommend reading this excellent article by Shai Mishali to find out the basics on how to write tests for RxSwift code.

These tests use RxNimble and RxTest. The latter is part of RxSwift.

These tests use an evolved version of our composite repository that uses a disk repository as well.

RxRepositoryComposite tests

Conclusion

In this part we have leveraged what we had built in part 1 and managed to compose a Repository that satisfies the constraints we defined at the beginning of this part.

In part 3 we will criticize the infrastructure we have so far by integrating it into a more concrete example, and try to understand how well it performs in something closer to real-life usage. Up until now we have been in wonderland, and we all know that at some point we need to set our feet back on the ground.