Reactive data flow in Revolut Android app

Published in Revolut Tech, Nov 29, 2019

Quite a while ago I wrote an article explaining how almost every screen in the Revolut app is built on RecyclerView.

Today I want to go further and tell you how all of this is backed by our Reactive Data Layer.

Let’s start with a high-level overview of the whole architecture:

One important thing to know about the Revolut app is that it fully supports offline mode, so every repository has at least two data sources under the hood.

But we wanted to keep this implementation detail hidden from developers, so they don’t have to write this logic in every Interactor or Presenter. The offline logic therefore lives in the repositories.

However, some abstract knowledge about what’s happening with the data is still required, and it can be expressed with the following wrapper class, Data:

All we need to know about any Data<T>: whether loading is in progress and whether there were any errors.

A loading value of true means “there is a high chance the data will be updated soon, and the currently present content value might be outdated”, regardless of the source. An error means that “something went wrong and probably needs some action from the user”. Most of the time this is enough for the domain or presentation layer to decide what needs to be rendered.
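A minimal sketch of such a wrapper and of how a presentation layer might interpret it. The three fields follow the description above; the ScreenState names and the toScreenState() helper are hypothetical and only illustrate the decision, they are not taken from the Revolut codebase.

```kotlin
// Wrapper described in the text: content plus loading and error information.
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

// Hypothetical UI states, for illustration only.
enum class ScreenState { CONTENT, CONTENT_REFRESHING, SPINNER, ERROR_OVER_CONTENT, ERROR, EMPTY }

// One possible way to decide what to render from a Data<T> value.
fun <T> Data<T>.toScreenState(): ScreenState = when {
    error != null && content != null -> ScreenState.ERROR_OVER_CONTENT
    error != null -> ScreenState.ERROR
    content != null && loading -> ScreenState.CONTENT_REFRESHING
    content != null -> ScreenState.CONTENT
    loading -> ScreenState.SPINNER
    else -> ScreenState.EMPTY
}

fun main() {
    println(Data(content = "cached", loading = true).toScreenState())                   // CONTENT_REFRESHING
    println(Data<String>(loading = true).toScreenState())                               // SPINNER
    println(Data<String>(error = IllegalStateException("no network")).toScreenState())  // ERROR
}
```

Note that the decision never asks where the content came from, which is exactly the point of the wrapper.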

The Repository contract is pretty simple: it returns an Observable of the domain class wrapped in the above-mentioned Data class:

What really matters is how this Observable behaves, which depends on the offline strategy used under the hood.

When deciding which offline strategy to implement, there are two high-level approaches: cache-first and network-first. Network-first means that the cache is only used when a network request fails. Cache-first means that the cached data is rendered first, and network data is then used to update the UI.

The network-first approach is generally easier to implement, but it’s also a lot less responsive, because the UI cannot be fully rendered until the network request completes. Cache-first usually means that the UI needs to support some sort of data diffing, i.e. smooth transitions from one data set to the next, since the default case is that cached data is rendered first and updated data then arrives from the network. Luckily, we have the RecyclerView-based framework with DiffUtil to render such transitions.
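To make the difference concrete, here is a small dependency-free sketch that models each strategy as the list of values a subscriber would receive. The function names and the use of a List instead of an RxJava Observable are assumptions made to keep the example runnable on its own.

```kotlin
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

// Network-first: wait for the network; fall back to the cache only on failure.
fun <T : Any> networkFirst(cache: () -> T?, network: () -> T): List<Data<T>> =
    try {
        listOf(Data<T>(content = network()))
    } catch (e: Exception) {
        listOf(Data<T>(content = cache(), error = e))
    }

// Cache-first: emit the cache immediately, then follow up with the network result.
fun <T : Any> cacheFirst(cache: () -> T?, network: () -> T): List<Data<T>> {
    val cached = cache()
    val first = Data<T>(content = cached, loading = true)
    val second = try {
        Data<T>(content = network())
    } catch (e: Exception) {
        // Keep showing the cached value, but report the failure.
        Data<T>(content = cached, error = e)
    }
    return listOf(first, second)
}

fun main() {
    val failing: () -> String = { throw IllegalStateException("offline") }
    println(cacheFirst<String>({ "cached" }, { "fresh" }).map { it.content })   // [cached, fresh]
    println(networkFirst<String>({ "cached" }, failing).map { it.content })     // [cached]
}
```

With cache-first the subscriber always sees two emissions, which is why the UI has to handle a transition between them.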

The old approach, where the UI only observes the DB and all network responses are written to the DB first to be observed later, falls under the cache-first category. However, we found this approach inefficient, since DB querying can be quite time-consuming.

So at Revolut we use two levels of cache: memory and DB. Memory takes precedence over the DB: the DB is basically only queried once, the result is put into memory, and from then on the DB is only used for writes. Let’s take a look at some examples.

One major scenario is when the repository has no data in the memory cache. In that case, it queries the DB before going to the network, since a DB query is much faster than a network call and we try to populate the UI as fast as possible:

As mentioned before, the memory cache takes precedence over the DB cache, so after the DB query the data is put into the memory cache to be used for the rest of the app session.
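The read path of such a two-level cache could look roughly like this. TwoLevelCache and its dbReads counter are hypothetical names used for illustration; the sketch is single-threaded and ignores the network entirely.

```kotlin
// Hypothetical two-level cache: memory in front of a (slow) database.
class TwoLevelCache<K : Any, V : Any>(
    private val readDb: (K) -> V?,
    private val writeDb: (K, V) -> Unit
) {
    private val memory = HashMap<K, V>()

    // Counts DB queries so the effect of the memory level is visible.
    var dbReads = 0
        private set

    // Memory wins; the DB is only consulted on a memory miss,
    // and whatever it returns is promoted to memory.
    fun get(key: K): V? {
        memory[key]?.let { return it }
        dbReads++
        return readDb(key)?.also { memory[key] = it }
    }

    // Writes go to both levels so the DB stays usable after a restart.
    fun put(key: K, value: V) {
        memory[key] = value
        writeDb(key, value)
    }
}

fun main() {
    val db = mutableMapOf("portfolio" to "db-snapshot")
    val cache = TwoLevelCache<String, String>({ db[it] }, { k, v -> db[k] = v })

    cache.get("portfolio")
    cache.get("portfolio")
    println(cache.dbReads) // 1

    cache.put("portfolio", "fresh")
    println(cache.get("portfolio")) // fresh
    println(cache.dbReads)          // 1
}
```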

Here’s the diagram of the Observable’s emissions when the Repository already holds the requested data in the memory cache:

If an item is present in the memory cache, it is emitted right after subscription. When a reload is requested, the Repository then queries the network for the same data and emits it once it arrives. Remember the forceReload param in the observe method? Without it, the Repository won’t query the network if the memory cache is present. forceReload forces the Repository to query the network even then.
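The effect of forceReload on a memory hit can be sketched as follows. The function name is hypothetical, the stream is modelled as a lazy Kotlin Sequence rather than an Observable, and the choice to flag the cached value as loading only when a network call follows is an assumption.

```kotlin
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

// Emissions for a subscriber when the value is already in the memory cache.
fun <T : Any> emissionsOnMemoryHit(
    cached: T,
    forceReload: Boolean,
    network: () -> T
): Sequence<Data<T>> = sequence {
    // The cached value is emitted first in both cases.
    yield(Data<T>(content = cached, loading = forceReload))
    // The network is only touched when a reload was requested.
    if (forceReload) {
        yield(Data<T>(content = network(), loading = false))
    }
}

fun main() {
    var networkCalls = 0
    val network = { networkCalls++; "fresh" }

    println(emissionsOnMemoryHit("cached", false, network).map { it.content to it.loading }.toList())
    // [(cached, false)]
    println(emissionsOnMemoryHit("cached", true, network).map { it.content to it.loading }.toList())
    // [(cached, true), (fresh, false)]
    println(networkCalls) // 1
}
```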

The important part is that we didn’t come up with this algorithm all at once. Different places used slightly different strategies, sometimes skipping the cache or terminating the stream after the network data arrived. The implementations differed too: some places used Subjects, others used the concat or merge operators, and it all looked like a mess at first. Eventually we identified the common behaviour that is essentially required from a Repository and decided to wrap it in a single entity.

Meet DataObservableDelegate

Since the Repository is, in fact, delegating its own responsibility to follow the Data Flow Protocol to this entity, we called it DataObservableDelegate (DOD).

This is the public constructor, which allows us to declare where data is taken from and saved to.

Here is one example demonstrating a DOD for Portfolio. We now only have to declare the specific endpoint, lambdas for fetching cache from memory and DB, and lambdas for storing to memory and DB.

DOD also supports partial declaration, e.g. only a memory cache or only a DB cache.
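The following is a simplified, synchronous stand-in for such a delegate, not the library class itself. The lambda names (fromNetwork, fromMemory, toMemory, fromStorage, toStorage) follow the text; SketchDelegate, the String key and the Portfolio fields are hypothetical, and going to the network on a memory miss even without forceReload is an assumption.

```kotlin
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

// Every cache lambda has a no-op default, so a partial declaration
// (memory only, DB only) simply leaves the others out.
class SketchDelegate<Params : Any, Domain : Any>(
    private val fromNetwork: (Params) -> Domain,
    private val fromMemory: (Params) -> Domain? = { null },
    private val toMemory: (Params, Domain) -> Unit = { _, _ -> },
    private val fromStorage: (Params) -> Domain? = { null },
    private val toStorage: (Params, Domain) -> Unit = { _, _ -> }
) {
    // Lazy stream of emissions; nothing runs until the sequence is iterated.
    fun observe(params: Params, forceReload: Boolean = true): Sequence<Data<Domain>> = sequence {
        val inMemory = fromMemory(params)
        // Memory miss: fall back to the DB and promote the result to memory.
        val cached = inMemory ?: fromStorage(params)?.also { toMemory(params, it) }
        val reload = forceReload || inMemory == null
        yield(Data<Domain>(content = cached, loading = reload))
        if (reload) {
            val result: Data<Domain> = try {
                val fresh = fromNetwork(params)
                toMemory(params, fresh)
                toStorage(params, fresh)
                Data(content = fresh)
            } catch (e: Exception) {
                Data(content = cached, error = e)
            }
            yield(result)
        }
    }
}

data class Portfolio(val total: Int)

fun main() {
    val memory = HashMap<String, Portfolio>()
    val db = hashMapOf("user-1" to Portfolio(total = 90))
    val portfolioDod = SketchDelegate<String, Portfolio>(
        fromNetwork = { Portfolio(total = 100) }, // stands in for the endpoint call
        fromMemory = { memory[it] },
        toMemory = { id, p -> memory[id] = p },
        fromStorage = { db[it] },
        toStorage = { id, p -> db[id] = p }
    )
    println(portfolioDod.observe("user-1").map { it.content?.total to it.loading }.toList())
    // [(90, true), (100, false)]
    println(portfolioDod.observe("user-1", forceReload = false).map { it.content?.total to it.loading }.toList())
    // [(100, false)]
}
```

A memory-only delegate would pass just fromNetwork, fromMemory and toMemory and rely on the defaults for the rest.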

Now it takes just one more line for a repository to expose a single convenient method that lets the Domain layer observe the Portfolio regardless of the source:

The Domain layer has only one lever to control the data: the forceReload flag. The whole algorithm for emitting data from a particular source can now be drawn as follows:

As DODs are delegates of a repository, they are allowed to talk to each other and to depend on each other. In the case of the portfolio, the mapping to Domain models actually depends on another entity, Config, which is handled by a separate DOD. The portfolio also has nested entities, Holdings, which are handled by yet another DOD instance. Let’s take a look at how to handle such cases.

Here the fromNetwork lambda returns a stream that flatMaps to the getConfig() method. The stream returned by fromNetwork is allowed to wait for any other events; it just needs to be careful not to introduce circular dependencies.

Here you can take a look at the getConfig() implementation: it simply uses another DOD and extracts data and error from its stream (via another useful extension that is commonly used in the project).
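An extension of that kind might look like the sketch below. The name extractContentOrThrow, the choice to fail on the first error and the Config and Portfolio fields are all assumptions; the stream is again modelled as a Kotlin Sequence instead of an Observable.

```kotlin
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

// Hypothetical helper: unwrap a stream of Data<T> into plain values,
// failing on the first error and skipping emissions without content.
fun <T : Any> Sequence<Data<T>>.extractContentOrThrow(): Sequence<T> = sequence {
    for (item in this@extractContentOrThrow) {
        val error = item.error
        if (error != null) throw error
        val content = item.content
        if (content != null) yield(content)
    }
}

data class Config(val currency: String)
data class Portfolio(val total: Int, val currency: String)

fun main() {
    // What a Config delegate might emit: a loading placeholder, then the value.
    val configStream = sequenceOf(Data<Config>(loading = true), Data(content = Config("GBP")))

    // The portfolio mapping waits for the first usable Config.
    val config = configStream.extractContentOrThrow().first()
    println(Portfolio(total = 100, currency = config.currency))
    // Portfolio(total=100, currency=GBP)
}
```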

To deal with nested entities, we can make use of another set of methods that DOD provides for managing state: updateAll() and notifyRemoved(). “Update all” means that all lambdas responsible for storing (toStorage, toMemory) are invoked with the corresponding updated values. notifyRemoved() tells the DOD that an item has already been removed, so that it emits Data(content = null).
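The behaviour of these two methods, as described above, can be modelled with plain callbacks. StateSketch and the Holding fields are hypothetical, and the real delegate notifies RxJava subscribers rather than a list of listeners.

```kotlin
data class Data<out T>(
    val content: T? = null,
    val error: Throwable? = null,
    val loading: Boolean = false
)

data class Holding(val id: String, val amount: Int)

// Hypothetical, simplified delegate that only models state management.
class StateSketch<Params : Any, Domain : Any>(
    private val toMemory: (Params, Domain) -> Unit,
    private val toStorage: (Params, Domain) -> Unit
) {
    private val listeners = mutableListOf<(Params, Data<Domain>) -> Unit>()

    fun subscribe(listener: (Params, Data<Domain>) -> Unit) {
        listeners.add(listener)
    }

    // Store the new value in every cache level, then tell subscribers about it.
    fun updateAll(params: Params, value: Domain) {
        toMemory(params, value)
        toStorage(params, value)
        listeners.forEach { it(params, Data<Domain>(content = value)) }
    }

    // The caller has already deleted the item; subscribers just receive empty content.
    fun notifyRemoved(params: Params) {
        listeners.forEach { it(params, Data<Domain>(content = null)) }
    }
}

fun main() {
    val memory = HashMap<String, Holding>()
    val db = HashMap<String, Holding>()
    val holdings = StateSketch<String, Holding>(
        toMemory = { id, h -> memory[id] = h },
        toStorage = { id, h -> db[id] = h }
    )
    val seen = mutableListOf<Holding?>()
    holdings.subscribe { _, update -> seen.add(update.content) }

    holdings.updateAll("h-1", Holding("h-1", 5))
    memory.remove("h-1")
    db.remove("h-1")
    holdings.notifyRemoved("h-1")

    println(seen) // [Holding(id=h-1, amount=5), null]
}
```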

Tests

The DOD behaviour described above is covered with tests like the one below.

So when new Repositories are implemented, their tests normally only need to cover the additional logic introduced in the DOD lambdas, like storing nested Holdings in our example.

Conclusion

DataObservableDelegate covers almost all of our needs for providing a read-only offline mode. One obvious missing part, for now, is support for offline writes. Currently, writes or any other changes to the data on the server are only possible via separate network calls, which are responsible for updating the corresponding DOD. But maybe we will explore that area in the future.

And we’re happy to say that we have finally open-sourced this as a standalone library, so you can check out the code in this repository:

https://github.com/revolut-mobile/RxData

Or try it out as a library:

implementation 'com.revolut.rxdata:dod:1.0'

Written by Roman Iatcyna