OneView — Client
In the series How we created OneView for Deutsche Telekom’s OneApp, Nikhil described how OneView for DT's OneApp came into being: how we solved the problem of having a central data lake, and how, by leveraging the MQTT protocol and Elasticsearch, we were able to create multiple dashboards for our needs.
The primary source of data for this lake is our mobile apps and the events they generate. So the first and foremost task was to devise a way to fill our central data lake, and to build it so that it is lightweight, reliable in terms of the information sent, real-time, capable of working offline, easy on the phone’s battery, and secure.
Yeah, those are a lot of asks in one sentence for developers. But we were able to achieve all this. How did we do this? Were there any challenges? Let’s discuss those in this post.
Let's start with the challenges thrown at us by the product team:
➤ All events should be persistent in the app until uploaded.
➤ Should work for both online-offline.
➤ The sequence of all the events should be maintained.
➤ Should not drain the device’s battery.
➤ APK size should not bloat.
What we heard as developers:
➤ Persistent storage
➤ Atomicity
➤ Sequencing (FIFO)
➤ Sockets
➤ Lightweight processes
Our solution consists of three parts: UI, OneView Client, and Network.
The UI sends events and information to OneView Client. OneView Client then stores and structures the data and hands it to the Network layer when queried.
Let's dive into the details of these parts, starting from the end.
Network
For networking and communication with our servers, we used the MQTT protocol. It is relatively easy to implement on the client and is lightweight. Our use cases involve a large number of events that need to reach the servers within seconds to enable real-time monitoring.
Why MQTT?
Well, it’s a long topic and may well deserve a blog post of its own. But for context's sake, let me briefly explain why we chose MQTT over HTTP.
HTTP is based on a client-server architecture and is undoubtedly the most popular and widely used protocol. While HTTP can be extended and was a strong contender for us, it is document-centric at heart and is resource-heavy when running on mobile devices.
Our requirements call for something lightweight that works in resource-constrained environments, so we needed to explore better options.
Enter MQTT. It's a lightweight protocol based on a pub-sub architecture and designed to work on resource-constrained devices. MQTT is data-centric, and when a single connection is used to send multiple messages, its throughput can be many times higher than HTTP’s.
Let’s illustrate this with a chart (taken from an experiment done by folks at Google).
The above diagram sums up the average amount of data transmitted per message for different numbers of messages transmitted over the same connection. The full experiment is documented here.
Some key features of MQTT
➤ MQTT is a messaging protocol and messages are transferred on TCP/IP.
➤ MQTT was designed for low-bandwidth, high-latency network environments.
➤ MQTT implements the publish/subscribe messaging pattern, which lets decoupled applications run independently over the network and enables scalable solutions.
➤ MQTT supports TLS for the secure transmission of information.
➤ MQTT supports three levels of Quality of Service (QoS) for delivery guarantees: at most once, at least once, and exactly once.
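To make "lightweight" a little more concrete, here is a minimal sketch of how a QoS 0 PUBLISH packet is framed in MQTT 3.1.1. It is for illustration only and is not the code we ship; in practice a client library does this for you. The class name and the topic used in the usage note are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative only: frames an MQTT 3.1.1 PUBLISH packet with QoS 0. */
final class MqttPublishSketch {

    /** Encodes the "remaining length" field: 7 bits per byte, high bit = more bytes follow. */
    static byte[] encodeRemainingLength(int length) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        do {
            int digit = length % 128;
            length /= 128;
            if (length > 0) {
                digit |= 0x80; // continuation bit
            }
            out.write(digit);
        } while (length > 0);
        return out.toByteArray();
    }

    /** Fixed header, then topic (2-byte length prefix + UTF-8 bytes), then the raw payload. */
    static byte[] publishQos0(String topic, byte[] payload) {
        byte[] topicBytes = topic.getBytes(StandardCharsets.UTF_8);
        int remaining = 2 + topicBytes.length + payload.length;

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x30); // packet type 3 (PUBLISH), DUP=0, QoS=0, RETAIN=0
        byte[] remainingLength = encodeRemainingLength(remaining);
        out.write(remainingLength, 0, remainingLength.length);
        out.write(topicBytes.length >> 8);   // topic length, high byte
        out.write(topicBytes.length & 0xFF); // topic length, low byte
        out.write(topicBytes, 0, topicBytes.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }
}
```

For a short event on a made-up topic such as `oneview/events`, the protocol overhead on top of topic and payload is four bytes: one for the packet type and flags, one for the remaining length, and two for the topic length prefix.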
The MQTT protocol fits our requirements well: it is lightweight, supports TLS, and offers a high data transfer rate once the connection is established. We were also able to control the quality of service through our centrally deployed CMS settings.
Our product manager doesn’t seem tense anymore. Let’s see if we can make him smile.
OneView Client
While MQTT solves much of our network problem, we still have to handle client-side complexities of persistence, atomicity, and sequencing.
For these challenges, we looked to a queue for the solution. A simple FIFO queue is a relatively easy concept, and an in-memory queue is equally easy to implement.
A FIFO queue assures sequencing because it maintains order inherently, with push and pop done at opposite ends. This also means that insertion and upload logic can run in parallel without much hassle. A queue solves some of our cases but lacks an important one: persistence.
An in-memory queue doesn’t survive a process kill, so we needed something durable that outlives the process lifecycle. Mobile apps are susceptible to process kills, where either the system or the user can kill your app.
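As a reference point, an in-memory FIFO queue of the kind described above can be sketched in a few lines. This is a hypothetical illustration (the class and method names are made up), and it shows the limitation plainly: everything lives on the heap, so it is gone as soon as the process dies.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative in-memory FIFO queue: ordered and concurrent, but not durable. */
final class InMemoryEventQueue {
    // Producers add at the tail while the uploader takes from the head.
    private final Queue<String> events = new ConcurrentLinkedQueue<>();

    void push(String event) {
        events.add(event);
    }

    /** Returns and removes the oldest event, or null when the queue is empty. */
    String pop() {
        return events.poll();
    }

    int size() {
        return events.size();
    }
}
```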
So, we needed something that persists data on the device, which in turn automatically takes care of online-offline switches. Let’s explore what Android already offers for persistent storage and whether those options suit our needs.
Traditional ways of storing data on an Android device
1. Shared preferences: Shared preferences are designed to store small key-value pairs and are the simplest way to store data on Android. They are managed by the system: every write first goes to an in-memory map that acts as a cache, and is then written to an XML file on disk. Because disk write failures can be swallowed by the system instead of being reported to the app, the in-memory cache and the disk can end up out of sync, so this option is not durable.
2. Simple file: A plain file is the simplest and most traditional way of storing data on disk, but it is easy for a file to end up corrupted. Data redundancy, duplication, and inconsistency have always been core issues with organizing data in a file. However, much of this depends on the implementation and on how well the developer guards against these problems.
3. Database: The local database on Android (SQLite) is the first thing that comes to mind when someone asks for persistent, atomic, and transactional operations. SQLite works great for large data sets and has optimized queries for them. While it fits some of our requirements, we were skeptical about using it for the frequent transactions our app would perform, one for every event we put in.
Tape comes to the rescue
While database storage seems the most obvious way to achieve atomicity, durability, and sequencing, the Tape library (developed by Square) offers all of these without the complexity of transactional overhead. It is a file-based storage system with an added pinch of atomicity.
Let’s see how it does what it does.
“You can’t spell ‘developer’ without ‘devel’.” — John meyer
Tape's QueueFile uses Java's RandomAccessFile with a backing file on disk to store data. The beauty of QueueFile lies in a 16-byte header that it uses to maintain queue transactions.
This header has four parts:
1. File length
2. Element count
3. First element position
4. Last element position
This header acts as the registry for all the nodes added to the queue. A node has no meaning in the queue unless it is registered in this header.
The file length, element count, and first and last element positions are updated after a write operation. The header maintains the atomicity of the queue and prevents it from getting corrupted.
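A minimal sketch of such a header, assuming each of the four fields is stored as a 32-bit big-endian integer (which is what makes them add up to 16 bytes). The class and field names are made up for illustration and are not Tape's own.

```java
import java.nio.ByteBuffer;

/** Illustrative 16-byte queue header: four 4-byte integers. */
final class QueueHeaderSketch {
    static final int HEADER_LENGTH = 16;

    final int fileLength;
    final int elementCount;
    final int firstPosition;
    final int lastPosition;

    QueueHeaderSketch(int fileLength, int elementCount, int firstPosition, int lastPosition) {
        this.fileLength = fileLength;
        this.elementCount = elementCount;
        this.firstPosition = firstPosition;
        this.lastPosition = lastPosition;
    }

    /** Serializes the header; ByteBuffer writes ints in big-endian order by default. */
    byte[] toBytes() {
        return ByteBuffer.allocate(HEADER_LENGTH)
                .putInt(fileLength)
                .putInt(elementCount)
                .putInt(firstPosition)
                .putInt(lastPosition)
                .array();
    }

    /** Reads the four fields back from the first 16 bytes of a file. */
    static QueueHeaderSketch fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes, 0, HEADER_LENGTH);
        return new QueueHeaderSketch(
                buffer.getInt(), buffer.getInt(), buffer.getInt(), buffer.getInt());
    }
}
```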
For every write operation of a node, two things happen:
1. The length of the data, followed by the data itself, is written to the file.
2. The header of the file is updated.
The first part writes the node's data to the file along with its size. At this point it has no meaning yet for the queue. We still need to register the node in the header of QueueFile, which is the second part of the process.
If both tasks complete, meaning the data has been written to the file and the header has been updated successfully, the element count, first element position, and last element position in the header change accordingly. (Fig-1)
But if the system is unable to write the node's data to the file for any reason, the header won’t be committed, the queue will remain in its previous state, and the node won’t be added to the queue (Fig-2). The data may well be on disk, but it is not in the queue.
The same holds when the node's write completes but the header update fails: since the node couldn't register itself, the queue remains the same as before and the change is effectively aborted.
For a successful addition to the queue, both operations need to complete.
To make the header persistent, the first 16 bytes of the file are always reserved for it; the queue's nodes start after that. So the next time the application opens, it can read the last header state and validate the queue accordingly.
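The two-step write can be sketched with a deliberately simplified, append-only file queue. This is not Tape's implementation (it has no removal, growth, or wrap-around), and `TinyQueueFile` and its methods are hypothetical names. It only shows that data written without a header update stays invisible to the queue after a reopen.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

/** Hypothetical append-only miniature of a header-committed queue file. */
final class TinyQueueFile implements AutoCloseable {
    private static final int HEADER_LENGTH = 16;
    private final RandomAccessFile raf;
    private int count, first, last; // element count, positions of first and last element

    TinyQueueFile(File file) throws IOException {
        // "rwd" asks for content updates to be written synchronously to storage.
        raf = new RandomAccessFile(file, "rwd");
        if (raf.length() < HEADER_LENGTH) {
            raf.setLength(HEADER_LENGTH);
            commit(0, 0, 0);            // brand-new file: empty queue
        } else {
            raf.seek(4);                // skip the file-length field
            count = raf.readInt();
            first = raf.readInt();
            last = raf.readInt();
        }
    }

    /** Step 1 only: writes length + data right after the last committed element. */
    int writeElementOnly(byte[] data) throws IOException {
        int position = HEADER_LENGTH;
        if (count > 0) {
            raf.seek(last);
            position = last + 4 + raf.readInt();
        }
        raf.seek(position);
        raf.writeInt(data.length);
        raf.write(data);
        return position;
    }

    /** Step 2: registers the element by rewriting the 16-byte header in one write call. */
    private void commit(int newCount, int newFirst, int newLast) throws IOException {
        byte[] header = ByteBuffer.allocate(HEADER_LENGTH)
                .putInt((int) raf.length()).putInt(newCount)
                .putInt(newFirst).putInt(newLast).array();
        raf.seek(0);
        raf.write(header);
        count = newCount;
        first = newFirst;
        last = newLast;
    }

    /** A full add is step 1 followed by step 2. */
    void add(byte[] data) throws IOException {
        int position = writeElementOnly(data);
        commit(count + 1, count == 0 ? position : first, position);
    }

    /** Returns the oldest element without removing it, or null when empty. */
    byte[] peek() throws IOException {
        if (count == 0) {
            return null;
        }
        raf.seek(first);
        byte[] data = new byte[raf.readInt()];
        raf.readFully(data);
        return data;
    }

    int size() {
        return count;
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
```

Calling `writeElementOnly` and then closing the file imitates a crash between the two steps: on reopening, `size()` still reports the old count, and the next `add` simply overwrites the orphaned bytes.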
To improve multithreading support and to use it with RxJava (which we adore), we created our own wrapper around it and used reentrant locks to ensure reads and writes are synchronized and data remains consistent. This also protects us from concurrent-modification types of errors on the file.
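The locking idea can be sketched roughly as follows. This is a hypothetical stand-in, not our actual wrapper: it guards an in-memory `ArrayDeque` where the real code would guard the file-backed queue, and it leaves out the RxJava integration entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative wrapper: every read and write goes through one ReentrantLock. */
final class LockedEventQueue<T> {
    // Stand-in for the file-backed queue; ArrayDeque itself is not thread-safe.
    private final Deque<T> delegate = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();

    void add(T item) {
        lock.lock();
        try {
            delegate.addLast(item);
        } finally {
            lock.unlock(); // always release, even if the write throws
        }
    }

    /** Returns the oldest item without removing it, or null when empty. */
    T peek() {
        lock.lock();
        try {
            return delegate.peekFirst();
        } finally {
            lock.unlock();
        }
    }

    /** Removes and returns the oldest item, or null when empty. */
    T poll() {
        lock.lock();
        try {
            return delegate.pollFirst();
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return delegate.size();
        } finally {
            lock.unlock();
        }
    }
}
```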
OneView Client also has an Uploader, which schedules uploads and peeks data from the queue whenever required. The conditions for uploading and dispatching are all centrally controlled by the business via the CMS.
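One way such an upload loop could look is sketched below. It assumes an event is removed from the queue only after the sender reports success, so a failed upload leaves the event at the head for the next attempt. The `sender` callback and the `maxBatch` limit are hypothetical; the latter stands in for whatever conditions the CMS supplies.

```java
import java.util.Deque;
import java.util.function.Predicate;

/** Illustrative uploader: peek, send, and remove only after a successful send. */
final class UploaderSketch {

    /**
     * Uploads events in FIFO order until the queue is empty, the batch limit
     * is reached, or the sender reports a failure. Returns how many were sent.
     */
    static <T> int drain(Deque<T> queue, Predicate<? super T> sender, int maxBatch) {
        int uploaded = 0;
        while (uploaded < maxBatch) {
            T head = queue.peekFirst();       // look at the oldest event, keep it queued
            if (head == null || !sender.test(head)) {
                break;                        // nothing left, or the send failed: retry later
            }
            queue.removeFirst();              // safe to drop only after a successful send
            uploaded++;
        }
        return uploaded;
    }
}
```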
UI
The third part of our solution is the UI. Following our culture of clean architecture, we have intentionally kept the UI dumb and agnostic of the underlying architecture and the logic for data storage and persistence. It has one job, though, which is to keep posting data to OneView Client and let it handle everything else.
With the combination of MQTT and Tape, we were able to achieve what we wanted. Let’s see the final checklist and product manager’s face.
✔︎ All events should be persistent in the app until uploaded.
✔︎ Should work for both online-offline.
✔︎ The sequence of all the events should be maintained.
✔︎ Should not drain the device’s battery.
✔︎ APK size should not bloat.