A Technical Platform to Support Rapid Feedback Services

Charles Copley
Patient Engagement Lab
Oct 18, 2019

Authors: Charles Copley, Eli Grant, Monika Obrocka, Bearnard Hibbins, Nathan Begbie, Themba Gqaza, Susan Nzenze, Martin Odendaal, Erik Harding

Rapid feedback services have become possible in the public health sphere with the advent of digital platforms. Previously, the potential to respond to user behaviour and feedback in health systems was limited by the cost and time involved in data collection; with digital platforms, data collection can be automated and done in near real-time at low financial cost. Taking advantage of this opportunity by building continual improvement mechanisms that evolve to keep pace with user preferences is an essential part of reducing costs and making national health systems financially viable. Rapid feedback frameworks, in turn, are an essential component of such mechanisms.

This blog outlines a suite of components to support integrating rapid feedback into an operational system. These include:

  1. Scalability considerations
  2. Frameworks to support varied experimental designs
  3. Data protection considerations
  4. Real-time data visualisation
  5. Rapid data analysis architectures
  6. Artificial intelligence

Requirements

Scalability

Any rapid feedback service implemented on a digital platform has a number of technical requirements. For example, the underlying compute and hosting systems need to be architected in a way that is robust, maintainable, and able to scale to provide additional resources if and when the platform grows.

Flexibility to Support Multiple Experimental Design Frameworks

A system with capacity for continual improvement should be able to support a range of different experimental designs. A simple A/B comparison at the individual level is not sufficient for a health system, since outcome data are often collected at the facility level. A range of designs should be possible, including (but not limited to) simple random assignment across the entire population, cluster randomization by geography, and varied inclusion criteria. Certain adaptive designs (e.g. the Sequential Multiple Assignment Randomized Trial, or SMART) additionally require dynamic and continued re-assignment of individuals to different treatment conditions throughout an experiment.

Technical Support for ‘Nudges’

Any system designed to support experiments in behaviour change needs to provide technical capacity for evidence-based ‘nudges’. Typically these involve time-based reminders or financial incentives (e.g. airtime, data, or mobile-wallet top-ups).

Operational Machine Learning and Artificial Intelligence

A successful production machine learning model improves through iteration. The system must therefore be able to put machine learning algorithms into production in a stable, scalable, and reliable way. This includes easy access to exploratory data analysis, the ability to run large-scale queries on the data, training algorithms on dedicated hardware, and serving models through regular production releases using agile practices.

Data Protection

In the health space, data sovereignty is a key constraint. This makes cloud-based servers unfeasible in many countries, so bespoke systems need to be put in place. In addition, legal frameworks and processes need to be established to protect the rights of patients on the platform.

Real-time Monitoring

Real-time monitoring is critical to the success of any programme. This can take the form of dashboards for certain processes, while other processes may require automated alerts, or even actions taken automatically.

Capacity to Support Varied User Engagement Interfaces

Digital health communication modes are many and varied, and are a function of income levels, geography, and viral digital effects. In low- and middle-income countries, cost (both upfront, in the form of the handset/PC purchase, and ongoing, in the form of data purchases) is a primary driver of decision making. Users are often extremely price sensitive, with anecdotal evidence suggesting that many will switch out SIM cards in their phones to take advantage of time-of-day price differentials. Until recently this meant that SMS/text-based communication was the most utilised: handset costs were low, resulting in high penetration, and the cost of communication could be borne entirely by the provider, including reverse-billing for replies. More recently, handset costs have dropped significantly, making IP messaging systems (e.g. WhatsApp, Telegram, Facebook Messenger, Android RCS) more viable from a penetration perspective. Any forward-looking system should support these communication modes.

Deployed System

Scalable technical system

A block diagram of the system is given below. In this section we describe the different components.

Platform hosting

The rapid feedback framework is hosted on a cluster of virtual machines (VMs). This approach allows the system to recover easily from hardware failure and has the added benefit of simplifying redeployment. We currently run Mesos (DC/OS) with Marathon as the orchestration engine; however, we plan to move to Kubernetes in the near future. The advantage of such a system is that all processes can be defined in version control, allowing the system to be replicated with very little effort. We would very much like to provide a reusable packaged system so that other organisations can run a similar platform, and are looking for funding to do so.
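To make this concrete, here is a minimal sketch of pushing a version-controlled service definition to Marathon through its REST API. The Marathon URL, app id, and image tag are illustrative assumptions rather than our production configuration.

```python
# A sketch of deploying a version-controlled app definition to Marathon.
# The URL, app id, and image tag below are illustrative assumptions.
import json
import requests

MARATHON_URL = "http://marathon.internal:8080"  # assumed internal endpoint

app_definition = {
    "id": "/rp-sidekick",  # hypothetical app id
    "cpus": 0.5,
    "mem": 512,
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/rp-sidekick:1.0.0"},  # assumed image tag
    },
}

# POST /v2/apps creates the application; Marathon keeps it running and
# reschedules it onto a healthy VM if the underlying hardware fails.
response = requests.post(f"{MARATHON_URL}/v2/apps", json=app_definition)
response.raise_for_status()
print(json.dumps(response.json(), indent=2))
```

Because definitions like this live in Git alongside the code, rebuilding the platform is largely a matter of replaying them.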

Software Collaboration and Continuous Integration

We have developed all systems using a distributed version control system underpinned by Git, with code hosted on GitHub. Our primary software output has been RP-Sidekick, which provides a framework for adding ancillary services (e.g. monetary incentives, REDCap integration) to the RapidPro message sender. Other analytical outputs (e.g. the sampling system) have also been handled using Git, though they have not followed the same development cycle.

Our software development cycle focuses on continuous integration, in which a suite of automated tests is run before changes are integrated. We have aimed for high test coverage in most of the software products developed (e.g. RP-Sidekick has 96% test coverage). While requiring more up-front effort, this way of working allows for system upgrades with minimal disruption.
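For illustration, the tests in such a suite look roughly like the following; the /health/ endpoint is a hypothetical example rather than an actual RP-Sidekick route.

```python
# A sketch of the style of automated test run on every change, using Django's
# built-in test client. The /health/ endpoint is hypothetical.
from django.test import TestCase


class HealthCheckTests(TestCase):
    def test_health_endpoint_returns_ok(self):
        # The CI pipeline fails the build if any such assertion fails.
        response = self.client.get("/health/")
        self.assertEqual(response.status_code, 200)
```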

Experimental Design Support

We have supported the following experimental designs with this framework:

  1. Individual Randomized Control Trial
  2. Cluster Randomized Control Trial
  3. Sequential Multiple Assignment Randomized Trial (SMART)
  4. Experiment as a Market (EXAM)
  5. Factorial Design
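To illustrate the simpler end of this range, here is a minimal sketch of individual versus cluster (facility-level) random assignment. The data, column names, and fixed seed are illustrative, not taken from a real trial.

```python
# Contrast individual and cluster random assignment on a toy cohort.
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # fixed seed makes assignment reproducible

participants = pd.DataFrame({
    "participant_id": range(8),
    "facility": ["A", "A", "B", "B", "C", "C", "D", "D"],
})

# Individual RCT: each participant is assigned independently.
participants["arm_individual"] = rng.choice(
    ["control", "treatment"], size=len(participants))

# Cluster RCT: each facility is assigned as a unit, which matches outcome
# data that are collected at the facility level.
facilities = participants["facility"].unique()
assignments = rng.choice(["control", "treatment"], size=len(facilities))
participants["arm_cluster"] = participants["facility"].map(
    dict(zip(facilities, assignments)))

print(participants)
```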

Technical Support for ‘Nudges’

We developed a scalable software platform, RP-Sidekick, that integrates with our message sender RapidPro. The system is built on the Django web framework and additional services can be added with relative ease. As of now we have built the following integrations:

  1. External data collection platform reminders (REDCap)
  2. Airtime and Data bundles (TransferTo)

We are also in the process of assessing an integration with a mobile wallet provider to provide direct cash transfer incentives.
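As a rough sketch of what such an incentive integration looks like, the helper below shows the overall shape; `AirtimeProvider` and its `topup` method are hypothetical stand-ins for a real aggregator client, not actual RP-Sidekick code.

```python
# A hedged sketch of triggering an airtime 'nudge' after a survey completion.
# AirtimeProvider and topup() are hypothetical stand-ins for a real aggregator
# client (such as TransferTo's), not actual RP-Sidekick code.
class AirtimeProvider:
    """Hypothetical wrapper around an airtime aggregator's HTTP API."""

    def topup(self, msisdn: str, amount: float, currency: str) -> bool:
        raise NotImplementedError("replace with the real aggregator call")


def reward_completion(msisdn: str, provider: AirtimeProvider) -> None:
    # Send a small fixed airtime incentive once the participant completes
    # the survey; failures would be logged and retried in production.
    if provider.topup(msisdn, amount=5.0, currency="ZAR"):
        print(f"airtime sent to {msisdn}")
```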

Analytics Architecture

The capacity to conduct statistical analyses on the data generated during experimentation is extremely important, but needs to be balanced against data protection and operational requirements. For example, having individuals run analyses on their own PCs creates difficulties with reproducibility (individuals may end up with slightly different versions of the analytics software on their machines) as well as data protection (copying data onto personal machines would be extremely difficult to police). To avoid these problems we set up an internal analytics infrastructure that uses Docker containers to launch different services (e.g. Python, R). This gives us a centrally maintained analytics image, ensuring that all users operate from the same analysis platform. It also allows analyses to run against our internal databases, providing a much higher level of security.
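A minimal sketch of launching such a container with the Docker SDK for Python is shown below; the image name and exposed port are assumptions for illustration.

```python
# Launch the centrally maintained analytics image so every analyst works from
# identical software versions. Image name and port are illustrative.
import docker

client = docker.from_env()

container = client.containers.run(
    "pel/analytics:2019.10",   # hypothetical pinned analytics image
    detach=True,
    ports={"8787/tcp": 8787},  # e.g. expose a browser-based IDE
)
print(f"analytics environment running in container {container.short_id}")
```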

Operational Machine Learning and Artificial Intelligence

In order to support natural language processing operations (e.g. semi-automated responses, message prioritisation) we deployed TensorFlow Serving as our model serving architecture. It scales well and supports all TensorFlow models; other types of ML models can also be served under this architecture. We have currently deployed an FAQ matching model and have assessed the operational aspects of the system (Daniel, Brink, Eloff, & Copley, 2019).
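For illustration, querying a model hosted by TensorFlow Serving over its REST API looks roughly like this; the host, model name, and input encoding are assumptions, since the real FAQ matching model expects its own feature format.

```python
# Query a model served by TensorFlow Serving over its REST predict endpoint.
# Host, model name, and the dummy input are illustrative assumptions.
import requests

SERVING_URL = "http://tf-serving.internal:8501/v1/models/faq_matcher:predict"

payload = {"instances": [{"question_embedding": [0.1, 0.2, 0.3]}]}
response = requests.post(SERVING_URL, json=payload)
response.raise_for_status()
print(response.json()["predictions"])
```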

Data storage

All data are stored in a Postgres database, with access to external databases provided via the Postgres Foreign Data Wrapper (postgres_fdw). This allows granular control over data access permissions. Where absolutely necessary, we have also shared encrypted, anonymised data with partners. When doing so, we have ensured that the data are stored on a suitably secure server, and have only ever shared anonymised data (Copley & Grant, 2018b).
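As a sketch of how a remote table can be exposed this way, the DDL below wires up postgres_fdw; all hostnames, roles, and table names are placeholders for illustration.

```python
# Expose a remote RapidPro table to analysts via postgres_fdw. All names and
# credentials below are placeholders.
import psycopg2

DDL = """
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER rapidpro_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'rapidpro.internal', dbname 'rapidpro');

CREATE USER MAPPING FOR analyst
    SERVER rapidpro_db
    OPTIONS (user 'readonly', password 'secret');

-- Import only the tables this role is permitted to see.
IMPORT FOREIGN SCHEMA public LIMIT TO (contacts)
    FROM SERVER rapidpro_db INTO analytics;
"""

with psycopg2.connect("dbname=analytics user=admin") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```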

Real-time Monitoring

Any system of this scale requires extensive monitoring across all its components. At its simplest, monitoring means providing the capacity to detect errors and system failures; this is inherently fragile, since it requires a human operator to keep an eye on things. A more advanced system alerts operators to potential errors automatically. This is superior to manual watching, but runs the risk of over-reporting, which reduces the likelihood that an actual problem will be acted on. We have therefore been deliberate in how we monitor the various sub-components.

Long-term systems are monitored using alerts generated by sentry.io and Grafana, covering technical aspects like disk space, network accessibility, and system up-time. For shorter experiments we have used a combination of the above, together with visualisations of the experimental state generated in Redash. Typical plots include the number of consents/interactions in a given experiment and counts of message statuses. A human ‘operator’ is responsible for keeping track of these systems during the experiment, with the added benefit that early results can be shared with partners before a rigorous analysis is conducted.

Typical dashboard
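On the alerting side, wiring a service into Sentry takes only a few lines; the DSN and the batch job below are placeholders.

```python
# Report failures to Sentry so operators are alerted to real problems rather
# than having to watch a dashboard. The DSN and job are placeholders.
import sentry_sdk

sentry_sdk.init(dsn="https://examplePublicKey@sentry.example.com/1")


def send_reminders():
    raise RuntimeError("simulated failure in a batch job")


try:
    send_reminders()
except Exception:
    # Unhandled exceptions are captured automatically once init() has run;
    # capturing explicitly here keeps the example self-contained.
    sentry_sdk.capture_exception()
```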

Rapid Analysis

Data visualisation is important, but we also required the ability to conduct statistical analyses. Given the rapid nature of the tasks, we needed a system that facilitated quick and easy analysis. Typical bottlenecks are:

  1. Data sharing
  2. Non-reproducible analysis environments
  3. Lack of reusable analysis software

To this end we deployed an online analysis environment based around RStudio Server. This provides a way of integrating with our data systems in a secure fashion, and ensures that all analyses are carried out using the same software and libraries.

For more on this, read our post “Collaborative research at Patient Engagement Lab”.

Anonymisation frameworks

Identifiable data collected as part of a programme cannot be shared with external parties without the explicit consent of the participants. It is, however, possible to share anonymised data. We followed the data sharing recommendations of the European GDPR (Copley and Grant 2018b; El Emam and Dankar 2008; Information Commissioner’s Office 2012), specifically using local suppression. This best-practice definition and the recommendations form part of our data sharing agreement. See our post “How PEL anonymized patient data” for more details.
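A minimal sketch of local suppression, in the spirit of those recommendations: rows whose quasi-identifier combination occurs fewer than k times have those fields blanked, rather than dropping whole columns or records. The columns, data, and k below are illustrative.

```python
# Local suppression for k-anonymity: blank quasi-identifiers only in rows
# whose combination is rarer than K. Data, columns, and K are illustrative.
import pandas as pd

K = 2
QUASI_IDENTIFIERS = ["age_band", "district"]

records = pd.DataFrame({
    "age_band": ["20-29", "20-29", "30-39", "30-39", "40-49"],
    "district": ["East", "East", "West", "West", "North"],
    "response": ["yes", "no", "yes", "no", "yes"],
})

# Size of each quasi-identifier group, aligned back to the individual rows.
group_sizes = records.groupby(QUASI_IDENTIFIERS)["response"].transform("size")

# Suppress identifying fields in rare rows; the 40-49/North row is unique,
# so its quasi-identifiers are blanked while its response is retained.
records.loc[group_sizes < K, QUASI_IDENTIFIERS] = None
print(records)
```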

Case Studies

Please follow the links:

  1. The impact of incentives on SMS text message survey response rates and answer validity on an mHealth platform
  2. Towards automating digital maternal health-care in South Africa

References

Copley, Charles, and Eli Grant. 2018a. “Incentives and Reliability: 2 — Fixed vs Lottery Incentives.”

Copley, Charles, and Elizabeth Grant. 2018b. “Definition of Anonymous Data for Praekelt Service Provider Agreements.” Praekelt.org.

Copley, C., M. Obrocka, E. Grant, and T. Gqaza. 2019. “Natural Language Processing System for MomConnect Helpdesk.” In Prep.

Daniel, Jeanne E., Willie Brink, Ryan Eloff, and Charles Copley. 2019. “Towards Automating Healthcare Question Answering in a Noisy Multilingual Low-Resource Setting.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 948–53. Florence, Italy: Association for Computational Linguistics.

El Emam, Khaled, and Fida Kamal Dankar. 2008. “Protecting Privacy Using K-Anonymity.” Journal of the American Medical Informatics Association: JAMIA 15 (5): 627–37.

Engelhard, Matthew, Charles Copley, Jacqui Watson, Yogan Pillay, Peter Barron, and Amnesty Elizabeth LeFevre. 2018. “Optimising mHealth Helpdesk Responsiveness in South Africa: Towards Automated Message Triage.” BMJ Global Health 3 (Suppl 2): e000567.

Information Commissioner’s Office. 2012. “Anonymisation. Managing Data Protection Risk Code of Practice.” Information Commissioner’s Office.
