What’s in store for the future of healthcare data?

Some key findings from “Better, Broader, Safer: Using Health Data for Research and Analysis”

Weavechain
Jun 2, 2022

Introduction

Before the digital revolution, health data was stored in physical files. Now, we have millions of data points for billions of people across the world with patient history, treatments, referrals, and diagnostics.

This data is currently stored in inefficient databases and legacy systems that make it difficult to extract useful information. The National Health Service (NHS), England’s healthcare system, commissioned a report on how to solve these issues with healthcare data, led by Professor Ben Goldacre.

Goldacre is an expert in this field: he is Professor of Evidence-Based Medicine and Director of the Bennett Institute for Applied Data Science at the University of Oxford. His books on data and medicine have sold over 700k copies, and he has created open-source technology used by thousands of researchers.

Commissioned by the NHS, Goldacre set out to identify the next generation of data practices, tools, and ideals that would make the healthcare system better. The 112-page report draws on thousands of user interviews, work with practitioners in the field, and more. It outlines principles, systems, and goals for the future of healthcare data. You can read the full paper here, but this article gives a general summary of the paper and then looks at two specific recommendations from the team: Trusted Research Environments (TREs) and Reproducible Analytical Pipelines (RAPs).

Finally, we’ll discuss the technology we are building at Weavechain to support solutions like those outlined in the NHS paper. More on that below.

Summary

The main goal of the paper was to find ways to deliver better, broader, and safer use of the NHS data for analysis and research. The NHS has millions of data points. The issue is how we can leverage this data effectively to find insights.

Currently, data platforms are fragmented, and preparing data for research is difficult. The goal is to invest in a small number of secure platforms to unlock the potential of NHS data. The full text gives detailed technical advice and recommendations that address the many complexities of healthcare data. The NHS said it would be willing to invest up to £200 million in Trusted Research Environments, which is less than the cost of digitizing one hospital.

Security is crucial on these new platforms because they hold sensitive health data. The aim is therefore to build a few centralized platforms, which avoids duplicating risk across many systems, while keeping the code open and reusable. The focus should be on developing a strong shared system rather than many small, isolated projects.

The rest of this article focuses on two key areas of the paper: Trusted Research Environments (TREs) and Reproducible Analytical Pipelines (RAPs).

Trusted Research Environments (TREs):

Medical research faces many problems: higher costs from multiple technical implementations, database monopolies, and the inability to reuse code. However, one of the most critical issues is security when sensitive data moves between locations.

The purpose of the TREs is to create an ecosystem where researchers can securely work with data on-site using standardized environments for sharing code and practices. Some benefits of TREs include protecting the patient’s privacy, wider access to more data points, and open and collaborative working methods.

To implement TREs effectively, three key components need to be in place:

  • A service wrapper, which would serve as the common framework used by all TREs. This would allow for an open, collaborative system to be implemented.
  • An underlying generic computation and database service to ensure a robust open-source environment (for example, Python or R combined with SQL).
  • A software program that works with NHS data, with emphasis on an open and shared approach to all code and technical documentation.

A live example of a TRE can be found on the official NHS site.
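To make the idea of working with data in place more concrete, here is a minimal Python sketch. It is a toy illustration, not how any NHS TRE is built: the class name `ToyTRE`, its `run` method, and the rule that only a single number may leave the environment are all assumptions made for this example.

```python
from typing import Callable, List, Union

Number = Union[int, float]


class ToyTRE:
    """Hypothetical stand-in for a TRE: the rows never leave this object."""

    def __init__(self, rows: List[dict]):
        # Keep a private copy of the sensitive rows.
        self._rows = [dict(r) for r in rows]

    def run(self, analysis: Callable[[List[dict]], Number]) -> Number:
        # The researcher's code comes to the data, not the other way around.
        # The analysis receives copies, so it cannot modify the stored rows.
        result = analysis([dict(r) for r in self._rows])
        # Assumed output rule for this sketch: only one numeric summary leaves.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            raise TypeError("only a single numeric summary may leave the environment")
        return result


def mean_age(rows: List[dict]) -> float:
    """Example analysis a researcher might submit."""
    return sum(r["age"] for r in rows) / len(rows)


tre = ToyTRE([
    {"patient_id": "p-001", "age": 40},
    {"patient_id": "p-002", "age": 50},
    {"patient_id": "p-003", "age": 60},
])
average = tre.run(mean_age)
```

A real environment would need far stronger controls than a type check, but the shape is the same: analysis code is submitted to the platform, and only approved outputs come back.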

Reproducible Analytical Pipelines (RAPs):

Reproducible Analytical Pipelines are compilations of ideas and actions that promote best practices for modern, open collaborative work with data. Essentially, they serve to make data analysis easier, faster, and more efficient.

The report talks about minimizing wasted time and reducing human error. For example, current systems rely on drag-and-drop operations, and recurring tasks have to be done manually. Instead, new software should provide open-source functions that anyone can use, with scripts written for commonly used actions.

Open source is a huge theme throughout the paper, and RAPs are no exception. Building on open-source software, such as Python or R, allows for increased collaboration and for sharing tools and functions that can be used universally. In addition, all pipelines should be open source, with clear documentation, so anyone can review and reuse them.
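As a small illustration of replacing a manual, recurring task with a reusable script, the sketch below counts records per month from CSV text. The function name `monthly_counts`, the column names, and the sample rows are invented for this example and do not come from the report.

```python
import csv
import io
from collections import Counter


def monthly_counts(csv_text: str, date_field: str = "date") -> dict:
    """Count records per calendar month (YYYY-MM) from CSV text.

    Assumes the date column holds ISO dates such as 2022-01-19.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[date_field][:7] for row in reader)
    # Sort by month so the output is identical on every run.
    return dict(sorted(counts.items()))


# Made-up sample data, for illustration only.
SAMPLE = """date,event
2022-01-03,referral
2022-01-19,diagnostic
2022-02-07,referral
"""

counts = monthly_counts(SAMPLE)
```

Because the whole step lives in one documented function, anyone can rerun it on new data and get the same result, which is the point of a reproducible pipeline.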

How does Weavechain support these solutions?

Weavechain is building an enterprise-grade application that gives big data Web3 properties without the data ever leaving its home database. Using its smart hashing algorithm, which works with any blockchain, Weavechain certifies that the data is immutable and offers data lineage, so users get data they can rely on. Robust, granular permissioning makes it simple to grant and revoke access to whole datasets or parts of them, depending on the user’s needs. To ensure computation results are verifiable, multiple participants in sidechains perform the same calculations.
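The general idea behind hash-based tamper evidence and repeated computation can be sketched in a few lines of Python. This is not Weavechain’s algorithm or API: SHA-256 stands in for the hashing step, a plain variable stands in for the value recorded on a blockchain, and the function names are hypothetical.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Return a SHA-256 hex digest of a record in canonical JSON form."""
    # Sorting keys makes the digest independent of key order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(record: dict, anchored_digest: str) -> bool:
    """Check a record against a digest that was stored earlier."""
    return record_digest(record) == anchored_digest


def agreed_result(results: list):
    """Return the shared value if every participant reported the same result."""
    if not results:
        raise ValueError("no results to compare")
    first = results[0]
    if any(r != first for r in results[1:]):
        raise ValueError("participants disagree; result is not verified")
    return first


# Made-up record; the digest stands in for a value written to a blockchain.
record = {"patient_id": "p-001", "diagnosis": "asthma"}
anchored = record_digest(record)

# Any later edit to the record changes its digest.
tampered = dict(record, diagnosis="none")
```

Only the digest needs to be published, so the record itself can stay in its home database while anyone holding the digest can still detect a later change.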

Take health data, for example. Using confidential computing together with the features described above, Weavechain gives researchers access to data that they can trust has not been tampered with, because it is immutably stored with recorded lineage. This gives researchers safe access to sensitive health information. The NHS paper stresses the importance of keeping healthcare data safe and immutable, and Weavechain meets that condition by giving data Web3 properties. Weavechain establishes a TRE for researchers to work within, and additionally establishes the RAP for anything that flows through its node. Data can be shared with anonymized or redacted sections, and access can be granted with the most granular of permissions and revoked at any time to ensure full privacy protections and security for sensitive data.
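Field-level permissions with redaction can be sketched as follows. This is a minimal illustration under stated assumptions, not Weavechain’s permission model: the class `FieldAccess`, its methods, and the `[REDACTED]` placeholder are all hypothetical.

```python
class FieldAccess:
    """Hypothetical per-user, per-field permission table."""

    def __init__(self):
        self._grants = {}  # maps a user name to the set of fields they may read

    def grant(self, user: str, *fields: str) -> None:
        self._grants.setdefault(user, set()).update(fields)

    def revoke(self, user: str, *fields: str) -> None:
        if fields:
            # Revoke only the named fields.
            self._grants.get(user, set()).difference_update(fields)
        else:
            # No fields named: revoke everything for this user.
            self._grants.pop(user, None)

    def view(self, user: str, record: dict) -> dict:
        """Return the record with every non-permitted field masked."""
        allowed = self._grants.get(user, set())
        return {k: (v if k in allowed else "[REDACTED]") for k, v in record.items()}


# Made-up record and user, for illustration only.
record = {"nhs_number": "000 000 0000", "diagnosis": "asthma"}

access = FieldAccess()
access.grant("researcher", "diagnosis")
before = access.view("researcher", record)

access.revoke("researcher")
after = access.view("researcher", record)
```

Here `before` exposes only the diagnosis, and `after` masks every field once access is revoked.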

Conclusion

Overall, the NHS report serves to address many of the problems that we see in healthcare data today. Its main emphasis is on the creation of a few centralized systems that serve the needs of researchers and data scientists for effective collaboration through open-source methods. Weavechain looks forward to supporting such critical solutions with seamless integration so that large institutions like the NHS can enjoy the benefits of Web3 without dramatic changes to their existing operations.

Originally published by Kushagra Aryal at https://www.weavechain.com on June 2, 2022.
