Should Health Systems Own and Sell Your (De-Identified) Data?

Bethany Doran
CITRIS Policy Lab

--

Healthcare generates a massive amount of data, currently accounting for roughly 30% of the world's data volume. Although some of this data comes from wearables and other IoT devices in the hands of consumers, health systems hold massive troves of their own. Each time a patient sees a provider or goes to a hospital, data ranging from vital signs, demeanor, and physical exam findings to personal medical history, including sensitive topics such as drug or alcohol use and sexual history, is recorded.

The vast majority of this health data is held within electronic medical records (EMRs), with 88% of healthcare providers reportedly using an EMR system during their patient encounters. At their inception, EMRs were intended to act as a sort of 'container' for health data, helping healthcare providers easily look up a patient's past records and perform functions such as writing notes or efficiently billing for services. However, as the potential for big data to accelerate the identification of pharmacologic targets and their therapies has begun to emerge, other potential uses of this data are coming into sharper focus, including analytics for improving outcomes within systems and the sale of patient data.

What to do with this data remains a challenge for medical systems, which largely operate as not-for-profit entities designed to serve patients. Although some health systems have steering committees to guide use of the data and to evaluate its potential benefits and the financial compensation it commands, committee decisions are often made by a small number of faculty members and hidden from public view. This can create different standards for sharing data with external companies, as well as opacity in the types of deals that health systems are willing to make with their corporate associates.

Consent for the sharing of patient data can also pose a challenge, and may be reduced to a short sentence that a patient signs when first interacting with a health system. The consent may note that their data can be shared with affiliates for research purposes or to provide care, and is often buried within many dense lines of technical jargon. Patients may also choose to participate in biobanks, which collect biological specimens as part of research endeavors and may or may not include language covering the sale of their data to other companies. Patients being treated for a medical condition are unlikely to be aware that this data may be de-identified, aggregated, and sold to external companies. Thus, consent poses a challenge to health systems that may initially use data for the provision of care but later identify uses that were not envisioned when the patient first signed the consent form. When to re-consent patients may also be challenging and left to a researcher's discretion depending on the language of the initial consent, as re-consent can be time-consuming and result in participant loss.

The relationship between health systems and private industry is not new: funding from and collaborations with pharmaceutical companies have long played a major role in therapeutic advancement and the academic mission of health systems. However, the entry of direct-to-consumer technology companies like Apple and Google, and genomics companies such as 23andMe, into deals with health systems has started to push the boundaries of pre-existing models of collaboration. Privacy laws differ between covered entities (defined as healthcare providers, health plans, and healthcare clearinghouses) and healthcare technology companies, a distinction with far-reaching implications for individuals, who may have more rights surrounding their data as consumers than as patients.

Data collected by consumer-facing corporations and digital health companies are governed by modern data privacy laws, often intended to protect consumers' digital rights by offering transparency into the types and purposes of data gathered about them, as well as the right to opt out of its sale and to have it deleted. In the United States, the California Consumer Privacy Act (CCPA), later amended by the California Privacy Rights Act (CPRA), was passed in response to consumer pushback against corporate data practices. The CCPA was reportedly inspired by a cocktail-party conversation in which a Google engineer told real estate developer Alastair Mactaggart, "if people knew how much we knew about them, they'd be really worried." It incorporates provisions similar to the European Union's General Data Protection Regulation (GDPR).

However, data from covered entities such as hospitals and health systems falls under a different data protection law than data held by non-covered entities such as health technology companies: the Health Insurance Portability and Accountability Act (HIPAA). HIPAA was enacted in 1996 to set standards for protecting health information while balancing the need for data to flow freely between health institutions and their many partners, ranging from health insurance claims processors to healthcare clearinghouses and data analytics partners. The law allows large tech companies, including 23andMe, Google, Apple, and others willing to pay to work with health systems, to access patient data in de-identified form.
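One common route to "de-identified" status under HIPAA is the Safe Harbor method, which removes eighteen categories of direct identifiers. A minimal sketch of the idea in Python follows; the field names are hypothetical, and a real pipeline would cover all eighteen identifier categories rather than the few shown here:

```python
# Illustrative sketch of HIPAA "Safe Harbor"-style de-identification:
# strip direct identifiers from a record before it is shared.
# Field names are hypothetical; a real pipeline must address all 18
# HIPAA-listed identifier categories, not just these few.

DIRECT_IDENTIFIERS = {
    "name", "address", "phone", "email", "ssn",
    "medical_record_number", "birth_date",
}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed and
    ages over 89 collapsed into one category, as Safe Harbor requires."""
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if isinstance(clean.get("age"), int) and clean["age"] > 89:
        clean["age"] = "90+"
    return clean

patient = {
    "name": "Jane Doe",            # fabricated example record
    "ssn": "123-45-6789",
    "age": 93,
    "diagnosis": "atrial fibrillation",
    "zip3": "941",                 # Safe Harbor keeps at most 3 ZIP digits
}
print(deidentify(patient))
# {'age': '90+', 'diagnosis': 'atrial fibrillation', 'zip3': '941'}
```

Note that the diagnosis and the coarsened ZIP and age survive de-identification; as discussed below, such residual attributes are exactly what makes re-identification possible.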

According to the company's recent S-1 prospectus filing, 23andMe has seen a significant revenue decline in its testing-kit business, possibly related to concerns about data privacy after the $300 million sale of consumer genomic data to AstraZeneca. The company has announced that it expects a larger proportion of its revenue to come from its pharmaceutical and discovery endeavors, and it notes on its website that it partners with health institutions and researchers to generate actionable insights. This data is likely shared between health systems and the company in de-identified form; however, health systems may earn significant money from sharing it without reporting those earnings to patients, and the company may gain access to significant amounts of data that consumers on its platform would not consent to providing. Despite de-identification methods, there remains significant concern that data stripped of common identifiers is not truly anonymous: investigators have shown that 99.98% of Americans can be re-identified using 15 demographic attributes even after 'de-identification'.
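The mechanism behind that re-identification figure is that a combination of ordinary attributes can be unique to a single person even when no name is attached. A toy illustration with fabricated records, sketching how one might measure what fraction of rows are uniquely pinned down by a set of quasi-identifiers:

```python
from collections import Counter

# Fabricated toy records: no names, yet a combination of quasi-identifiers
# (ZIP prefix, birth year, sex) can still single out one individual.

records = [
    {"zip3": "941", "birth_year": 1954, "sex": "F", "diagnosis": "CHF"},
    {"zip3": "941", "birth_year": 1954, "sex": "M", "diagnosis": "CAD"},
    {"zip3": "100", "birth_year": 1987, "sex": "F", "diagnosis": "asthma"},
    {"zip3": "100", "birth_year": 1987, "sex": "F", "diagnosis": "migraine"},
]

def unique_fraction(rows, quasi_identifiers):
    """Fraction of rows whose quasi-identifier combination appears only
    once -- each such row is re-identifiable by anyone who knows those
    attributes about a target individual."""
    key = lambda r: tuple(r[q] for q in quasi_identifiers)
    counts = Counter(key(r) for r in rows)
    unique = sum(1 for r in rows if counts[key(r)] == 1)
    return unique / len(rows)

# ZIP prefix alone singles out no one here, but adding birth year and
# sex makes half of the rows unique.
print(unique_fraction(records, ["zip3"]))                        # 0.0
print(unique_fraction(records, ["zip3", "birth_year", "sex"]))   # 0.5
```

With only four rows this is purely didactic, but the cited study shows the same effect at national scale: with 15 attributes, nearly every American is unique.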

These practices raise numerous ethical questions for health systems, which must balance financial solvency, the academic mission to develop novel insights and therapeutics for patients (which often requires significant collaboration with corporate partners), and the rights of their patients, who may be unaware that their data is being used for such collaborations. There is also the ethical and economic question of whether health systems that receive financial incentives for sharing de-identified patient data should be subject to a system of checks and balances that includes patient advocates to ensure patients' rights are upheld.

There needs to be more transparency and accountability in health systems surrounding patient data to ensure that the relationship between these systems and their patients is maintained. Patients may contribute their data willingly, hoping it will help others through the development of therapeutics, while remaining unaware of the financial gains the health system makes from it. Health systems should develop a unified framework for the economic implications of data sharing that is grounded in consent and transparency and sensitive to the changing landscape of privacy and security.

As with rights under other modern privacy laws, patients should be able to easily withdraw their data from the de-identified pool at any time, as well as understand who their data is being shared with and for what purpose. Although deals between pharmaceutical companies and health systems are considered trade secrets, there should be more transparency and accountability as to the frequency and magnitude of deals involving patient data. Ultimately, this will protect not only patients and consumers, but health systems and their corporate partners as well. Developing inclusive economic strategies that also benefit patients will improve trust between systems and patients, and encourage patients to continue sharing data with health systems and their partners. Health systems must always be cognizant that they rely on the trust of their patients. Doing everything possible to maintain that trust in the face of changing regulation, technologies, and economic opportunities is of paramount importance.

Bethany Doran MD/MPH is an entrepreneur and cardiologist. She is a data privacy and ethics advisor to the CITRIS Foundry at UC Berkeley and guest lecturer at Singularity University.
