The interplay between data-related harm and the secondary use of health data

Blanka Wawrzyniak
Published in Canvas
Jun 13, 2024
Courtesy Ewan Affleck & Ryan Hunter

Historically, the health industry has primarily focused on the harms associated with data privacy breaches, particularly those stemming from the over-disclosure of data. This article highlights the manifold forms of health data-related harm beyond privacy breaches, many of which have traditionally been overlooked. It emphasises that harm may arise not only from the over-disclosure of data but also from its underutilisation, which can adversely impact the secondary use of data — contributing to individual, population and health system-based harm.

The article was developed in connection with the ODI Research Fellow Scheme of Blanka Wawrzyniak, who has been working with the ODI on the secondary use of health data. It was co-authored by Blanka Wawrzyniak of the Instrat Foundation, Poland; Ewan Affleck of the College of Physicians & Surgeons of Alberta, Canada; and Eric Sutherland of the Organisation for Economic Co-operation and Development (OECD), France.

What is Health Data-Related Harm?

For decades there have been buoyant predictions about the capacity for data to revolutionise health and wellbeing. Health prophets have predicted that by enlisting genomics and personal health information, and powered by advanced computing and artificial intelligence, data can guide mankind to an enlightened state where most health concerns can be foretold and mitigated¹.

Yet even more modest predictions of data-driven improvement in health system function have often failed to materialise; as Klaus Hoeyer, professor of Medical Science and Technology Studies at the University of Copenhagen, laconically observed in his book Data Paradoxes, “there is something about healthcare in high income countries that is remarkably resistant to change.”²

It is probably safe to suggest that the relationship between the healthcare industry and data is fraught. Despite the potential value of data-driven healthcare, many nations around the globe have struggled to architect a sympathetic data environment that balances the rights, privileges and needs of individuals, populations, Indigenous peoples, governments, and private sector technology vendors in a manner that promotes the health and wellbeing of all parties. One plausible contributor to the lack of progress may be the absence of a clear articulation of potential harm arising from health data use.

Health data-related harm has been defined as the “damage suffered by individuals, populations or the health system arising from the use, non-use or misuse of health data”.³ Historically, much of the focus in the health industry on data-related harm has been on damage involving the “oversharing” of data in the context of privacy breaches. Jones et al. suggested in 2016 that the “under-sharing” of data can also be problematic, observing that harm arising from the under-sharing or non-use of data has been poorly studied, but is “nevertheless a real problem with widespread and serious, if largely unquantifiable, consequences.”⁴

This is borne out in the case of Greg Price, a 31-year-old Canadian pilot and engineer who died in 2012 while under the active care of the health service. Concerns with the case triggered an independent investigation by the Health Quality Council of Alberta, which concluded that poor practices with the sharing of Mr. Price’s health information were a significant contributor to his death. While Greg Price is still believed to be the only Canadian whose death is officially ascribed to poor health information management, a federal government expert advisory group acknowledged in 2022 that a failure to rectify deficits in health data design in Canada “risks continued escalation of health care costs, underperformance of health services and poor health outcomes including avoidable illness and death, low levels of innovation, perpetuation of health inequities, and ineffective responses to future public health threats.”⁵ In short, poor health data design in Canada is contributing to widespread harm to individuals and the healthcare system.

Domains of Health Data-related Harm

In the fall of 2023, a Canadian data think tank published a report, Interoperability Saves Lives, which proposes a comprehensive framework for health data-related harm.⁶ The authors posit that health data-related harm can arise from the use, non-use or misuse of health data, and can result from both the oversharing and the under-sharing of data. They separate types of harm into three overlapping domains: individual, population-based and health system harm. Within these domains, nine distinct categories of health data-related harm are described (table 1).

Table 1: Domains and categories of health data-related harm

Courtesy Interoperability Saves Lives⁷

The Health Data-related Harm framework establishes a comprehensive and concise continuum of harm that can arise from the use, non-use or misuse of health data. These forms of harm are interdependent, and can have a cumulative impact on the integrity of health system function and welfare of populations and individuals. Forms of harm are not hierarchical, but are interwoven in a matrix of factors that ultimately, either directly or indirectly, impact the health and wellbeing of individuals (figure 1).

Figure 1: Matrix of Individual, Population and System Health Data-related Harm

Courtesy Ewan Affleck & Ryan Hunter

Importantly, the framework articulates clear goalposts for the design of health data systems in the form of an accountability to minimise all categories of health data-related harm. This is relevant in an industry which arguably has approached the mitigation of harm arising from health data in an idiosyncratic, if not overtly disorganised, manner, neglecting some forms of data-related harm (such as those affecting data-driven health and wellbeing, cultural rights to health data, or health workforce burnout) while focusing predominantly on others, most notably privacy and security. The Health Data-related Harm framework is a call to the health industry to assume an evidential and balanced approach to all-cause health data-related harm.

Categories 5–9 of the Health Data-related Harm Framework (table 1) principally refer to uses of health data that extend beyond the rights or needs of the individual about, or for whom, the health data were collected, often referred to as the secondary use of data. The secondary use, or analysis of pooled health data, can provide powerful insights to guide health sector decisions and benefit the public good. Health research, public and population health, health management, and health sector innovation are among the domains that can benefit from the study of personal health information. The secondary use of health data has been defined as the use of “personal health information (PHI) for purposes outside of direct health care delivery”.⁸

Data Protection Laws and Health Data Reuse: A Love-Hate Relationship

An evaluation by the World Health Organization of the capacity of European nations to support the secondary use of health data revealed “obstacles in sharing and using health data, and consequently, the production of health statistics as a result of data protection frameworks that are not appropriately geared to enable secondary use of health data”.⁹ Canada and other nations face similar obstacles to the secondary use of health data.¹⁰ ¹¹

Some countries have introduced national policies and regulations which apply in addition to the GDPR in order to refrain from overreliance on the data protection regulation and the consent mechanism.¹² However, most EU Member States enforce very stringent rules regarding the processing of health data, citing privacy protection as a priority.¹³ This privacy-related bias serves as an obstacle to the legitimate secondary use of data for purposes such as research and innovation.¹⁴ Notably, the risk-averse stance, particularly prevalent in the EU, often leads to an excessive reliance on consent, even when data protection regulation does not mandate it.¹⁵

What is more, stakeholders in the EU emphasise that the disparity in GDPR interpretation across Europe and the presence of additional national regulations complicate the exchange of health data for secondary use between Member States and associated countries.¹⁶ Another issue is the absence of a common European interpretation of what constitutes “sufficient anonymisation” to transform personal data into non-personal data. Because of this lack of clarity, stakeholders in the EU report that they are often obliged to treat all data as personal data.¹⁷ Consequently, institutions adopt an overly cautious approach in applying regulations such as the GDPR, leading to the denial of requests for sharing health data for secondary purposes, even when data privacy and ethical standards are met.

There are derogations from the general ban on processing sensitive data that reference the notion of “public interest”. For instance, processing may be lawful if it is considered “necessary for reasons of substantial public interest” or “necessary for reasons of public interest in the area of public health, such as safeguarding against serious cross-border health threats or ensuring high standards of quality and safety in healthcare, medicinal products, or medical devices” (a justification notably relevant during the pandemic). However, this type of processing needs to be proportionate to the aim pursued, and the public interest must be substantial, aiming to strike a balance between opportunities and risks associated with processing health data.

Furthermore, the application of these derogations is made more challenging by the fact that the exact distinction between a “substantial” public interest and a “normal” one remains undefined.¹⁸ When faced with such ambiguities, data controllers (e.g. public institutions) often find it prudent to determine that secondary use is not in the public interest unless it clearly benefits every member of the public. Since meeting such a criterion is nearly impossible, this presents a significant barrier to the secondary use of health data.

The over-reliance on data protection laws, along with discrepancies in GDPR interpretation, makes it particularly challenging to navigate the rules governing the use of health data for secondary purposes, both within and between EU countries. Interestingly, there are cases where a nation effectively handles data protection legislation but still struggles to efficiently reuse data. In Canada, while there are robust legislative and regulatory mechanisms to ensure the protection of health data privacy, there is a virtual absence of public policy or regulatory levers that protect individuals from other forms of health data-related harm.¹⁹ Instead of prioritising absolute privacy protection, which often slows down or obstructs research and innovation, decisions regarding the secondary use of health data should be based on balancing the benefits against the risks²⁰ (including the full spectrum of data-related harms).

How a Lack of Trust Derails Data Use

The risk-averse culture toward health data sharing is particularly visible in countries where there is a significant lack of trust in public authorities and where incidents involving health data breaches have occurred.²¹ For instance, the NHS’s new data platform in the UK currently faces the risk of derailment due to a lack of public trust, stemming from past controversies and patients’ inability to opt out of sharing their personal medical records. While NHS England argues that an opt-out option is not necessary as patient data will be anonymised before sharing, guaranteeing their identity remains protected, the mandatory data-sharing system has triggered considerable backlash.²²

Similarly, in Poland, citizens’ reluctance to share personal data with public authorities became particularly evident in 2020 when public opinion resisted a new digital health system proposed by the Ministry of Health.²³ The objective of this initiative was to establish a digital medical information system mandating all entities providing medical services to input data on various medical events, including pregnancy. While the legislative proposal was based on recommendations from the European Commission, the local context (concerns related to the abortion ban, cases of data misuse, and surveillance) led people to perceive the regulation as an attempt to exert control over Poles. This shows that data breach incidents involving institutions foster a public perception that data misuse is a prevalent risk, overshadowing the potential benefits of using data for the public good.

It is evident that regulatory barriers, combined with a lack of understanding of data protection measures and lack of public trust (which may manifest as fear), can have detrimental effects on the secondary use of health data, leading to data-related harm.

Understanding the Full Spectrum of Health Data-Related Harm

As detailed earlier, data-related harm can be categorised into three main areas: individual, population-based, and health system harm. While all three are significant and impact the health and well-being of individuals, the latter two are often absent from public debate concerning risks resulting from data sharing. To address this issue, it is necessary to enhance literacy in both public administration and the health sector regarding the scope and implications of health data-related harm. This discourse is hindered, however, because health data-related harm is a novel concept, and the prevalence and impact of most of its forms have not been studied in an intentional or comprehensive manner. Most evidence pointing to the impact of health data-related harm is anecdotal, or must be inferred from studies focused on adjoining topics like health data interoperability.

The literature on health data interoperability, a necessary condition of digital health data sharing, suggests that there is a relationship between the inability to effectively share health data and harm to individuals, populations, and health systems. Enhanced health data interoperability has been associated with decreases in mortality²⁴ and health system cost.²⁵ ²⁶ In Canada it has been observed that health inequities arise from poor health data access caused by limited internet connectivity among rural, remote and Indigenous peoples.²⁷ Compounding this, the OECD found that adult competency scores for digital literacy are lower among Indigenous and immigrant populations in Canada, thereby promoting health data-related inequities.²⁸ A lack of capacity to share data effectively was felt to have obstructed public health insights around COVID-19 in both Canada and the United States²⁹ ³⁰, potentially leading to population-based harm.

Shift to a Harms-based Process for Secondary Data Access and Use

Neglecting to utilise health data can impede medical research by hindering researchers’ access to comprehensive datasets crucial for identifying trends, patterns, and correlations essential for innovative therapies and addressing public health challenges. Limited access to health data can disrupt healthcare delivery systems and compromise patient safety, as healthcare providers may struggle to make well-informed decisions without complete patient information.

Health data reuse is also essential for public health surveillance efforts, enabling timely detection and response to disease outbreaks, monitoring chronic disease trends, and implementing preventive measures. Failure to leverage health data results in missed opportunities for personalised medicine and preventive treatment, depriving patients of tailored healthcare and interventions. Additionally, healthcare innovation heavily relies on health data reuse to provide insights into disease mechanisms, treatment efficacy, and patient outcomes, driving the development of new technologies, medical devices, and digital health solutions aimed at improving healthcare delivery and patient outcomes.

It is crucial to fully leverage the value of health data and dispel the misconception that data sharing inherently compromises sensitive data protection goals. When protective measures are adequately implemented, the risks of health data sharing become minimal, tipping the benefit-to-risk scale heavily towards the benefits.³¹ There are EU nations where this risk-averse stance is less prevalent and the GDPR is interpreted in a manner that permits the reuse of anonymised health data for research, diagnostics, and personalised healthcare endeavours. Notably, institutions like Findata, Finland’s Social and Health Data Permit Authority, serve as exemplars, showcasing that it is feasible to implement robust policies and models for secondary health data sharing that balance effectiveness with privacy safeguards.

This suggests that with a thoughtful approach to health data design and use, all forms of data-related harm can be minimised in concert; there need not be a tradeoff between harms arising from the oversharing and undersharing of health data. The aim is for a public policy framework to comprehensively address all potential data-related harms associated with health data, fostering an evidence-based and nuanced approach to its reuse. This approach seeks to strike a balance between sharing health data for beneficial purposes and ensuring that robust protections are in place.

  1. Samson, D. (2018). Aging will be a treatable disease within 12 to 18 years, according to timeline from Singularity University.
  2. Hoeyer, K. (2023). Data Paradoxes: The Politics of Intensified Data Sourcing in Contemporary Healthcare, (page 2).
  3. Affleck, E., Sutherland, E., Lindeman, C., Golonka, R., Price, T., Murphy, T., Williamson, T., Chapman, A., Layton, A., Fraser, C. (2024). Human Factor Health Data Interoperability. Healthc Pap.
  4. Jones, K. H., Laurie, G., Stevens, L., Dobbs, C., Ford, D. V., Lea, N. The other side of the coin: Harm due to the non-use of health-related data.
  5. Affleck, E., Castle, D., Stafford, D., Dewar, J., Harvey, M., Hoffman, S., Knoppers, B. M., Maybee, A., Mamdani, M., McGrail, K., Nesbitt, J., Neudorf, C., Rees, G., Smylie, J., Murphy, G. T., Tipples, G., Wolfson, M. (2022). Pan-Canadian Health Data Strategy: Toward a world-class health data system.
  6. Affleck, E., Murphy, T., Williamson, T., Price, R., Wolfaardt, U., Price, T., Layton, A., Hamilton, B., Dean, S., Frazer, C., Chapman, A., Shute, R., West., Denman, M., Golonka, R., & Lindeman, C. (2023). Interoperability Saves Lives.
  7. Affleck, E., Murphy, T., Williamson, T., Price, R., Wolfaardt, U., Price, T., Layton, A., Hamilton, B., Dean, S., Frazer, C., Chapman, A., Shute, R., West., Denman, M., Golonka, R., & Lindeman, C. (2023). Interoperability Saves Lives.
  8. Safran, C., Bloomrosen, M., Hammond, W. E., Labkoff, S., Markel-Fox, S., Tang, P. C., Detmer, D. E. (2007). Expert Panel. Toward a national framework for the secondary use of health data: an American Medical Informatics Association White Paper.
  9. World Health Organization (2022). Meeting on secondary use of health data.
  10. Affleck, E., Sutherland, E., Lindeman, C., Golonka, R., Price, T., Murphy, T., Williamson, T., Chapman, A., Layton, A., Fraser, C. (2024). Human Factor Health Data Interoperability. Healthc Pap.
  11. Areco, K. N., Konstantyner, T., Bandiera-Paiva, P., Balda, R. C. X., Costa-Nobre, D. T., Sanudo, A., Kiffer, C. R. V., Kawakami, M. D., Miyoshi, M. H., Marinonio, A. S. S., Freitas, R. M. V., Morais, L. C. C., Teixeira, M. L. P., Waldvogel, B., Almeida, M. F. B., Guinsburg, R. (2021). Operational Challenges in the Use of Structured Secondary Data for Health Research. Front Public Health.
  12. TEHDAS (2021). Summary of results: case studies on barriers to cross-border sharing of health data for secondary use.
  13. European Commission (2021). Assessment of the EU Member States’ rules on health data in the light of GDPR Specific Contract No SC 2019 70 02 in the context of the Single Framework Contract Chafea/2018/Health/03.
  14. Canadian Institutes of Health Research (2022). Secondary Use of Personal Information in Health Research: Case Studies.
  15. Wawrzyniak, B. (2024). From concerns to consent: addressing the issue of trust and other challenges impairing secondary use of health data in Poland and beyond.
  16. TEHDAS (2021). Summary of results: case studies on barriers to cross-border sharing of health data for secondary use.
  17. TEHDAS (2021). Summary of results: case studies on barriers to cross-border sharing of health data for secondary use.
  18. Global Alliance for Genomics and Health (GA4GH), Beauvais, M. (2021), GDPR Brief: the public interest and the GDPR.
  19. Affleck, E., Murphy, T., Williamson, T., Price, R., Wolfaardt, U., Price, T., Layton, A., Hamilton, B., Dean, S., Frazer, C., Chapman, A., Shute, R., West., Denman, M., Golonka, R., & Lindeman, C. (2023). Interoperability Saves Lives.
  20. Aymé, S., Choquet, R., Devillers, L., Gilard, M., Kelly-Irving, M., et al. (2023). Benefits and Risks of Using Health Data for Research: Report of the Health Data Hub Scientific Advisory Board-October.
  21. Wawrzyniak, B. (2024). From concerns to consent: addressing the issue of trust and other challenges impairing secondary use of health data in Poland and beyond.
  22. The Guardian, Campbell, D. (2023), NHS data platform may be undermined by lack of public trust, warn campaigners.
  23. Posner, L. (2022), “Poland’s New ‘Pregnancy Registry’ Raises Red Flags; Some Polish women feel their privacy and autonomy are on the line”, Think Global Health.
  24. Usher, M., et al. (2018). Diagnostic Discordance, Health Information Exchange, and Inter-Hospital Transfer Outcomes: a Population Study.
  25. Li, E., et al. (2022). The Impact of Electronic Health Record Interoperability on Safety and Quality of Care in High-Income Countries: Systematic Review.
  26. Canada Health Infoway (2023). Quantifying the Benefits of Digital Health Interoperability.
  27. Government of Canada, Canadian Radio-television and Telecommunications Commission (2023). Broadband Fund Closing the digital divide in Canada.
  28. Hafner, M., et al. (2022). The Potential Socio-economic Impact of Telemedicine in Canada.
  29. Bubela, T., Flood, C. M., McGrail, K., Straus, S. E., Mishra, S. (2023). How Canada’s decentralised covid-19 response affected public health data and decision making.
  30. Greene, D. N., McClintock, D. S., Durant, T. J. S. (2021). Interoperability: COVID-19 as an Impetus for Change.
  31. Aymé, S., Choquet, R., Devillers, L., Gilard, M., Kelly-Irving, M., et al. (2023). Benefits and Risks of Using Health Data for Research: Report of the Health Data Hub Scientific Advisory Board-October.
