The Future of Predictive Analytics in Healthcare

O'Reilly Media
9 min read · Jan 26, 2021


Data will always be the lifeblood of predictive analytics systems, but access to the right data can be challenging. Technology advances promise both to increase the sources of data and to extract more information from the data that is already available. While the rate of increase in compute power has struggled to keep up with Moore’s law in recent years, processors and processing techniques will continue to improve. And we will see predictive analytics increasingly being embedded in digital twins: real-time virtual representations of systems, devices, or even patients.

The use of predictive analytics in healthcare is still at an early stage. Further progress will come from developments in the technology’s enablers: data, processing power, and digital twins.

Digital Twins

A digital twin is a computational model of a physical object or system that can be adapted in real time to reflect changes in the physical counterpart. This real-time replication is supported by a data stream from sensors that are monitoring the physical system. Diagnostic, predictive, and prescriptive analytics can be applied to the digital twin to identify interventions that can be used to optimize the performance and reliability of the physical counterpart.
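
The loop described above (sensor readings update the model, and the model is then used for prediction and for choosing interventions) can be illustrated with a deliberately small sketch. It is not an architecture taken from this article: the class `ToyDigitalTwin`, its parameters, and the sensor values are all hypothetical, and Holt's linear exponential smoothing stands in for whatever physical or statistical model a real twin would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyDigitalTwin:
    """Hypothetical twin of one monitored quantity (e.g. a temperature)."""

    alpha: float = 0.5  # weight given to each new sensor reading
    beta: float = 0.5   # weight given to the newest change when updating the trend
    level: Optional[float] = None  # current estimate of the physical state
    trend: float = 0.0             # estimated change per reading

    def assimilate(self, reading: float) -> None:
        """Update the twin from one sensor reading (Holt's linear smoothing)."""
        if self.level is None:
            self.level = reading  # the first reading initializes the twin
            return
        previous = self.level
        self.level = self.alpha * reading + (1 - self.alpha) * (previous + self.trend)
        self.trend = self.beta * (self.level - previous) + (1 - self.beta) * self.trend

    def forecast(self, steps: int) -> float:
        """Predictive step: extend the current trend `steps` readings ahead."""
        if self.level is None:
            raise ValueError("no sensor data assimilated yet")
        return self.level + steps * self.trend

    def needs_intervention(self, limit: float, steps: int) -> bool:
        """Prescriptive step: flag an intervention if the forecast exceeds the limit."""
        return self.forecast(steps) > limit


twin = ToyDigitalTwin(alpha=1.0, beta=1.0)  # fully trust each reading, for clarity
for reading in [10.0, 12.0, 14.0]:          # made-up sensor stream
    twin.assimilate(reading)

print(twin.forecast(3))                   # 14 + 3 * 2 = 20.0
print(twin.needs_intervention(18.0, 3))   # True
```

In a real deployment the smoothing model would be replaced by a validated model of the asset or patient, and the limit would come from its specification rather than a constant.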

Outside the healthcare world, digital twins are a foundational technology for the industrial IoT. They are used to manage complex assets, from jet engines and cars to power stations and manufacturing processes. Healthcare companies are now at the early stages of exploring how digital twins can be used in personalization of care, optimization of healthcare systems, and life-cycle management of devices and manufacturing processes (Figure 13).

Figure 13. Schematic architecture of a digital twin for use in healthcare applications

Digital Twins for Clinical Decision Support

A compelling vision for healthcare includes the creation of a digital twin of each patient as an ultimate expression of precision medicine. This brings together all lifetime data generated for a patient, including imaging, lab results, vital signs, genomics, microbiome, immune system, and social and behavioral determinants, to create a unique model that can be interrogated by AI methods to provide powerful insights. Clinicians can simulate treatments to identify the best options and avoid unnecessary or unsuitable courses of action and thereby optimize individual outcomes.

The challenges to implementing such a vision are huge. Simulation of a complete human would be a vast computational task. Bringing together the data for a single patient presents many hurdles, including meeting data protection regulations, breaking down data silos, and ensuring that the structure and formats are suitable for assimilation into a digital twin. The regulatory status of such systems would need to be clarified, and claimed indications for use carefully validated. And finally, the operation and outputs would need to be clinically credible and explainable: there is natural skepticism about accepting clinical advice from a “black box,” and equal concern that such technology could “dumb down” clinical decision-making.

These considerations suggest that digital twins for the personalization of medicine will first be implemented within tightly defined contexts. For example, digital twin models of the heart have been developed at the anatomical level, and there are many initiatives to integrate electrophysiology, pressure, and genomic data into these models to help predict optimal cardiological interventions. In oncology, a wealth of diagnostic information from imaging through genetic profiling could be used to help plan the best therapeutic approach, including elements of surgery, radiotherapy, and drug treatments.

A further interesting application is the use of digital twins to run virtual clinical trials. Well-designed randomized controlled trials (RCTs) are the gold standard for demonstrating evidence of therapeutic efficacy, but their results are often poorly applicable to wider patient populations or different clinical contexts. Virtual trials have been piloted for patients with chronic kidney disease (CKD), a group that is frequently excluded from general RCTs, with the result that dosing regimens are not optimized for this population. Populations of thousands of digital twins of real CKD patients, representative of a wide range of such patients, have been used to run virtual clinical trials of drugs used in managing side effects of their treatment, so that such optimization can be carried out safely and cheaply in silico before careful validation in real patients.
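
To make the idea of an in silico dosing trial concrete, here is a minimal sketch. It is not the method used in the CKD pilot: the population generator, the clearance range, the candidate doses, and the therapeutic window are all invented for illustration. The only pharmacokinetic relationship it relies on is the standard one-compartment result that average steady-state concentration equals dosing rate divided by clearance.

```python
import random


def make_virtual_population(n, seed=0):
    """Return n hypothetical drug clearances in L/h (illustrative range only)."""
    rng = random.Random(seed)
    return [rng.uniform(1.0, 4.0) for _ in range(n)]


def steady_state_conc(dose_mg_per_h, clearance_l_per_h):
    """Average steady-state concentration (mg/L) for a one-compartment model."""
    return dose_mg_per_h / clearance_l_per_h


def fraction_in_window(population, dose, low, high):
    """Share of virtual patients whose concentration lands in [low, high]."""
    hits = sum(low <= steady_state_conc(dose, cl) <= high for cl in population)
    return hits / len(population)


def best_dose(population, candidate_doses, low, high):
    """Pick the candidate dose that keeps the most virtual patients in the window."""
    return max(
        candidate_doses,
        key=lambda dose: fraction_in_window(population, dose, low, high),
    )


population = make_virtual_population(5000)   # thousands of virtual patients
doses = [5.0, 10.0, 15.0, 20.0]              # made-up dosing rates in mg/h
chosen = best_dose(population, doses, low=5.0, high=10.0)
print(chosen, fraction_in_window(population, chosen, 5.0, 10.0))
```

A real virtual trial would derive each twin from an actual patient's data and use a validated physiological model; the point here is only the shape of the experiment, in which every candidate regimen is tried on every virtual patient before any real patient is exposed.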

Digital Twins for Healthcare Operations

Besides supporting clinical decisions about individual patients, digital twins of healthcare facilities promise to drive efficiency in healthcare processes. For example, GE Healthcare is already actively promoting operational planning as a service, based on digital twin technology, modeling potential changes in operational strategy, capacities, staffing, and care delivery models.

GE Healthcare sees several key areas where the technology will revolutionize planning. Initial simulation will help demonstrate that facilities, patient flows, and services will meet the requirements before they are operational. Processes often touch many parts of the organization, and the many stakeholders involved can have very different ideas about what changes should be made. Process improvement projects can be identified that drive clear local and system-level goals. Ideas from multiple stakeholders can be objectively tested for overall impact and accepted, rejected, or refined in a collaborative approach that helps align the organization to the changes to be made. Once systems are operational, real-time data can be used to power ongoing short-term forecasts to make data-driven operational decisions.
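
As a rough illustration of testing a capacity proposal in simulation before committing to it, the sketch below compares mean patient waiting time for different numbers of treatment rooms. It is a generic first-come, first-served queue simulation, not GE Healthcare's product or method; the function `mean_wait`, the arrival times, and the procedure durations are hypothetical.

```python
import heapq


def mean_wait(arrivals, durations, n_rooms):
    """Mean wait when patients (sorted by arrival time) are served in order.

    Each patient takes whichever room becomes free first.
    """
    free_at = [0.0] * n_rooms        # time at which each room next becomes free
    heapq.heapify(free_at)
    total_wait = 0.0
    for arrive, duration in zip(arrivals, durations):
        room_free = heapq.heappop(free_at)   # earliest available room
        start = max(arrive, room_free)       # wait if no room is free yet
        total_wait += start - arrive
        heapq.heappush(free_at, start + duration)
    return total_wait / len(arrivals)


# Made-up schedule: one patient every 10 minutes, each needing 45 minutes.
arrivals = [0, 10, 20, 30, 40, 50]
durations = [45] * 6

for rooms in (1, 2, 3):
    print(rooms, "room(s): mean wait", mean_wait(arrivals, durations, rooms), "min")
```

Running the same schedule against each option gives stakeholders a shared, objective number to argue about, which is the collaborative use described above. An operational twin would replace the fixed schedule with live arrival data and re-run the forecast continuously.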

Digital Twins of Medical Devices

Another area that will see the application of digital twins is product life-cycle management. Medical device companies are often global in their scope, and developments can be carried out by widely dispersed teams. Product families are often based on shared platforms and subsystems. On top of these come variants to meet the requirements of local markets and regulatory jurisdictions. Once released in the market, individual devices can be subjected to very different use patterns and intensities. Designs also evolve, driven by feature addition, part obsolescence, and the resolution of design issues.

Digital twins are ideal for managing this complexity. They would give a common view across the organizational functions and geographies. Benefits will be particularly strong in the quality and regulatory assurance domains, helping drive predictive maintenance, relate quality issues to product and design decisions, turn data from market vigilance activities into insights, and rapidly and effectively resolve corrective and preventative actions (CAPA).

Use of digital twins of devices will be a significant investment, requiring changes to processes, systems, and culture. This will be most justifiable for complex systems with safety-critical functions. In recent years, device manufacturers have incurred substantial cost-of-quality issues with devices such as infusion pumps and automated external defibrillators. We can expect to see the use of digital twins emerging for systems such as active implants, surgical robots, high-end imaging systems, and complex therapy devices.

Digital Twins in Pharmaceutical Manufacturing

Pharmaceutical companies are investigating the use of digital twins to simulate aspects of drug manufacturing processes to help control yields, cost, and quality. This is being driven by the emergence of biologics.

Biologics are complex molecules that are manufactured using modified living organisms or cells, in contrast to traditional drugs, which are relatively small-molecule entities made by chemical synthesis. In addition to being more complex, biologics are more variable and less robust than traditional drugs. Although the number of approved biologics is relatively small, they account for a disproportionate amount of spending on drugs and now comprise around 40% of the development pipeline.

Biologics are typically manufactured in bioreactors using process methods that can take several weeks to complete. Digital twins have the promise of becoming a universal tool for the full life cycle of a bioprocess, evolving in parallel through the stages of process development and process validation, and through good manufacturing practice (GMP) production, where the models can continue to be refined as new production data becomes available.

They bring the promise of better specification and fewer out-of-specification events, faster problem solving for atypical outcomes, and prediction of events such as probable batch failure, allowing the process to be terminated early.

While the use of digital twins is strongly aligned to Quality by Design and Process Analytical Technology regulatory requirements, any adjustment to the process driven by digital twin technology will need to stay within the scope of process characterization; otherwise it would be classified as a major change, requiring reapproval.¹


The effects of predictive analytics (PA) are already being felt in healthcare, and PA is set to have a substantial impact in coming years. The technology will increasingly be used to help predict the probability of future scenarios in order to make better-informed decisions. In healthcare, this means better decisions that will personalize the treatment of patients as they go through their care pathway, as well as better organizational decisions that will improve the operational efficiency of healthcare systems.

Driving improved effectiveness and efficiency in treatments and healthcare processes is going to be necessary to address the increasing pressures on healthcare systems. In developed economies, aging populations are creating a growing burden of chronic conditions and acute care demands, while emerging economies are looking to make affordable healthcare available to their underserved citizens.

Data is the lifeblood of predictive analytics. For PA to be effective, there must be access to the right data at the right time, the data must be correct and valid, and steps must be taken to ensure the privacy of sensitive data. These essential requirements mean that data has a cost, so it is important that development and deployment of PA are founded on a robust strategy that identifies the desired outcome and collects only the data necessary for the task at hand.

The development of machine learning (ML) technologies is transforming the algorithms that are used in PA. Deployment has also been made easier: over the last few years, ML toolboxes for predictive analytics have become widely available, enabling rapid prototyping and development where previously custom code solutions would have been necessary. These technologies allow modeling of more complex interactions but require more data to train, and their outputs can be harder for a human to interpret.

In patient care pathways, PA is being used to predict which conditions people are likely to develop. In chronic care, the driver is to preempt, slow down, delay, and hopefully avoid later stages of the disease, which are inherently both more debilitating for the patient and more expensive to treat. In acute care settings, the focus is more on reducing the risk of complications of care, which lead to worse outcomes and higher costs.

PA is also being used to help identify the best therapy choices for patients, based on the particular characteristics of their condition and their physiology, including their genetics. And PA is being used to predict compliance and engagement of patients, which is particularly important in successfully managing long-term conditions.

At an organizational level, PA is being applied to drive efficiency of healthcare operations, optimizing utilization of high-value assets, such as operating rooms, and reducing unnecessary costs, including reimbursement penalties and supply-chain losses.

PA is already being implemented in healthcare, yet its application is still in its infancy. Technological and organizational advances will make increasing quantities of data available that will be suitable to develop and deploy PA systems. Processing power will increase, and more off-the-shelf analytical tools will be available, which will make this development easier and commercially realizable. And we will see growing use of digital twins: digital models of physical systems and processes that are tightly coupled to their physical counterparts by continuous monitoring inputs and PA-driven feedback control outputs.


¹ S. Zobel-Roos et al., “Accelerating Biologics Manufacturing by Modeling,” Processes 7 (2019): 94.


Jaquie Finn is Head of Digital Health at Cambridge Consultants and leads the initiative to provide strategic guidance and technical solutions for clients transitioning their business model to include a digital element. These digital elements include device connectivity, AI, security, launch strategies, behavioural science and digital service design.
Jaquie has previously worked on the design and launch of a compliance and work management IoT service that is in commercial use. She has a BSc in Applied Biology, 10 years’ experience in Molecular Biology and 15 years’ experience of Product Management and Marketing for software, hardware and digital services in medical and life science industries that include immunology, ophthalmology, neuroscience and bioinformatics. Jaquie is a member of the Chartered Institute of Marketing (CIM).

Dr. Gavin Troughton heads the Critical Care business at Cambridge Consultants, working with clients to identify and deliver breakthrough products and services in the medtech sector. Originally from an R&D background, he has spent more than two decades in commercial roles, taking products through the full cycle from concept to end of life. His product and technology experience spans monitoring, diagnostic and therapy devices for the acute and critical care space.


