AI in healthcare is finally taking off. What’s changed, and are we ready for robot doctors yet?

Famed venture capitalist Vinod Khosla once remarked that “Artificial intelligence (AI) will replace 80% of doctors”. That was in 2012, and today the statement still seems as outlandish as ever. It turns out that robots are not as easy to create, nor doctors as easy to replace, as he predicted. The allure of AI applications in healthcare, however, is undeniable. Faster, cheaper, and more accurate, AI has the potential not only to lower the cost of healthcare but also to eliminate the third leading cause of death: medical errors [1].
Behind the scenes, AI technology has advanced in leaps and bounds over the past few years. From diagnostic algorithms to health assistants and precision medicine, we are finally starting to see practical applications of AI emerge in healthcare.
Origins of Healthcare AI
Some of the earliest AI healthcare efforts came in the form of predictive and diagnostic algorithms [2]. These range from simple rule-based algorithms built around clinical workflows, to predictive analytics drawing on big data from medical records, to more complex signal processing of medical imaging, sounds and other physiological data. A great example of the latter is automated EKG interpretation. The technology was first developed in the 1970s, and today most EKG monitors in clinical use can diagnose conditions ranging from heart attacks to conduction blocks. In my own clinical experience, I found that these algorithms could often pick up signs of obscure and rare diseases that even cardiologists might miss. As EKGs have been miniaturized for consumer use in devices such as AliveCor’s Kardia, built-in interpretation algorithms now automate the detection of common anomalies, such as atrial fibrillation.
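To make the rule-based end of that spectrum concrete, here is a minimal Python sketch of how an irregular-rhythm screen might flag possible atrial fibrillation from beat timings. It is a toy under stated assumptions (the 0.15 variability cutoff and the five-beat minimum are arbitrary), not AliveCor’s or any vendor’s actual algorithm:

```python
import statistics

def flag_possible_afib(beat_times_s, cv_threshold=0.15):
    """Flag a rhythm strip as possibly irregular (AFib-like).

    beat_times_s: timestamps (seconds) of detected heartbeats (R-peaks).
    cv_threshold: coefficient-of-variation cutoff for RR intervals;
                  an illustrative value, not a clinical standard.
    """
    # RR intervals are the gaps between successive heartbeats.
    rr = [b - a for a, b in zip(beat_times_s, beat_times_s[1:])]
    if len(rr) < 5:
        return False  # too few beats to judge rhythm regularity
    # Atrial fibrillation produces an "irregularly irregular" rhythm,
    # i.e. RR intervals that vary widely relative to their mean.
    cv = statistics.stdev(rr) / statistics.mean(rr)
    return cv > cv_threshold

# A steady 60 bpm strip is not flagged, while a chaotic one is:
# flag_possible_afib([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])  -> False
# flag_possible_afib([0.0, 0.6, 1.7, 2.1, 3.4, 3.9])  -> True
```

Real clinical interpreters layer many such hand-built rules on top of carefully engineered signal processing, which is exactly why they took so long to develop.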

While incredibly useful, these earlier efforts at healthcare AI are quite rudimentary from a technology perspective, requiring carefully curated datasets and deep domain expertise to develop. This means that experience and lessons from creating one set of algorithms often cannot be extrapolated to developing others. Improvement in the accuracy of these algorithms is also quite slow, as the often manual nature of algorithm tweaking makes gains hard to scale with more data.
We are now at an inflection point
This is all about to change with the dawn of the new “AI Gold Rush”. According to CB Insights, investment in AI topped $5 billion in 2016 across a range of industries. This has given us everything from video filters that draw cat ears, to video recommendation engines, to self-driving cars. It has also produced tooling, frameworks and techniques which have fundamentally lowered the barriers to producing new applications of AI.
This has had a particularly big impact in healthcare, where specialized medical expertise has previously been a major prerequisite for AI development. Black box techniques built on deep learning have proven effective at analyzing physiological data. Almost akin to voodoo magic, developers simply need to feed in enough data, and the machine will learn to recognize diagnostic patterns that allow it to mimic or even surpass a human expert. The very nature of these techniques [3] means that nobody can actually explain how the machine deciphers these patterns. It also means that development requires little medical domain expertise (although experience with similar black box models helps), and improvements can be made simply by feeding in more data.
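As a rough sketch of what “feed enough data in” looks like in code, the snippet below trains a small generic network with Keras. The dataset is random placeholder noise standing in for labeled physiological recordings, and the layer sizes are arbitrary; the point is that no cardiology knowledge appears anywhere in the program:

```python
import numpy as np
from tensorflow import keras

# Hypothetical dataset: 1,000 one-second physiological recordings
# (e.g. heart sounds), each 4,000 samples long, labeled 0 = normal,
# 1 = abnormal. Random noise stands in for real data here.
x_train = np.random.randn(1000, 4000, 1).astype("float32")
y_train = np.random.randint(0, 2, size=(1000,))

# A generic 1-D convolutional network. Nothing below encodes any
# medical knowledge; the layers learn their own features from data.
model = keras.Sequential([
    keras.Input(shape=(4000, 1)),
    keras.layers.Conv1D(16, 64, strides=4, activation="relu"),
    keras.layers.MaxPooling1D(4),
    keras.layers.Conv1D(32, 16, strides=2, activation="relu"),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32)
```

Swap in more (real) recordings and the same few lines of code simply get better, which is the scaling property the hand-tuned algorithms of the previous era lacked.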
Modern applications of AI
Arterys is an example of a recent startup applying these techniques. Its FDA-cleared software can analyze cardiac MRIs within seconds (a process that used to take up to an hour) and produce accurate measurements such as ejection fraction and ventricular volume. Not long ago, software like this would have taken decades and teams of medical experts to build.

In fact, not only are the tooling and techniques more sophisticated, developers also benefit directly from decades of general AI research. Google’s Inception is an image-recognition model trained to recognize everyday objects like “scooter” or “leopard” with astounding accuracy. It turns out to be so good at telling images apart that it can be tweaked to recognize medical images. For example, a team of Stanford AI researchers has produced a model which can classify photos of skin lesions, outperforming board-certified dermatologists at diagnosing malignant skin cancer.
Using Inception, Stanford researchers produced a model that outperforms dermatologists at diagnosing skin cancer.
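A hedged sketch of that transfer-learning recipe, using the InceptionV3 weights bundled with Keras: the network’s generic visual features are reused wholesale and only a small new classification head is trained. The two-class lesion task and the omitted dataset are placeholders, not the Stanford team’s actual pipeline:

```python
from tensorflow import keras

# Load Inception pre-trained on everyday ImageNet photos, dropping
# its original 1,000-class ("scooter", "leopard", ...) output head.
base = keras.applications.InceptionV3(
    weights="imagenet", include_top=False, pooling="avg")
base.trainable = False  # reuse the learned visual features as-is

# Attach a new head for a hypothetical two-class task:
# benign vs. malignant skin-lesion photos.
inputs = keras.Input(shape=(299, 299, 3))
x = keras.applications.inception_v3.preprocess_input(inputs)
x = base(x)
outputs = keras.layers.Dense(2, activation="softmax")(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy")
# model.fit(lesion_images, lesion_labels, ...)  # with a real dataset
```

The design choice worth noticing is that almost all of the hard work was done years earlier on unrelated photos; the medical application inherits it for free.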
Will health assistants become the preferred AI interface?
Another major shift has been the form factor of AI applications. Traditionally, AI has been embedded in specialized interfaces such as physician assistance tools, data analytics platforms and dedicated diagnostic software. With the rise of general assistants such as Alexa, Google Assistant and Cortana, many companies [4] have started building AI assistants for healthcare that can interface directly with consumers on a broad range of health issues. Science fiction has long predicted this: Baymax, the loveable character from Disney’s 2014 film “Big Hero 6”, is a perfect incarnation of a healthcare assistant. Of course, we are still quite some time away from being able to say “I have a cough” to a program and be magically diagnosed and treated. The question is: if the technology is robust enough to power a self-driving car, what’s stopping us from achieving this in the foreseeable future?

Why is healthcare AI so hard to build?
As with real physicians, experience is crucial: AI depends on both the quality and the quantity of the data it is built on. Unfortunately, with electronic medical records only recently implemented in many places, there is still a paucity of high-quality data, and what exists is often siloed away in clunky, non-interoperable and incomplete hospital databases. In fact, most of the AI breakthroughs in healthcare have been achieved either through laborious primary data collection studies or via clinical partnerships that give technical teams access to proprietary hospital databases.
Inevitably, this has meant that progress is much slower than in fields where data is publicly available. Open access means brilliant research teams from all over the world can compete to develop the most accurate models and discover unintended applications. For example, in human speech recognition, where many different databases are readily available, Baidu, Google, Microsoft and others have raced each other to create the best software. As a result, word error rates have dropped from a barely usable 20% to under 5% in just four years [5].
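For readers unfamiliar with the metric, word error rate is simply the word-level edit distance (substitutions, deletions and insertions) between the recognizer’s transcript and the reference transcript, divided by the reference length. A minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance, computed over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# word_error_rate("i have a cough", "i have cough")  -> 0.25 (25%)
```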
Implementation challenges
There are also challenges in implementation. For example, regulation has yet to catch up with modern technology. The concept of “black boxes” is inherently scary to regulatory institutions like the FDA, which are designed to mitigate risk (and rightfully so). Not only do AI models have to go through validation and procedural hurdles initially designed for experimental drugs, each successive improvement to a model has to clear the same hurdles. This is a major problem, because AI models are designed to improve continuously with more data and minor tweaks. Of course, the FDA isn’t standing still, and it has started to show more willingness to adapt to digital innovations.
The other challenge, perhaps unique to healthcare, is: who pays? Or even more fundamentally, who really wants a better, more efficient healthcare system aided by AI? Probably not the hospitals and doctors who hold the keys to the development and adoption of AI. Arguably, many of the proposed AI applications will reduce physician utilization and hospital revenues while at the same time exposing patients to “risky” and “untested” methods. For these stakeholders, that may simply be too many reasons not to embrace change.
What about patients themselves? There is certainly an incentive to choose AI-based healthcare solutions, which could provide a better and potentially less costly experience. The challenge here is that the user is medically untrained, so the AI model needs to be accurate enough to account for huge variability in usage behavior; for example, a skin cancer diagnosis app needs to account for a wide variety of smartphone cameras, lighting conditions and lens positions, as sketched below.
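One common way to build in that robustness (a standard deep learning technique, not any particular app’s published method) is training-time data augmentation that simulates consumer-capture variability. A brief Keras sketch, with illustrative layer choices and ranges:

```python
from tensorflow import keras

# Illustrative augmentation pipeline (layer choices and ranges are
# assumptions): randomly perturb training photos so the model learns
# to tolerate the cameras, angles and lighting that medically
# untrained users will produce.
augment = keras.Sequential([
    keras.layers.RandomRotation(0.3),    # arbitrary lens orientation
    keras.layers.RandomZoom(0.2),        # varying distance from skin
    keras.layers.RandomBrightness(0.3),  # indoor vs. outdoor lighting
    keras.layers.RandomContrast(0.3),    # differing camera sensors
])

# These layers perturb images during training and pass them through
# untouched at inference: model = keras.Sequential([augment, classifier])
```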
None of these challenges are permanent, however. We have overcome similar obstacles in the past with the adoption of the primary care model, electronic medical records and telemedicine. There is no reason to think that AI will not eventually catch up in healthcare too!
So will robots replace doctors?
While AI will no doubt eventually surpass human diagnostic and predictive capabilities, we are still many decades away from developing, and then trusting, software as our default healthcare companion. Even if that becomes reality, doctors will always have a critical role in the human aspects of healthcare: listening, showing compassion, and helping us make difficult and complex healthcare decisions. Moreover, every doctor takes the Hippocratic Oath, which obliges them to act in our best interests, and our relationship with our doctors is often built on a high degree of implicit trust.
It’s difficult to imagine that it would be the same with robots.
About the Author
Dr. Andrew Lin (MD) is the co-founder and CEO of CliniCloud, a healthcare technology company that specializes in connected medical devices, patient-centric software and applied artificial intelligence. CliniCloud is using deep learning techniques in automated diagnostics and voice-based healthcare interfaces to create the next-generation consumer healthcare experience.
[1] Medical error is estimated to cost more than 250,000 lives per year in the United States, according to researchers at Johns Hopkins. (Makary MA, Daniel M., “Medical error — the third leading cause of death in the US”, BMJ 2016)
[2] The definition of AI depends on who you ask. In this article, AI is used to denote any program that mimics human cognitive function.
[3] Deep learning inference models use many layers of cascading mathematical operations to interpret raw signals, making it virtually impossible to interpret what each operational step means.
[4] Examples include Babylon, Baidu and Microsoft.
[5] 5% word error rate is also the threshold for human accuracy.
