We’ve just wrapped up our 2019 SIIM Annual Meeting in Denver, where AI was at the center of the discussions.
I would like to reflect on two panel discussions I took part in on Wednesday, June 26th and Thursday, June 27th in the Exhibition Hall Theater. The first was about the economics of AI, the second about its current state in practice. I also had many interesting exchanges with key stakeholders of the AI and radiology ecosystem, ranging from academia to corporates.
The panel included a diverse group of academic faculty, entrepreneurs and industry stakeholders, including startups (Infervision, Aidoc, Qure.ai …) and established companies (Nuance, Blackford Analysis, Intelerad, TeraRecon, GE, Philips …). The audience was composed of physicians (mainly radiologists) as well as health IT and imaging informatics professionals.
During the panel discussion, I took the microphone and shared my experience as an investor. Over the last few years, I have had multiple opportunities to invest in startups developing deep learning-based software for medical imaging, but so far I have not made any investment…
Despite reliable teams, promising technologies, a large market and solid domain-expert advisory boards, I was not confident enough to get on board with these ventures. The missing piece was a sound business case and long-term defensibility.
Most if not all of the startups operating in this field are focused on solving very narrow clinical problems based on limited and biased training datasets, and are heavily focused on image pixels rather than healthcare’s big picture; this will keep them from developing scalable and clinically useful products, and from building profitable, successful companies.
Companies that are building algorithms to detect one or a few radiological abnormalities on a single imaging modality are building features rather than products. The delineation between a feature and a product can be hard to define in the digital world.
For instance, as a radiologist, a software that highlights nodules or a pneumothorax on a chest X-ray, which I will analyse comprehensively anyway, is a feature: a nice-to-have, but in no way a must-have. Moreover, in my personal experience, it will most probably slow my workflow by adding clicks and visual analysis effort, and might divert my attention from another, subtler finding.
Another example is a software that prioritises my patient case list according to the urgency of the imaging findings; it too is a nice-to-have feature, not a product. I even have doubts about its clinical utility! Why? Because a head CT with an emergent finding, such as a large hematoma with mass effect, is immediately seen by the radiology technologist who performed the exam, who in turn immediately alerts the physician in charge of the patient. Not to mention the clinical condition of the patient, which itself signals the urgency of looking at the images… Moreover, it isn’t the radiologist’s emergent report on such a brain CT that will impact the patient’s management; it is the availability of a neurosurgeon and an anaesthesiologist able to perform an emergent decompressive craniectomy. Do you really think the neurosurgeon will wait for the radiologist’s confirmation of acute brain herniation before rushing to the operating room? Given these established processes, ecosystems and staffing levels (which is the case at least in the Western world), there is no core need for such a solution.
However, worklist prioritisation might add value in high-volume centers with remote and deferred reading, by reducing the turnaround time for small intracranial haemorrhages, bone fractures, tumors or missed ischemic strokes that would otherwise sit at the end of the reading list.
And even if clinical studies have shown a reduction in the turnaround time for reporting an intracranial bleed, that is not enough to conclude it is a must-have product. What needs to be proven is that the use of AI prioritisation or clinical decision support tools leads to better clinical outcomes for patients and to a reduction in healthcare costs.
From another perspective, the biggest problem is actually not flagging the intracranial bleed once the CT has been performed; the most pressing and serious problem is flagging the patients who have an intracranial bleed while they are at home or waiting in the emergency room.
What we are seeing in the entrepreneurial space today are FNACs: “Feature, Not A Company.”
While some feature-companies, such as Twitter and Dropbox, have evolved into successful product-companies, I am still skeptical about this potential for today’s early-stage AI companies in radiology.
It is not improbable that some of these startups, which I categorize as a no-go for investment, will be laughing at this blog post in 10 years. Indeed, humans are great at explaining and analysing past events but very weak at predicting the future. Note that this is why I have not bought Homo Deus: A Brief History of Tomorrow by Yuval Noah Harari, even though I voraciously read Sapiens: A Brief History of Humankind and am convinced it is a masterpiece of the century.
So let’s now look at what a product is, as opposed to a feature: a product is a software application (or a suite of applications) that provides multiple solutions (features) for a range of clinical and non-clinical situations, completing a job and not just a task. The radiologist’s job is very diverse, ranging from supervising the performance of an exam to its analysis and result reporting. A task is just a part of the work that contributes to the execution of the job. Tasks include, for instance, drafting a customised protocol for an examination, opening the study, displaying images on the screens, looking for nodules in the lungs, reporting, adding key images to the report, communicating the results…
A radiology information system (RIS) is a product; the automatic flagging of exams that have not been read within an hour of completion is a feature.
Successful products solve a problem, and depending on how you define the problem you aim to solve, your potential revenue could be 10 to 50 times higher and your impact on healthcare transformative rather than mildly incremental…
Let’s look at some problems that physicians might want to solve:
- The long turnaround time before a radiologist analyses a head CT with an intracranial bleed that was missed by the technologist and the referring physician
- The cerebral artery occlusion missed by a burned-out radiology resident
- The long wait in the emergency room before a patient with an intracranial bleed undergoes a head CT
- The outlandishly high number of normal head CTs performed in the emergency setting to exclude intracranial haemorrhage and discharge the patient
- The unpredictable occurrence of a stroke in a healthy individual?
I deliberately ordered these problems by gradually increasing impact on healthcare, cost savings and market size, and at the same time by gradually increasing distance from radiology pixel analysis.
To address these undisputed problems and make healthcare great again, one should not lock one’s thoughts within the walls of the radiology department, but rather search relentlessly for the most meaningful problem to solve: the Killer App!
This isn’t an easy task, and AI isn’t the first technology to hit a brick wall.
During another panel discussion at SIIM 2019, I raised my voice to draw a parallel between AI and robotic surgery. Robotic surgery has been around as an innovative technology for three decades. While it seems intuitive and obvious that it could replace human surgeons (it is more precise and less invasive, does not suffer from fatigue, allows the surgeon to work less hazardously and more ergonomically, and can even enable remote surgery), it hasn’t replaced them yet, and even more surprisingly, many if not most robotic surgery companies have failed… Only one has reached the market with global and sustainable adoption (at least to date): Intuitive Surgical Inc. Why?
Because Intuitive Surgical, with its da Vinci surgical system, found the Killer App for robotic interventions: radical prostatectomy for prostate cancer. Studies have shown that robotic radical prostatectomy causes less intra-operative bleeding, lower readmission rates and fewer serious complications such as sexual impotence or urinary incontinence. While one would initially and intuitively think that robotic surgery would rapidly transform every field of surgery, only one field has successfully adopted this new technology to date. It is also worth noting that the da Vinci surgical system has become a marketing tool for hospitals to brand themselves as safer and more high-tech.
AI startups are currently building algorithms based on supervised learning, trained on datasets mainly collected from hospitals in their own geographical region, making them by definition non-representative of the real world and limited in size. These datasets are annotated by human radiologists, so it is practically impossible to obtain a curated dataset on the scale of millions of studies (for comparison, ImageNet includes more than 14 million hand-annotated everyday images, of which at least 1 million have bounding boxes). Even if a startup manages to annotate a large dataset for one use case, it would need to re-annotate the same images all over again for a different use case, or just for a more precise pixel-level annotation! This is a highly inefficient and unsustainable way to build algorithms…
In addition to the data bottleneck, I do not see strong long-term defensibility for these companies. I think the data science part of developing an algorithm is on its way to becoming a commodity. The tools used to build these algorithms are open source and available to anyone with the necessary skills, and those skills can now be learned online through many free, high-quality resources (fast.ai, Coursera …). I predict the emergence of countless startups around the world building algorithms for narrow use cases with questionable real-life clinical utility and limited generalisation.
For all these reasons, I believe academia and academia-backed institutes have the best chance of developing robust and clinically useful algorithms based on large, balanced and representative datasets. I am thinking specifically of Stanford AIMI and the Boston-based CCDS. These institutes are backed by top-notch academic hospitals and have in-house multidisciplinary teams including physicians, data scientists, product developers and business professionals. They may be less able than a VC-backed company to build fancy, user-friendly applications and to scale rapidly, but I think they nonetheless have the best odds of creating the best algorithms.
During SIIM 2019, Stanford AIMI presented an interesting non-pixel application of deep learning. The algorithm they trained was intended to classify patients with suspected pulmonary embolism (PE) into three distinct categories: low, moderate and high risk. They used commonly available demographic and clinical data, such as ICD codes, vital signs, and inpatient and outpatient medications, as input, with temporal feature engineering. If a patient is classified as low risk, no imaging and no treatment is performed; if classified as high risk, treatment is given without imaging; only if the risk is moderate does the patient undergo a chest angio-CT to look for a PE. They tested their algorithm on a held-out dataset at Stanford and on data from an external medical center (Duke), achieving an AUC > 0.81 and outperforming conventional scores (the revised Geneva score, rGeneva; AUC around 0.5).
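The decision logic described above can be sketched in a few lines. This is only an illustrative mock-up of the triage workflow: the thresholds, function names and the output strings are my own assumptions, not Stanford AIMI's actual model or cut-offs, and the real system derives its risk estimate from ICD codes, vitals and medications rather than taking a probability directly.

```python
# Hypothetical sketch of the PE triage workflow described in the text.
# A risk model trained on non-imaging data outputs a probability of PE,
# which is thresholded into one of three actions. The thresholds below
# are illustrative assumptions, not published values.

LOW_THRESHOLD = 0.05   # below this: no imaging, no treatment (assumption)
HIGH_THRESHOLD = 0.85  # above this: treat without imaging (assumption)

def triage_action(pe_risk: float) -> str:
    """Map a predicted PE probability to one of the three workflow branches."""
    if pe_risk < LOW_THRESHOLD:
        return "low risk: no imaging, no treatment"
    if pe_risk > HIGH_THRESHOLD:
        return "high risk: treat without imaging"
    return "moderate risk: order chest angio-CT"

print(triage_action(0.02))  # low risk: no imaging, no treatment
print(triage_action(0.50))  # moderate risk: order chest angio-CT
print(triage_action(0.95))  # high risk: treat without imaging
```

The interesting design choice is that imaging becomes the middle branch of the triage, reserved for genuinely uncertain cases, rather than the default first step.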
Given the high proportion of chest angio-CTs performed without finding a PE, it is obvious that this type of approach (applied before any imaging is performed) would benefit the healthcare system in a larger, more systemic way than an algorithm that helps the radiologist find a pulmonary embolism on a chest angio-CT that has already been performed.
While we might intuitively think that an algorithm that detects PE on a chest angio-CT saves the radiologist time and effort while increasing accuracy, real-life experience shows an adverse effect that lowers the benefit of these algorithms: a hidden consequence of false positives.
False positives are false alarms: normal exams flagged as abnormal by the algorithm. In this case, as the radiologist bearing responsibility for the result, I would undoubtedly spend more time analysing such a study, searching for an absent abnormality, and this would cost me a significant amount of time… which could counterbalance the benefit the algorithm brings on the true positive cases. The overall time and effort spent analysing multiple angio-CTs for PE with the AI algorithm could end up equal to, or even worse than, without it.
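A back-of-the-envelope calculation makes this trade-off concrete. All of the numbers below (prevalence, sensitivity, specificity, minutes saved or lost per case) are purely illustrative assumptions chosen to show the mechanism, not measured values from any study:

```python
# Toy model of the false-positive effect on total reading time.
# Every number here is an illustrative assumption.

n_studies = 100
prevalence = 0.10        # fraction of studies with a true PE (assumption)
sensitivity = 0.90       # algorithm sensitivity (assumption)
specificity = 0.90       # algorithm specificity (assumption)

t_saved_true_pos = 3.0   # minutes saved when AI correctly flags a PE (assumption)
t_lost_false_pos = 4.0   # extra minutes chasing a false alarm (assumption)

true_positives = n_studies * prevalence * sensitivity            # 9 flagged PEs
false_positives = n_studies * (1 - prevalence) * (1 - specificity)  # 9 false alarms

net_minutes = true_positives * t_saved_true_pos - false_positives * t_lost_false_pos
print(f"True positives: {true_positives:.0f}, false positives: {false_positives:.0f}")
print(f"Net time saved over {n_studies} studies: {net_minutes:.0f} minutes")
```

With these assumed numbers, the 90% specific algorithm produces as many false alarms as true detections (because PE is rare), and the radiologist ends up 9 minutes worse off over 100 studies. The point is not the exact figures but the structure: at low disease prevalence, even a small false-positive rate generates enough false alarms to erode or reverse the time saved.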
And what about false negatives, abnormal cases classified as normal by the algorithm? This is an error that spares no radiologist (the radiologist error rate is estimated at approximately 3% per year) but is blamed on humans’ intrinsic limitations. Will we accept these errors with the same resignation when they are made by a black-box algorithm?
Dr Amine Korchi is a Venture Partner at Fusion and Polytech Ventures in Switzerland. He is a neuroradiologist with additional expertise in musculoskeletal imaging and intervention, has pioneered an innovative embolization-based treatment for knee osteoarthritis, and has developed expertise in health technology, innovation and investing. Follow him on Twitter @AmineKorchiMD and on Medium: Amine Korchi MD.