What an Academic Hospitalist thinks about ML in healthcare

A summary of my interview with an Academic Hospitalist. This is one of my 18 interviews with clinicians for my MPH capstone (link here) for UC Berkeley’s School of Public Health.

Visit my Building Trust and Adoption in Machine Learning in Healthcare site (link here) for the abridged and full versions of my MPH capstone, as well as upcoming summaries of interviews and additional research.

Note that this interview is a de-identified summary for ease of reading and privacy. It has been approved for distribution by the interviewee.

“My current thought is that all new clinicians should be at least somewhat aware of the technology at a bare minimum — knowing very vaguely of how it works, which use cases are better vs. worse, how human clinical judgement will be impacted, and how clinical specialties might look in the future. At the moment, I think all of this information could be included in about four short lectures. But in the future, there may need to be a significant curriculum redesign.”

Job Background

I am an Academic Hospitalist at an academic medical center. Clinically, I attend on general medicine inpatient teams of residents while also doing direct care without trainees. On the academic side, I am an Associate Director of the residency program, and I also teach and develop the curriculum. I plan to continue working indefinitely.

Familiarity with ML in healthcare

To start off, what have you heard about artificial intelligence and/or machine learning in healthcare?

Despite not living in Silicon Valley, I may still be in the ML bubble. I am generally aware of the principles of ML and the most mature applications in medicine — specifically image recognition for specialties like dermatology, radiology, pathology, and ophthalmology. There are also applications in clinical decision support, which haven’t been studied as much yet. I particularly like Eric Topol’s book, Deep Medicine, about the current state of ML in healthcare.

Past and future use

Have you used any ML tools? Would you?

I would use an ML tool, but I don’t think I have. I believe that most of the clinical decision support tools in the EHR that I use are hand-coded, non-ML tools.

Excitement and concerns

What about ML in healthcare is exciting for you? What is concerning?

I am excited by many things. I see huge potential to increase the efficiency of various tasks. For example, ML can help with image interpretation to categorize findings on chest X-rays, EKGs, or mammograms. I am a Hospitalist, so a lot of my job is data ingestion, management, and analysis in my own brain. A lot of that is based on intuition, instincts, and impressions, and I think it could be improved with ML tools. Dr. Bob Wachter once compared future clinicians taking care of patients to present-day airline pilots landing an airplane. These clinicians would have tools similar to the cockpit displays that tell pilots the right course. There could be a tool in medicine that looks at the clinical variables and tells me the trajectory of similar past patients. That would be very helpful.

My concerns are that the algorithms are only as good as the data and gold-standard labels that go into them. The fact of the matter is that healthcare data are extremely messy and full of errors, leading all of these ML algorithms astray. It’s a garbage in, garbage out issue. There could be human bias in the data that then gets unleashed by an ML tool. I am also concerned about over-reliance on these tools and a loss in the quality of human clinical judgement. Lastly, there is a huge potential to widen health disparities, since so much data include direct or indirect values for race, gender, and socioeconomic status.

Ethics and privacy

Where do ethics play into this? What could go wrong, or what could be done well?

There is much potential to help or harm patients, and that is where the ethical issues arise. The key ethical goal in healthcare is to do more good than harm to a patient. So that is the key question that we need to ask when evaluating each tool. I am concerned about the idea of these black boxes, since we don’t know how the outputs are generated. Like I mentioned before, there is also a big question mark around how ML will widen or narrow health disparities. It could possibly increase access to care for underserved populations, yet it could also automate inequity.

ML knowledge and model explainability

At what level do you need to understand how the model makes its prediction?

It is hard to say. From what I understand, the issue is that, for many of these algorithms, we don’t exactly know how they work. If a human messes up a finding on an X-ray, then there is a remediation path to make the next time better. However, the same doesn’t seem to be true for these ML algorithms. Even if an algorithm is 99% accurate and that is much better than a human clinician, we still need to find a way to learn more about the 1% of errors. I personally don’t need to understand the algorithms and math.

However, as I say all of this, it may not be relevant. Better outcomes are better outcomes. But I guess this unknown will make others uncomfortable.

To be honest, when push comes to shove, ML may not be much different than ultrasounds or MRIs. I know other clinicians who don’t know how those technologies work; they just know how to review the images.

I guess the central difference is that these other technologies — like lab tests, chest X-rays, ultrasounds, and MRIs — give clinicians an increased sense of certainty. There is a perceived objectivity in a lab value. But when you are talking about ML algorithms, where there is no prior understanding or human control in the process, that creates an increased amount of uncertainty.

External validation needs

For you to be willing to use an ML tool, what external validation would you need to see?

I would need to see various validation studies. The most important thing to me would be a head-to-head comparison with the gold-standard diagnosis, for example, the average diagnosis by a set of radiologists. But this all depends on the situation and data inputs. For image classification, the pixels are there and don’t lie. However, for things that are less objective, like ranking healthcare need, a lot of pressure testing is important before I am comfortable using it. In all situations, I think there should be a human in the loop.

Clinical education

How would clinical education be impacted?

I was actually wondering that myself, since I am on my academic medical center’s curriculum committee and am working on designing updates that account for a future of ML in medicine. My current thought is that all new clinicians should be at least somewhat aware of the technology at a bare minimum — knowing very vaguely of how it works, which use cases are better vs. worse, how human clinical judgement will be impacted, and how clinical specialties might look in the future. At the moment, I think all of this information could be included in about four short lectures. But in the future, there may need to be a significant curriculum redesign. In short, medical education should evolve to include some information about ML, and these needs are evolving rapidly.

Implementation

When an ML tool gets implemented, how should that be done? Who should have access first; who should not?

I think there needs to be convincing validation as a first step. Then, once a tool is fully implemented, there needs to be continued surveillance of its performance. For example, these tools may be built in another setting, so we need to make sure they work in ours. I also think there needs to be significant consideration of how much agency clinicians have to use or not use the tool. Usage and results need to be tracked to understand whether these things really work.

--

Harry Goldberg
Building Trust and Adoption in Machine Learning in Healthcare

Beyond healthcare ML research, I spend time as a UC Berkeley MBA/MPH, WEF Global Shaper, Instant Pot & sous vide lover, yoga & meditation follower, and fiancé.