Why You Still Need a Real Doctor

Despite AI like ChatGPT

Tom Kane
Plainly Put
2 min read · May 6, 2024


[Image: Doctor examining patient — by NightCafe]

Like everyone else, I was excited to read in the popular press that artificial intelligence (AI) could soon do away with GPs, or at least replace the current methods of assessing and confirming certain medical conditions, relieving the workload on our medics.

You’ve probably heard a lot about AI like ChatGPT being able to do almost anything, including passing medical exams. But a new study shows we’re not ready to replace human doctors with AI just yet when it comes to serious issues like assessing heart attack risk.

Researchers from Washington State University tested how well the latest version of ChatGPT could assess simulated patients with chest pains to determine their likelihood of having a heart attack. They compared ChatGPT’s assessments to the methods real doctors use, like the TIMI and HEART scoring systems, which factor in age, medical history, EKG results, and other variables.
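To give a flavour of how mechanical these scores are, here is a rough Python sketch of a TIMI-style tally. The seven yes/no factors follow the published TIMI score for unstable angina/NSTEMI, but the wording is paraphrased — this is an illustration only, not a clinical tool:

```python
# Simplified sketch of the TIMI risk score (UA/NSTEMI): one point for each
# of seven yes/no factors, giving a total from 0 to 7. Illustration only,
# NOT a clinical tool.

def timi_score(age_65_or_older, three_plus_cad_risk_factors, known_cad,
               aspirin_in_last_7_days, recent_severe_angina,
               st_segment_deviation, elevated_cardiac_markers):
    factors = [age_65_or_older, three_plus_cad_risk_factors, known_cad,
               aspirin_in_last_7_days, recent_severe_angina,
               st_segment_deviation, elevated_cardiac_markers]
    return sum(bool(f) for f in factors)

# A 70-year-old with ST deviation and elevated troponin would score 3:
print(timi_score(True, False, False, False, False, True, True))  # → 3
```

The point is that, given the same inputs, a rule-based score like this always produces the same number — which is exactly the property the researchers were testing ChatGPT against.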

So what happened? Overall, ChatGPT’s risk scores matched up pretty well with those doctor-approved methods. So in theory, AI assistance could potentially help doctors make these evaluations faster.

The bad news? When the researchers fed ChatGPT the exact same patient details multiple times, it kept reaching different risk-level conclusions. For almost half of the patients scored with TIMI and HEART, and for over 40% of the more complex cases, ChatGPT flat-out disagreed with itself on the risk scoring.

That’s a huge problem, because you need consistency when making life-or-death decisions about someone’s care. The lead researcher said ChatGPT was “not acting in a consistent manner”, judging the same patient as low-risk one time and high-risk the next.

This is really no surprise, because present AI systems are known to “hallucinate” at times, making up an answer when they don’t know the correct one.

Part of the issue seems to be that ChatGPT has some randomization built-in to make its responses sound more natural and human-like. That variability is great for general chitchat, but not so great when you absolutely need reliable, consistent diagnoses.
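To see why that built-in randomness produces different answers from identical inputs, here’s a toy Python sketch of temperature-based sampling. The labels, scores, and temperature values are invented for illustration and have nothing to do with ChatGPT’s actual internals:

```python
import math
import random

def sample(logits, temperature):
    """Pick an option index from raw scores ("logits").

    temperature ~ 0  -> always pick the highest-scoring option (deterministic)
    higher temperature -> flatter probability distribution, more varied picks
    """
    if temperature <= 1e-6:
        return max(range(len(logits)), key=lambda i: logits[i])
    weights = [math.exp(score / temperature) for score in logits]
    return random.choices(range(len(logits)), weights=weights)[0]

labels = ["low risk", "intermediate risk", "high risk"]
logits = [2.0, 1.8, 0.5]  # made-up scores: model only mildly prefers "low risk"

random.seed(0)
# Deterministic mode: the same single answer every time.
print({labels[sample(logits, 0.0)] for _ in range(20)})
# Sampling mode: repeated runs can land on different answers.
print({labels[sample(logits, 1.0)] for _ in range(20)})
```

With the temperature at zero the model repeats itself perfectly; turn the temperature up and the same patient details can come back as different risk levels — which is the variability the study observed.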

It also sometimes recommended inappropriate tests like an invasive endoscopy when a simple initial test would have been the proper first step.

Reference: https://studyfinds.org/chatgpt-heart-attack-risk



Retired Biochemist, Premium Ghostwriter, Top Medium Writer, Editor of Plainly Put and Poetry Genius publications on Medium