When Artificial Intelligence Serves as Our Doctor

Wang Xiangning
Nov 5 · 10 min read

The presuppositions of life, disease, and healthcare are far removed from machine thinking.

Professor Natalia Trayanova of Johns Hopkins Medical School has been through a tough year. Her team, which develops computational cardiovascular medicine techniques sponsored by the National Institutes of Health (NIH), has published more than 50 papers in the past three years. Yet she has run into difficulty after difficulty in trying to move the field's "unprecedented solutions" into clinical application.

Natalia Trayanova | TED JHU

Pioneering Cardiovascular Engineering

The primary task of Trayanova's project is to treat atrial fibrillation (AF), a condition in which the heart does not beat in a regular rhythm but quickly and irregularly. Intermittent AF can pass unnoticed, but a persistent fibrillation can take a person's life in minutes. You may have seen, in television productions, a device kept in a red box labelled "AED" with a lightning mark. This is an automated external defibrillator; such devices have saved many people in sudden cardiac arrest by delivering an electric shock.

However, defibrillation sometimes comes too late to save a life. Medical researchers therefore developed cardiac ablation surgery, having found that tiny bundles of myocardial fibers cause the arrhythmias and should be destroyed first. The trouble is that these tiny fibers are hard to locate: the surgery relies heavily on the doctor's professional experience, and is often complicated by accidental electrical misfiring.

Natalia Trayanova / YouTube

The Trayanova Lab then focused on an individualized cardiac imaging protocol that combines predictive modelling and artificial intelligence (AI) for heart disorders. It produces a holographic 3D simulation of the heart, reconstructs each bundle of myocardial fibers, simulates cardiac dynamics, identifies patients at high risk of cardiac arrest, and aims to treat the dysfunction in the first surgical attempt.

The simulated heart from Trayanova's lab

First, the patient with AF undergoes contrast-enhanced MRI heart scans, which document any scarring on the heart. Next, engineers segment the images into a geometric representation of the atria, which is then populated with virtual heart cells in the algorithmic system. The virtual heart is stimulated to see how it reacts, and the sites of irregular heartbeats are marked on the simulated image. Finally, ablation destroys those lesions, with the virtual surgery also used to test for potential misfiring beforehand.
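The workflow above can be caricatured as a small pipeline. Everything in this sketch is invented for illustration, including the `VirtualHeart` class, the region names, and the one-line "simulation"; the lab's real models solve cardiac electrophysiology over reconstructed fiber geometry, which this does not attempt.

```python
from dataclasses import dataclass, field

@dataclass
class VirtualHeart:
    """Toy stand-in for a patient-specific model of the atria."""
    scar_sites: list
    ablation_targets: list = field(default_factory=list)

def segment_atria(mri_scan):
    # Step 2: turn the scan into a geometry; here we only keep scarred regions.
    return [site for site, is_scarred in mri_scan.items() if is_scarred]

def simulate_arrhythmia(heart):
    # Step 4 (toy rule): stimulate the model and flag every scarred site
    # as a potential driver of the irregular rhythm.
    return list(heart.scar_sites)

def plan_ablation(mri_scan):
    heart = VirtualHeart(scar_sites=segment_atria(mri_scan))
    heart.ablation_targets = simulate_arrhythmia(heart)  # Step 5: ablate these
    return heart.ablation_targets

# Hypothetical scan: region name -> whether scarring was documented there.
scan = {"left_atrium_wall": True, "septum": False, "pulmonary_vein": True}
print(plan_ablation(scan))  # → ['left_atrium_wall', 'pulmonary_vein']
```

The point of the structure, not the toy rules, is what matters: each stage hands a progressively more interpreted artifact (scan, geometry, simulation, target list) to the next.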

"This success is an exciting example of how engineering technology can be used in the clinic to help make treatment more accurate and spare patients from multiple costly and sometimes risky procedures," Trayanova said. Meanwhile, the accelerating development of AI in recent years has also contributed to innovation in cardiovascular engineering.

However, when the project moved into practical application, optimism about its prospects faded. The actual needs of patients cannot always be aligned with the technical design, and Trayanova had to go back and forth repeatedly between doctors and engineers. The bigger challenge comes from the US Food and Drug Administration (FDA): whether the research results can be converted into approvable standards, no matter how many papers are published.

With the enormous growth of AI computing capacity, some optimists believe it is only a matter of time before AI takes over the hospital, but the road from the laboratory to the hospital is very hard.

Can AI make an independent diagnosis?

People need a clear understanding of what a machine can and cannot do. At present, AI's main achievement is to lay the groundwork for the judgement of human doctors, not to make judgements on its own. For example, in one piece of work, Jeffrey Siewerdsen used machine learning on the characteristics of high-precision images to "compute" low-resolution images into high-definition ones, in other words, to de-mosaic them. This kind of machine learning can help doctors work with real-time observations of low-resolution images.
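For context, the crudest up-sampler simply repeats pixels; a learned super-resolution model is instead trained on paired low- and high-resolution images to predict plausible detail. The sketch below (plain Python on nested lists, not a real imaging library, and not Siewerdsen's actual method) shows only the naive baseline that learned models improve upon.

```python
def upscale_nearest(img, factor=2):
    """Nearest-neighbor up-sampling: each low-res pixel becomes a
    factor-by-factor block of identical pixels in the output."""
    out = []
    for row in img:
        # Widen the row by repeating each pixel `factor` times...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out

tiny = [[1, 2],
        [3, 4]]
print(upscale_nearest(tiny))
# → [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

A trained de-mosaicing model replaces this fixed repetition rule with filters learned from data, which is why it can recover edges and textures that pixel repetition cannot.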

Indeed, image recognition is one of the things AI currently does best. Since around 2013, AI's ability in this field has developed rapidly; by 2015, machines trained on Google's ImageNet database had surpassed human-level image recognition. This is because the technology can consume a huge amount of image data in a relatively short time and learn through the layered representations of deep neural networks, becoming an experienced "reader" via machine-learning (ML) algorithms. Professors Siewerdsen and Trayanova use these characteristics of AI to provide diagnostic suggestions and help doctors see and judge more clearly.

For some specific diseases, it is not so difficult for AI to read an image and make the corresponding judgement. For example, Neil Bressler, a professor of ophthalmology, is using AI to detect the lesions of diabetic retinopathy. Because diabetes produces a large number of cases, a rich database has accumulated, the lesion determination is relatively simple, and the technology has a relatively mature application scenario. Cancers and tumors, however, are far harder to reach: the image patterns are complex and difficult to generalize with one or a few mechanical models, and machines often get stuck at stages that require human analogy to judge. Some lesions are also very rare, making it impossible to build a trustworthy database. In other words, it is not yet possible to train AI like a real doctor.

The fundamental contradiction remains: even if the database is large enough and the computing power strong enough, can AI replace human judgement?

Do humans trust machines?

In December 2011, at a hospital in Massachusetts, USA, an elderly man who had fainted and suffered a seizure was placed in the emergency ward and connected to monitoring equipment: if his vital signs fluctuated dangerously, the device would issue a warning to summon the nurses, sparing them the need to come over and check from time to time.

The next day, the old man died in his hospital bed, even though the red light of the monitoring device had flashed all through the night before his death. The nurses had become desensitized to the constant blaring of monitor warnings, many of them false alarms, a phenomenon called alarm fatigue.

Such an automated system treats tiny fluctuations as risks, because ignoring a real risk could trigger massive liability. This pushes manufacturers to make the machines "too sensitive", overfitting to danger and generating a stream of false alarms. Through public records requests, the Boston Globe found at least 11 deaths in Massachusetts since 2005 linked specifically to a lack of response, or an inadequate response, to cardiac monitor alarms in hospitals.

Science fiction tends to depict human-machine mistrust as unreasonable, or even as a source of disaster, but in reality such mistrust can be justified: people and machines do not share the same decision-making method. A simple automated system can monitor a patient's heart rate and alarm below a fixed value. But resting heart rates differ between patients: a rate dangerously low for a normal person may be only slightly abnormal for a professional athlete. Traditional automation systems can act only within pre-established criteria, and beyond those criteria nothing can be done.
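The limitation can be made concrete with a toy monitor. The thresholds, tolerance, and patient numbers below are invented for illustration; real bedside monitors are far more sophisticated than either rule.

```python
def fixed_alarm(heart_rate, threshold=50):
    """Traditional rule: alarm whenever the rate drops below one fixed value,
    regardless of who the patient is."""
    return heart_rate < threshold

def personalized_alarm(heart_rate, resting_rate, tolerance=0.75):
    """Personalized rule: alarm only when the rate falls well below the
    patient's own resting baseline."""
    return heart_rate < resting_rate * tolerance

# A trained athlete whose normal resting heart rate is 42 bpm:
athlete_hr = 44
print(fixed_alarm(athlete_hr))              # True: a false alarm
print(personalized_alarm(athlete_hr, 42))   # False: 44 bpm is normal for him
```

The fixed rule cries wolf for the athlete while the baseline-aware rule stays quiet, which is exactly the gap between pre-established criteria and patient-specific judgement.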

Deep learning is expected to break this limitation, but it runs into a whole new set of problems. A recently FDA-approved diagnostic platform called "WAVE" synthesizes a patient's various physical indicators and, through a deep-learning algorithm, predicts when the patient will enter a critical stage. However, a review article in Science pointed out that, unlike drugs or other medical devices, a machine-learning kernel algorithm is not a logically determined system: it covers thousands of interlinked indicators, and the same indicator can lead to different effects depending on the training data. Whether there is a clear and convincing causal link is hard to tell.

The practice of medicine, however, demands stable and repeatable evidence above all. Establishing the mechanism and principle of a drug's effects requires large numbers of animal experiments to clarify the specific relationships between a compound and bacteria, organs, and nerves. The current mainstream of deep learning, by contrast, is an opaque "black box" that takes in data and spits out results. It is difficult to track supporting evidence through that blur, and this is compounded by a core shortcoming of machine learning: the data itself is uncertain. All of this raises questions about the universality and repeatability of AI.

At the annual meeting of the American Association for the Advancement of Science (AAAS) in Washington, DC, in February 2019, Genevera Allen, a professor of data science at Rice University, used a series of examples to strike at the heart of the problem. Many teams now work on cancer-related genes: they feed in the genomes and case data of cancer patients, use machine learning to divide them into subtypes, and develop drugs on that basis. There is a successful precedent in breast cancer, which can be divided by gene expression into more than ten subtypes, each with different treatment options and prognoses. But can this model fit all cancers? By "feeding" the machine a great deal of data, can it really rely on the data model to give a reliable classification?

Genevera Allen/ EurekAlert!

Dr. Allen compared several such findings and found that algorithms which performed well on one sample's data did not necessarily apply in all situations and could not be reproduced. Diagnosis and treatment opinions based on such classifications are naturally meaningless. "Two teams using different data are likely to get subtypes that don't overlap," Allen said in her report. "Do these 'discoveries' really have scientific value? Is there any reliable medical evidence behind them?"
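Allen's point about non-reproducible subtypes can be illustrated with a deliberately naive clustering rule, invented for this sketch and unrelated to any algorithm she reviewed: the "subtypes" it finds depend on the order in which the very same patients arrive.

```python
def leader_cluster(values, radius=1.0):
    """One-pass clustering: each value joins the first existing cluster whose
    founding value (its 'leader') lies within `radius`, otherwise it founds a
    new cluster. Returns the number of clusters ('subtypes') discovered."""
    leaders = []
    for v in values:
        if not any(abs(v - leader) <= radius for leader in leaders):
            leaders.append(v)
    return len(leaders)

# The same three hypothetical gene-expression scores, in two different orders:
cohort_a = [0.0, 0.6, 1.2]
cohort_b = [0.6, 0.0, 1.2]
print(leader_cluster(cohort_a))  # → 2 "subtypes"
print(leader_cluster(cohort_b))  # → 1 "subtype"
```

Identical patients, different processing order, different number of subtypes: any drug program built on either partition inherits that instability, which is the reproducibility worry in miniature.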

Dr. Allen warned that if this development continues, medical science is likely to fall into a "reproducibility crisis". That is a little pessimistic, but not unreasonable. This is not to say that human doctors never make mistakes; rather, when they do, the evidence-based foundation of medical diagnosis gives us the means to rediscover the error and seek ways to avoid it. On one side, AI's computational power and continuously optimized algorithms have advanced significantly; on the other stands clinical medicine's caution about evidence. As the disciplines intersect and enter into dialogue, whether both sides speak the same language becomes the key to solving the problem.

AI must meet medical standards

With the widespread application of big data and AI, human doctors must also begin to understand how to work with data; even those who do not program must understand the principles. "Data science is like another language, or several languages," said Can Na, a researcher at the Wellcome Sanger Institute in the UK, in an interview with Mosaic Science about researchers working between biology and medicine. "I have to convert the biochemical pathways and flow charts in my brain into programming code."

To some extent, programming and data skills have become among the most important capabilities in the medical field. However, the computer and medical fields differ in disciplinary logic and evaluation criteria. "Too many people are now fascinated by the advancement of technical algorithms," Trayanova said, "but their ultimate effect in the field is questionable."

"Most existing algorithms, whether for diagnosis or prediction, are not developed under the traditional medical paradigm. They cannot directly reflect the indicators medicine requires, and even for applications already deployed, reliability, applicability, and so on still need further verification," said Ravi Parikh, a hematologist and oncologist at the University of Pennsylvania School of Medicine. The commentary he published in Science addresses this problem: much current medical AI research reports indicators such as computing power, response speed, and probability distribution curves. But what do these mean in the clinic? How much do they improve a patient's treatment? Speed has increased, but the effect on the misdiagnosis rate is unclear. It is the so-called "endpoints" that medicine cares about, and that regulators use as evidence to validate a technology.
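The gap between a flattering technical metric and a clinical endpoint is easy to reproduce. The cohort below is invented for illustration: 5 of 100 patients have a disease, and a useless model declares everyone healthy.

```python
def accuracy(preds, labels):
    """Fraction of all predictions that are correct: a typical technical metric."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def sensitivity(preds, labels):
    """Fraction of truly diseased patients the model catches: much closer to
    what a clinician and a regulator care about."""
    flagged = [p for p, l in zip(preds, labels) if l == 1]
    return sum(flagged) / len(flagged)

labels = [1] * 5 + [0] * 95   # 5 sick patients out of 100
preds = [0] * 100             # model predicts "healthy" for everyone

print(accuracy(preds, labels))     # 0.95: looks impressive on paper
print(sensitivity(preds, labels))  # 0.0: misses every sick patient
```

A 95% headline number can coexist with a model that never helps a single patient, which is why endpoint-style metrics, not raw accuracy or speed, have to anchor the evaluation.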

Moreover, we must face the limitations of algorithms honestly. All drugs have side effects and suitable populations; similarly, people who build medical AI must step out of the mindset of "using algorithms to solve universal problems" and pay attention to application scenarios, data sources and data quality, and the careful use of medical language. The regulatory system must also face some key challenges: how to ensure data diversity, how to open the "black box" of machine learning, and how to establish the link between an algorithm's specific principles and medical evidence. "What can be done now is to establish a complete auditing system to track the relationship between the algorithm and the data, and possible data biases," Parikh said.

Conclusion

A doctor's training, experience, and observation remain the most important elements of diagnosis and treatment. What doctors have seen and understood over many years cannot be completely translated into data to feed machine learning, and AI still needs human judgement to interpret data and choose the right analytic techniques. Even the best technology can only augment a doctor's knowledge and ability, not substitute for it. "Doctors + algorithms" is far more meaningful than debating replacement, or who is better than whom.

Reference:

  1. Tenner, E. (2018). The Efficiency Paradox: What Big Data Can’t Do. Knopf.
  2. Giger, M. L. (2018). Machine learning in medical imaging. Journal of the American College of Radiology, 15(3), 512–520.
  3. Parikh, R. B., Obermeyer, Z., & Navathe, A. S. (2019). Regulation of predictive analytics in medicine. Science, 363(6429), 810–812.
  4. Razzak, M. I., Naz, S., & Zaib, A. (2018). Deep learning for medical image processing: Overview, challenges and the future. In Classification in BioApps (pp. 323–350). Springer, Cham.