Building Trust and Adoption in Machine Learning in Healthcare: What Clinicians Say Matters (Full Text)

Insights from interviews with 18 clinicians and recommendations for product managers and ML engineers / researchers: the full text of my MPH capstone for UC Berkeley’s School of Public Health

Below is the full text of my Spring 2020 MPH capstone for UC Berkeley’s School of Public Health. Visit the Building Trust and Adoption in Machine Learning in Healthcare site for an abridged version as well as upcoming summaries of interviews and additional research.

Thank you to Dr. Ziad Obermeyer and Vince Law for being my readers and guiding me through this work.

Background on Trust and Adoption in ML in Healthcare

The term “Artificial Intelligence” (“AI”) was first coined by John McCarthy during a conference held at Dartmouth in 1956. There, he and his colleagues defined AI as the “science and engineering of making intelligent machines, especially intelligent computer programs” (“What Is AI?,” 2014). Even then, attendees expected AI to one day transform the world. Well, as William Gibson said, “the future is already here — it’s just not very evenly distributed” (Gibson & Brin, 2018). In healthcare, the terms AI and machine learning (“ML”) are omnipresent and approaching the peak of the Gartner Hype Cycle. In fact, Gartner found that AI is the “most powerful and pervasive technology-based capability across care delivery, operations, and administrative activities” (Craft & Jones, 2019). However, despite all the hype in popular media and academic literature, there does not seem to be broad adoption in clinical practice.

Although hype carries connotations of over-inflated interest, there does appear to be genuine enthusiasm and even a sense of growing maturity in the space. Guides and advice have been shared on how to read and critically review ML healthcare literature (Chen et al., 2019; Faes et al., 2019; Y. Liu et al., 2019). Furthermore, professional societies are releasing significant reports on the technology’s real potential, ML tool developers are making it easy for clinicians to build models themselves, standards organizations are drafting standards and definitions, and private investment is growing (Abid et al., 2020; CB Insights Research, 2020; Consumer Technology Association, 2020; Faes et al., 2019; Goecks et al., 2020; Matheny et al., 2020). In 2019, private investors put $4 billion, across 367 deals, into companies working on ML in healthcare, and there were 17 mergers and acquisitions as well as 2 IPOs. Maturity can also be gauged by the number of FDA-cleared algorithms and those in clinical trials. As of mid-2019, more than 30 ML algorithms had been cleared by the FDA and over 300 appeared to be in clinical trials (Liu et al., 2019). Today, in Spring 2020, The Medical Futurist has tracked 64 clearances (The Medical Futurist, 2020).

However, despite this increasing maturity, barriers to broad adoption remain. There are significant questions around liability that neither our cultural norms nor our legal system has yet resolved (Price et al., 2019). Bias has proven harder to address than originally thought, with well-intentioned companies building what popular media labels “racist algorithms” (Obermeyer et al., 2019). Adversarial attacks can trick algorithms into making incorrect predictions that no human expert would make (Finlayson et al., 2019). And among many other challenges, there has been a gap between model performance and demonstrated clinical effectiveness across multiple sites (Kelly et al., 2019).

The broad field of AI has gone through multiple “Winters” due to over-promising and under-delivering; today, we are in a “Spring” driven partly by improvements in core methodologies, data availability, and computational power (Cruz & Treisman, 2020). With these feasibility improvements, the field must now wrestle with achieving desirability. To extend the analogy: for the flowers to keep blooming through this Spring in the healthcare industry, key stakeholders must trust and adopt ML. Therefore, the goal of this paper was to gain a better understanding of how clinicians perceive ML in healthcare and to explore what ML tool developers, such as product managers and ML engineers, can do to build trust and adoption. The hope is that, as Dr. Eric Topol argues in his book Deep Medicine, ML will unlock time and mental space for clinicians to focus on the human aspects of healthcare (Topol, 2019).

This paper aims to build on previous research on frontline US clinicians’ knowledge of and feelings toward ML tools (Blease et al., 2019; Fernandez Garcia et al., 2020; Kennedy & Gallego, 2019; Pinto dos Santos et al., 2019; Polesie et al., n.d.; Sarwar et al., 2019). It goes beyond broad surveys, however, focusing on in-depth interviews with clinicians from diverse backgrounds. These frontline clinician perspectives complement insights from clinical administrators, ML tool developers / researchers, and patients (Craft & Singh, 2020; Marx, 2019; Nelson et al., 2020). Because clinicians are the primary users of many emerging use cases of ML in clinical healthcare delivery, their voices should be heard, and the aspects of the technology that excite and concern them should be understood.

Primary and Secondary Research Methods

Clinician interviews

Clinicians were sought out with the intention of gathering diverse viewpoints. Inclusion criteria can be found in Figure 1, with some case-by-case exceptions.

Interviews occurred during a seven-week period starting in early February 2020 and were based on a consistent interview guide (see Exhibit A at the end of this document). Interviews were scheduled for 30 minutes by phone and often ran longer. The author both led the interviews and took notes.

The goal was to further the knowledge base of clinicians’ perceptions of ML in healthcare in an open-source spirit. Therefore, after each interview, notes were summarized for later public distribution (e.g., on Medium), pending approval from each interviewee. Name, location, gender, employer, and other identifying information were removed, and language was standardized for ease of reading and consistency.

All raw, un-summarized notes were collated into one document, allowing all responses to be reviewed by question. Key themes and insights were distilled through an iterative approach, with supporting quotations appended to each insight.

Secondary research

Secondary research was conducted both to provide a background on trust and adoption of ML in healthcare and to validate individual clinician perspectives.

Journal articles were sourced from the most influential medical journals according to Wikipedia reference frequency, the Doctor Penguin weekly newsletter, and leading Twitter accounts (Chen et al., n.d.; Jemielniak et al., 2019). A list of the secondary sources, with links to their websites, can be found in Exhibit B at the end of this document. Given the fast pace of the space, priority was given to articles published from January 1, 2019 through Spring 2020.

Articles from popular media were sourced from a regular review of email newsletters to which the author subscribes. These can also be found in Exhibit B. A limited number of these articles are directly cited in this paper, yet many more provided an underlying understanding of the space.

Zotero was used for citation management.

Research Results

By the end of the seven weeks, 18 clinicians had been interviewed, just before the COVID-19 pandemic officially shut down parts of the country. Clinicians had diverse backgrounds and could speak to a range of topics (see Figure 2).

Key themes naturally arose based on the structure of the interview guide, and the Discussion section is organized as such (see Figure 3).

Discussion of Key Themes

1. Familiarity with ML in healthcare

“I hear things in the media about ML in healthcare. I hear it could be a great tool. There is a lot of potential for good but also harm.”

“But when you get into AI completely taking over what I do, I just don’t buy it. AI is a fluffy word that isn’t true. My gut says that we aren’t even close to that kind of AI.”

Clinicians’ awareness of ML in healthcare ranged widely and did not track geography or tenure. Those who were more familiar had specialized training or administrative roles. Most of what they knew came from popular media, leaving clinicians attuned to the amount of hype in the space. This concern around hype is shared by other clinical researchers (Emanuel & Wachter, 2019). However, most were not concerned about significant near-term impacts to their jobs, since they believed that healthcare fundamentally involves a human caring for a human.

2. Past and future use

“I would use an ML tool, but I don’t think I have.”

“Much of my time, up to eight hours per week, is spent on detailed interpretation and labeling images. If I had a reliable ML tool, I could better focus on acute issues of my patients or treat more people.”

Clinicians broadly had not used, or at least believed they had not used, ML tools in their clinical practice. However, they were familiar with triaging tools, which may utilize ML. Those who thought they had used ML tools appreciated reductions in errors but also felt pressure to trust the predictions. Looking forward, clinicians were mostly open to using ML tools, so long as they could trust them and still “be in-the-loop.”

3. Excitement and concern

In many ways, clinician sentiment regarding ML in healthcare was positive. They were hopeful that advances in ML and its adoption in healthcare will lead to better outcomes for their patients, a more enjoyable culture, fewer healthcare disparities, and IT improvements. However, when asked directly, they did voice several concerns. They thought clinicians are not equipped today to use ML tools, that data quality is a major issue, and that the ML industry is too immature for the clinical setting. Looking forward, clinicians were concerned about the future of the profession and unintended ethical dilemmas. Figure 4 includes the key categories of excitement and concern shared by interviewees, with select quotes. These concerns align with what other clinical researchers see as key barriers to broad adoption (He et al., 2019).

4. Ethics and privacy

“I don’t know if I have really thought about it all that much.”

“We need to make sure that we don’t reinforce biases in the models that we train and deploy.”

When asked to explore ethical considerations further, viewpoints were mixed. Many did not feel equipped to discuss ethics, others did not think it was their place to do so, and a few had well-formed opinions on the matter. Questions around bias were the most common ethical concern. Many suspected that much of the bias originates in the data used to train the algorithms. There were also concerns about use cases that could create ethical dilemmas, such as end-of-life care, targeting specific demographic or socioeconomic groups, insurance company utilization management, and for-profit interests generally.

“Privacy is not a specific issue for ML; it is an issue for all digital technologies, and I have been wrestling with it my entire career.”

Clinicians pointed to HIPAA as the best legal framework for managing privacy concerns; the rest of their privacy considerations related to non-healthcare technology companies. They felt that these large and small non-healthcare technology companies already had access to everyone’s data, so they were either not especially concerned about privacy or did not feel the world was equipped to address it.

“I have deferred my ethical questions to others. There are bioethicists who we work with who do the important thinking on this.”

In general, clinicians found ethics important but did not feel equipped to evaluate ML tools against ethical and privacy standards. Without a strong understanding of either ML or ethical frameworks, they looked to others for help evaluating ethical and privacy concerns. Clinicians, especially those at larger healthcare organizations, relied on healthcare administrators tasked with working through these issues.

5. ML knowledge and model explainability

“The only things that I need to know are the ML tool’s sensitivity and specificity and maybe its reproducibility and reliability. That is probably it. Can I trust it with that? Sure. I don’t need to know the math behind the models.”

“I know how to use the outputs of X-ray machines, CT scanners, and MRI machines to help my patients. And as I think of it, I only have a very rudimentary understanding of how those machines actually work; more of an intuition versus expertise. So maybe for an ML tool, I don’t need to know as much.”

No clinicians felt the need for more than a basic understanding of ML, comparable to their basic understanding of how other devices, such as X-ray machines, CT scanners, and MRI machines, work. Instead, they wanted to know that the algorithms were trained on representative data and had strong performance metrics — the same ones used to evaluate diagnostics, such as sensitivity and specificity. But even with these metrics, they worried that, if they adopted an ML tool, there would be unanticipated risks and no way to understand errors when they occur. There have been many calls for more explainable models in the healthcare setting, yet limited perspectives on how ML will impact medical malpractice (Price et al., 2019; Wang et al., 2020).
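For context on the metrics clinicians cited, below is a minimal sketch of how sensitivity and specificity would be computed for a hypothetical binary screening tool. The data and function are illustrative only and do not come from any tool discussed in the interviews.

```python
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    """Compute the metrics clinicians asked about: sensitivity and specificity.

    y_true, y_pred: arrays of 0/1 labels (1 = disease present / flagged).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    sensitivity = tp / (tp + fn)  # share of true cases the model catches
    specificity = tn / (tn + fp)  # share of healthy cases correctly cleared
    return sensitivity, specificity

# Hypothetical example: 10 patients, ground truth vs. model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
sens, spec = diagnostic_metrics(y_true, y_pred)
print(f"Sensitivity: {sens:.2f}, Specificity: {spec:.2f}")
```

On the made-up data above, the tool catches 4 of 5 true cases (sensitivity 0.80) and correctly clears 4 of 5 healthy patients (specificity 0.80), the kind of summary interviewees said would be sufficient for them.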

6. External validation needs

“I would need to see some sort of clinical trial in which an ML tool was deployed, and I get that it is hard to do a trial like this. I would also want to see findings on if there are improvements in the quality of care, clinic throughput, and clinician and patient quality of life.”

Clinicians wanted external validation from the FDA before using ML tools. While they preferred gold-standard randomized controlled trials (“RCTs”), today there are very few RCTs or prospective non-randomized studies of deep learning models (Nagendran et al., 2020). Yet clinicians understood that some use cases will need to pass with less rigorous clinical evidence. Many treated endorsements from other healthcare providers or professional societies as signals of external validation, or looked to both senior and junior colleagues for advice. External validation served both to build trust in these ML tools and to provide a sense of legal safety.

7. Future of clinical education

“I think it is a bad idea for young clinicians to use ML. There are a lot of subtleties that exist, and if you use ML, then you don’t get the knowledge nor art form of medicine.”

“My current thought is that all new clinicians should be at least somewhat aware of the technology at a bare minimum — knowing very vaguely of how it works, which use cases are better vs. worse, how human clinical judgement will be impacted, and how clinical specialties might look in the future. At the moment, I think all of this information could be included in about four short lectures. But in the future, there may need to be a significant curriculum redesign.”

Despite expressing a need for general clinical education reform, many clinicians predicted that clinical education will teach everyone, from early trainees to late-career practitioners, to be more data-literate. For today, an ML primer was all they wanted; tomorrow, they expected ML to be infused throughout clinical education. Work is already underway to include ML in clinical education and to think through longer-term implications (Park et al., 2019; Rampton et al., 2020). Conversely, some thought that ML tools should be withheld from trainees early on, so as not to diminish the teaching of clinical judgement. Overall, clinicians did not see ML automating them away, but they did see it driving meaningful change in their profession.

8. Desired use cases

Clinicians were excited to think of ways ML might be applied to their own clinical specialty and others’. Although they were not told what is technically possible, many had ideas about the needs that exist today. These ideas spanned population health triage, screening and diagnostics, clinical decision support, quality improvement, data enhancement, and other topics. Figure 5 includes the use cases that interviewees identified.

9. Implementation

“If you are going to create some ML tool, then it has to integrate with the EHR — that is the center of my life and where all of my patient data reside. Every clinician in America uses an EHR. So if an ML tool doesn’t integrate with the EHR and is a separate window, then I don’t know how anyone will use it.”

“I think there needs to be convincing validation as a first step. Then when it is fully implemented, there needs to be continued surveillance of performance. For example, these tools may be built in another setting, so we need to make sure they work at our place.”

Clinicians wanted to be sure that implementations neither put an unproven ML tool in a less-trained clinician’s hands nor added complexity to their existing workflows. Those with administrative and business backgrounds knew the importance of a phased implementation and a systems-thinking mindset, such as considering what clinical delivery resources were available downstream of any automation. They wanted clinicians involved in implementations, yet acknowledged the short-term distraction from care delivery that involvement entails.
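To make the “continued surveillance of performance” that interviewees requested concrete, below is a minimal sketch of a rolling performance check a deploying site might run. The class name, window size, and alert threshold are all hypothetical choices, not part of any system discussed in the interviews.

```python
from collections import deque

class PerformanceMonitor:
    """Rolling check that a deployed model still performs as it did during
    local validation. Illustrative only: the baseline, window size, and
    tolerance are hypothetical parameters a deploying site would choose.
    """

    def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
        self.baseline = baseline_accuracy      # accuracy from local validation
        self.tolerance = tolerance             # allowed drop before flagging
        self.outcomes = deque(maxlen=window)   # 1 = prediction matched truth

    def record(self, prediction, ground_truth):
        # Called once ground truth becomes available for a prediction.
        self.outcomes.append(int(prediction == ground_truth))

    def check(self):
        if len(self.outcomes) < self.outcomes.maxlen:
            return None  # not enough local data yet to judge
        rolling = sum(self.outcomes) / len(self.outcomes)
        if rolling < self.baseline - self.tolerance:
            return f"ALERT: rolling accuracy {rolling:.2f} below baseline {self.baseline:.2f}"
        return f"OK: rolling accuracy {rolling:.2f}"

# Usage sketch: monitor = PerformanceMonitor(baseline_accuracy=0.90),
# then monitor.record(...) per case and monitor.check() on a schedule.
```

In practice, a check like this would sit alongside the phased rollout interviewees described, flagging when a tool built in another setting stops working “at our place.”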

10. Buying process

“It has to have a very clear value proposition. What is the ROI going to be? Will there be a meaningful return? These companies tell me how we will practice better, but I also need to know how we will save or make money. Sadly, the system doesn’t incentivize us to do better, it incentivizes us to work faster.”

“It is hard for me to respond to a cold email from an unknown ML startup. I am more excited about Big Tech brands that I know and trust.”

Clinicians explained that the buying process for an ML tool would be complicated, with many stakeholders involved in decision making. These stakeholders appeared to differ in every setting, as did the proof points they would need to reach a buying decision (see Figure 6). ROI was the most common value proposition, yet it needed to be positioned differently for a provider in a fee-for-service world than for one in value-based care. Clinicians were sensitive to the recent history of burnout from EHRs and trusted established technology brands over less-known startups. Smaller providers said they would struggle to bet on new technology.
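To illustrate why ROI must be positioned differently across payment models, here is a toy calculation with entirely hypothetical figures: under fee-for-service, the value comes from added throughput; under value-based care, from avoided costs.

```python
def roi(annual_benefit, annual_cost):
    """Simple first-year ROI: (benefit - cost) / cost."""
    return (annual_benefit - annual_cost) / annual_cost

# Hypothetical tool costing $50k per year
cost = 50_000

# Fee-for-service framing: time saved enables 400 extra visits at $150 each
ffs_benefit = 400 * 150
print(f"Fee-for-service ROI: {roi(ffs_benefit, cost):.0%}")   # 20%

# Value-based care framing: 15 avoided readmissions at $10k each
vbc_benefit = 15 * 10_000
print(f"Value-based care ROI: {roi(vbc_benefit, cost):.0%}")  # 200%
```

With these made-up numbers, the identical $50,000 tool yields a 20% return framed around throughput but a 200% return framed around avoided readmissions, which is why interviewees said the pitch must match the provider’s payment model.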

Conclusion

As ML advances and becomes more common in the healthcare setting, product managers and ML engineers / researchers must better understand clinicians’ viewpoints to build trust and adoption. Looking forward, the author will continue his research, gathering and analyzing the perspectives of these ML tool developers and publishing additional findings.

References

Abid, A., Abdalla, A., Abid, A., Khan, D., Alfozan, A., & Zou, J. (2020). An online platform for interactive feedback in biomedical machine learning. Nature Machine Intelligence, 2(2), 86–88. https://doi.org/10.1038/s42256-020-0147-8

Allergist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

American College of Radiology Data Science Institute. (n.d.). Define-AI Directory. Retrieved April 13, 2020, from https://www.acrdsi.org/DSI-Services/Define-AI

Blease, C., Kaptchuk, T. J., Bernstein, M. H., Mandl, K. D., Halamka, J. D., & DesRoches, C. M. (2019). Artificial Intelligence and the Future of Primary Care: Exploratory Qualitative Study of UK General Practitioners’ Views. Journal of Medical Internet Research, 21(3), e12802. https://doi.org/10.2196/12802

Cardiologist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

CB Insights Research. (2020). AI In Numbers Q1’20: Global Funding, Corporate Activity, Partnerships, And R&D Trends. https://www.cbinsights.com/research/report/ai-in-numbers-q1-2020/

Chen, E., Rajpurkar, P., Topol, E., & Ng, A. (n.d.). Doctor Penguin. Retrieved April 15, 2020, from http://doctorpenguin.com

Consumer Technology Association. (2020). Definitions/Characteristics of Artificial Intelligence in Health Care (ANSI/CTA-2089.1) (p. 32). Consumer Technology Association. https://shop.cta.tech/products/definitions-characteristics-of-ai-in-health-care

Craft, L. (2019a, February 25). Healthcare Provider CIOs: Get Ahead of AI Innovation With Strong AI Governance. Gartner. https://www.gartner.com/document/3902977

Craft, L. (2019b, August 2). Understand the Value of AI for Healthcare Delivery Organizations. Gartner. https://www.gartner.com/document/3869974

Craft, L., & Jones, M. (2019). Hype Cycle for Healthcare Providers, 2019. Gartner. https://www.gartner.com/document/3953717

Craft, L., & Singh, P. (2020). State of AI — Healthcare Providers’ Perspective. Gartner. https://www.gartner.com/document/3979188

Cruz, L. P., & Treisman, D. (2020). Making AI Great Again: Keeping the AI Spring. 144–151. http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0006896001440151

Cutillo, C. M., Sharma, K. R., Foschini, L., Kundu, S., Mackintosh, M., & Mandl, K. D. (2020). Machine intelligence in healthcare — Perspectives on trustworthiness, explainability, usability, and transparency. Npj Digital Medicine, 3(1), 1–5. https://doi.org/10.1038/s41746-020-0254-2

Dermatologist. (2020, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Emanuel, E. J., & Wachter, R. M. (2019). Artificial Intelligence in Health Care: Will the Value Match the Hype? JAMA, 321(23), 2281–2282. https://doi.org/10.1001/jama.2019.4914

Emergency Medicine Physician. (2020, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Emergency Medicine Physician. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Endocrinologist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Faes, L., Wagner, S. K., Fu, D. J., Liu, X., Korot, E., Ledsam, J. R., Back, T., Chopra, R., Pontikos, N., Kern, C., Moraes, G., Schmid, M. K., Sim, D., Balaskas, K., Bachmann, L. M., Denniston, A. K., & Keane, P. A. (2019). Automated deep learning design for medical image classification by health-care professionals with no coding experience: A feasibility study. The Lancet Digital Health, 1(5), e232–e242. https://doi.org/10.1016/S2589-7500(19)30108-6

Fernandez Garcia, J., Spatharou, A., Hieronimus, S., Beck, J.-P., & Jenkins, J. (2020). Transforming healthcare with AI: The impact on the healthcare workforce and organisations (p. 134). EIT Health, McKinsey & Company. https://eithealth.eu/our-impact/our-reports/report-transforming-healthcare-with-ai/

Finlayson, S. G., Bowers, J. D., Ito, J., Zittrain, J. L., Beam, A. L., & Kohane, I. S. (2019). Adversarial attacks on medical machine learning. Science, 363(6433), 1287–1289. https://doi.org/10.1126/science.aaw4399

Gastroenterologist. (2020, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Geis, J. R., Brady, A. P., Wu, C. C., Spencer, J., Ranschaert, E., Jaremko, J. L., Langer, S. G., Borondy Kitts, A., Birch, J., Shields, W. F., van den Hoven van Genderen, R., Kotter, E., Wawira Gichoya, J., Cook, T. S., Morgan, M. B., Tang, A., Safdar, N. M., & Kohli, M. (2019). Ethics of Artificial Intelligence in Radiology: Summary of the Joint European and North American Multisociety Statement. Radiology, 293(2), 436–440. https://doi.org/10.1148/radiol.2019191586

Gerke, S., Babic, B., Evgeniou, T., & Cohen, I. G. (2020). The need for a system view to regulate artificial intelligence/machine learning-based software as medical device. Npj Digital Medicine, 3(1), 1–4. https://doi.org/10.1038/s41746-020-0262-2

Gibson, W., & Brin, D. (2018, October 22). The Science in Science Fiction [NPR Podcast]. https://www.npr.org/2018/10/22/1067220/the-science-in-science-fiction

Goecks, J., Jalili, V., Heiser, L. M., & Gray, J. W. (2020). How Machine Learning Will Transform Biomedicine. Cell, 181(1), 92–101. https://doi.org/10.1016/j.cell.2020.03.022

He, J., Baxter, S. L., Xu, J., Xu, J., Zhou, X., & Zhang, K. (2019). The practical implementation of artificial intelligence technologies in medicine. Nature Medicine, 25(1), 30–36. https://doi.org/10.1038/s41591-018-0307-0

Hospitalist. (2020, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Hospitalist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Hwang, T. J., Kesselheim, A. S., & Vokinger, K. N. (2019). Lifecycle Regulation of Artificial Intelligence– and Machine Learning–Based Software Devices in Medicine. JAMA, 322(23), 2285–2286. https://doi.org/10.1001/jama.2019.16842

Jemielniak, D., Masukume, G., & Wilamowski, M. (2019). The Most Influential Medical Journals According to Wikipedia: Quantitative Analysis. Journal of Medical Internet Research, 21(1), e11429. https://doi.org/10.2196/11429

Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389–399. https://doi.org/10.1038/s42256-019-0088-2

Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G., & King, D. (2019). Key challenges for delivering clinical impact with artificial intelligence. BMC Medicine, 17(1), 195. https://doi.org/10.1186/s12916-019-1426-2

Kennedy, G., & Gallego, B. (2019). Clinical prediction rules: A systematic review of healthcare provider opinions and preferences. International Journal of Medical Informatics, 123, 1–10. https://doi.org/10.1016/j.ijmedinf.2018.12.003

Larson, D. B., Magnus, D. C., Lungren, M. P., Shah, N. H., & Langlotz, C. P. (2020). Ethics of Using and Sharing Clinical Imaging Data for Artificial Intelligence: A Proposed Framework. Radiology, 192536. https://doi.org/10.1148/radiol.2020192536

Littmann, M., Selig, K., Cohen-Lavi, L., Frank, Y., Hönigschmid, P., Kataka, E., Mösch, A., Qian, K., Ron, A., Schmid, S., Sorbie, A., Szlak, L., Dagan-Wiener, A., Ben-Tal, N., Niv, M. Y., Razansky, D., Schuller, B. W., Ankerst, D., Hertz, T., & Rost, B. (2020). Validity of machine learning in biology and medicine increased through collaborations across fields of expertise. Nature Machine Intelligence, 2(1), 18–24. https://doi.org/10.1038/s42256-019-0139-8

Liu, X., Rivera, S. C., Faes, L., Ferrante di Ruffano, L., Yau, C., Keane, P. A., Ashrafian, H., Darzi, A., Vollmer, S. J., Deeks, J., Bachmann, L., Holmes, C., Chan, A. W., Moher, D., Calvert, M. J., Denniston, A. K., & The CONSORT-AI and SPIRIT-AI Steering Group. (2019). Reporting guidelines for clinical trials evaluating artificial intelligence interventions are needed. Nature Medicine, 25(10), 1467–1468. https://doi.org/10.1038/s41591-019-0603-3

Marx, V. (2019). Machine learning, practically speaking. Nature Methods, 16(6), 463–467. https://doi.org/10.1038/s41592-019-0432-9

Matheny, M. E., Whicher, D., & Israni, S. T. (2020). Artificial Intelligence in Health Care: A Report From the National Academy of Medicine. JAMA, 323(6), 509–510. https://doi.org/10.1001/jama.2019.21579

Medical Registered Nurse. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Morley, J., & Floridi, L. (2020). An ethically mindful approach to AI for health care. The Lancet, 395(10220), 254–255. https://doi.org/10.1016/S0140-6736(19)32975-7

Nagendran, M., Chen, Y., Lovejoy, C. A., Gordon, A. C., Komorowski, M., Harvey, H., Topol, E. J., Ioannidis, J. P. A., Collins, G. S., & Maruthappu, M. (2020). Artificial intelligence versus clinicians: Systematic review of design, reporting standards, and claims of deep learning studies. BMJ, 368. https://doi.org/10.1136/bmj.m689

Nelson, C. A., Pérez-Chada, L. M., Creadore, A., Li, S. J., Lo, K., Manjaly, P., Pournamdari, A. B., Tkachenko, E., Barbieri, J. S., Ko, J. M., Menon, A. V., Hartman, R. I., & Mostaghimi, A. (2020). Patient Perspectives on the Use of Artificial Intelligence for Skin Cancer Screening: A Qualitative Study. JAMA Dermatology. https://doi.org/10.1001/jamadermatol.2019.5014

Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447–453. https://doi.org/10.1126/science.aax2342

Parikh, R. B., Teeple, S., & Navathe, A. S. (2019). Addressing Bias in Artificial Intelligence in Health Care. JAMA, 322(24), 2377–2378. https://doi.org/10.1001/jama.2019.18058

Park, S. H., Do, K.-H., Kim, S., Park, J. H., & Lim, Y.-S. (2019). What should medical students know about artificial intelligence in medicine? Journal of Educational Evaluation for Health Professions, 16. https://doi.org/10.3352/jeehp.2019.16.18

Pinto dos Santos, D., Giese, D., Brodehl, S., Chon, S. H., Staab, W., Kleinert, R., Maintz, D., & Baeßler, B. (2019). Medical students’ attitude towards artificial intelligence: A multicentre survey. European Radiology, 29(4), 1640–1646. https://doi.org/10.1007/s00330-018-5601-1

Plastic & Reconstructive Surgeon. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Polesie, S., Gillstedt, M., Kittler, H., Lallas, A., Tschandl, P., Zalaudek, I., & Paoli, J. (n.d.). Attitudes towards artificial intelligence within dermatology: An international online survey. British Journal of Dermatology, n/a(n/a). https://doi.org/10.1111/bjd.18875

Price, W. N., Gerke, S., & Cohen, I. G. (2019). Potential Liability for Physicians Using Artificial Intelligence. JAMA, 322(18), 1765–1766. https://doi.org/10.1001/jama.2019.15064

Primary Care Physician. (2020a, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Primary Care Physician. (2020b, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Psychiatrist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Radiation Oncologist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Radiologist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Rajkomar, A., Dean, J., & Kohane, I. (2019). Machine Learning in Medicine. New England Journal of Medicine, 380(14), 1347–1358. https://doi.org/10.1056/NEJMra1814259

Rampton, V., Mittelman, M., & Goldhahn, J. (2020). Implications of artificial intelligence for medical education. The Lancet Digital Health, 2(3), e111–e112. https://doi.org/10.1016/S2589-7500(20)30023-6

Sarwar, S., Dent, A., Faust, K., Richer, M., Djuric, U., Van Ommeren, R., & Diamandis, P. (2019). Physician perspectives on integration of artificial intelligence into diagnostic pathology. Npj Digital Medicine, 2(1), 1–7. https://doi.org/10.1038/s41746-019-0106-0

Sendak, M. P., Gao, M., Brajer, N., & Balu, S. (2020). Presenting machine learning model information to clinical end users with model facts labels. Npj Digital Medicine, 3(1), 1–4. https://doi.org/10.1038/s41746-020-0253-3

The Medical Futurist. (2020). FDA-approved A.I.-based algorithms. The Medical Futurist. https://medicalfuturist.com/fda-approved-ai-based-algorithms

Topol, E. J. (2019). Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again. Basic Books.

Trauma Registered Nurse. (2020, February). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Urologist. (2020, March). Clinician Interview (H. Goldberg, Interviewer) [Personal communication].

Vollmer, S., Mateen, B. A., Bohner, G., Király, F. J., Ghani, R., Jonsson, P., Cumbers, S., Jonas, A., McAllister, K. S. L., Myles, P., Grainger, D., Birse, M., Branson, R., Moons, K. G. M., Collins, G. S., Ioannidis, J. P. A., Holmes, C., & Hemingway, H. (2020). Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. BMJ, 368. https://doi.org/10.1136/bmj.l6927

Wang, F., Kaushal, R., & Khullar, D. (2020). Should Health Care Demand Interpretable Artificial Intelligence or Accept “Black Box” Medicine? Annals of Internal Medicine, 172(1), 59. https://doi.org/10.7326/M19-2548

What is AI? (2014, September 5). The Society for the Study of Artificial Intelligence and Simulation of Behaviour. https://aisb.org.uk/what-is-ai/

Exhibit A: Clinician Interview Guide

Context

● I am a dual degree MPH/MBA grad student at UC Berkeley, focusing on machine learning in healthcare

● I am working on my Master’s capstone on how machine learning, or ML, might be used in assisting clinicians in screening, population health triage, diagnostics, and/or monitoring

● I am interviewing a diverse set of clinicians — physicians, NPs, RNs, and others — from diverse backgrounds — across the country, different specialties, ranges of tenure, and different levels of interest or adoption of ML tools

● Your name associated with the insights from this interview will be kept private amongst myself and my readers. I will ultimately seek to self-publish select summaries of my interviews, but those will be de-identified and approved by the interviewee prior to distribution

● Do you have any questions on process or expectations?

1. Familiarity with ML in healthcare

● To start off, what have you heard about artificial intelligence and/or machine learning in healthcare?

2. Past and future use

● Have you used any ML tools? Would you?

3. Excitement and concerns

● What about ML in healthcare is concerning or exciting for you? What else is exciting or concerning for you?

4. Ethics and privacy

● Where do ethics play into this? What could go wrong, or what could be done well?

● How does privacy fit into all of this?

● How should the data be used? Who should or should not have access to it?

● Who else should help inform you or decide for you if an ML tool is ethical and sufficiently private?

● Do you trust these ML tool developers to have access to these data? Why or why not?

5. ML knowledge and model explainability

● At what level do you need to understand how the model makes its prediction?

6. External validation needs

● For you to be willing to use an ML tool, what external validation would you need to see? What types of government and/or non-government institutions would play a role?

7. Clinical education

● How would clinical education be impacted?

8. Desired use cases

● Where are there opportunities to assist clinicians with ML? Imagine this: a world-class technology company developed an ML tool that suggests possible diagnoses or triages a patient population. What is the best thing for them to build now and why?

9. Implementation

● When an ML tool gets implemented, how should that be done? Who should have access first; who should not?

10. Buying process

● Where you practice medicine, who all would be involved in making the decision to purchase and use an ML tool?

● What data, references, and promises would they need to learn about to ultimately say yes or no?

Exhibit B: Secondary Sources
