AI in Medicine

Finding data and building trust in a hyper-regulated industry

Rebecca Resnick
South Park Commons



South Park Commons recently brought together a panel of four artificial intelligence and health experts. Key insights:

  • Finding usable health data is tough
  • Regulation lengthens the time horizon to launch anything new
  • There is a disconnect between the promise of AI-driven medicine and the way doctors work today


Over the past few years, interest in applying the techniques of machine learning outside of traditionally software-driven disciplines has exploded. The field of medicine, with its rich and increasingly digitized cache of patient-, hospital-, and drug-related data, is a ripe area for AI-driven research and invention.

As part of its AI speaker series, South Park Commons recently brought together a panel of four experts from the medical and software world to discuss trends in this growing space.

First, a quick introduction to the panelists (see more info on the panelists at the end of this post):

Susan Huang is a practicing dermatologist and Director of Innovation at the Palo Alto Medical Foundation Department of Dermatology.

Andrew Dai is a staff software engineer and researcher at Google Brain leading a team focused on developing better understanding of medical records by applying deep learning techniques.

Sasha Targ is an MD-PhD student at the University of California, San Francisco interested in applying computational approaches to solve problems in genomics and medicine.

Nikhil Buduma, the panel moderator, is Co-founder and Chief Scientist of Remedy Medical.

What happened at the panel?

The application of AI to medicine is a rich area of academic study and commercial innovation. What follows are my impressions of the key themes discussed during the panel.

Finding usable health data is tough

As you might expect for one of the most heavily regulated and privacy-conscious industries, accessing useful tranches of health-related data is difficult. Dr. Huang spoke about the need for potential partners to demonstrate their trustworthiness — both in terms of security and ability to fulfill regulated privacy requirements — in handling patient records, which can present sometimes insurmountable hurdles for newcomers to the space.

This sparked a discussion on ‘de-identification’, the process by which personally identifiable information (PII) is removed from patient records before they are used and/or passed to other parties. The Health Insurance Portability and Accountability Act (HIPAA) describes two methods of de-identification: 1) ‘Safe Harbor’, removing a strict list of identifiers, including full-face photos, biometric identifiers, and geographic information more granular than the state level, or 2) ‘Expert Determination’, having an expert devise and implement a potentially more lax set of de-identification principles that they determine sufficiently limit the risk that the data could be used, either alone or in combination with other data sources, to identify any individual.

These two methods present opposite challenges: the first is easy to implement, but strict (it risks removing important information for solving some problems), while the second is hard to implement but allows more flexible access to relevant data. As Sasha pointed out, the scientists building AI often lack the necessary background and/or resources to create the de-identification principles (the second method) and assess their risks. Andrew mentioned MIMIC, an MIT-driven open source dataset of de-identified critical care patient data as a boon for researchers interested in that medical subtopic. Sasha also pondered whether there might be ways to incentivize doctors to create de-identified versions of records as they are documenting information for standard records. No easy answers here.
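To make the simplicity (and bluntness) of the first method concrete, here is a minimal sketch in Python. The field names and the drastically shortened identifier list are illustrative only — the real Safe Harbor standard enumerates 18 identifier categories, and this is not a compliant implementation:

```python
import re

# Illustrative subset of identifier fields; HIPAA Safe Harbor actually
# covers 18 categories (names, geography below the state level, dates,
# contact info, biometric identifiers, full-face photos, etc.).
SAFE_HARBOR_FIELDS = {
    "name", "street_address", "city", "zip_code", "phone", "email",
    "ssn", "medical_record_number", "photo",
}

def deidentify(record: dict) -> dict:
    """Return a copy of `record` with identifier fields dropped and
    dates coarsened to the year (Safe Harbor permits only the year)."""
    clean = {k: v for k, v in record.items() if k not in SAFE_HARBOR_FIELDS}
    for key in ("date_of_birth", "admission_date", "discharge_date"):
        if key in clean:
            # Keep only the year component of an ISO-formatted date.
            clean[key] = re.match(r"\d{4}", str(clean[key])).group(0)
    return clean

record = {
    "name": "Jane Doe",
    "zip_code": "94103",
    "state": "CA",
    "date_of_birth": "1984-06-02",
    "diagnosis": "psoriasis",
}
print(deidentify(record))
```

Note how much the blunt field-dropping throws away: a diagnosis that depends on patient age or neighborhood-level environment loses exactly the detail a researcher might need, which is the trade-off the panelists described.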

All the panelists were optimistic about finding ways to augment the patient’s medical record data. The idea of getting patients to log their health background came up several times as a way to augment doctors’ notes. The panelists also discussed other potential data sources, such as automatic recordings of patient-doctor interactions, and genetic sequencing data, which is becoming more prevalent as the cost to sequence declines. However, even though the uses for these data are comparatively clear for researchers, there’s not (currently) a clear incentive to log them. More on that below.

Beyond the baseline hurdle of accessing data, the panelists also spoke about the difficulties in making sense of it. Andrew delved into the challenge of algorithms predicting which drugs to prescribe based on doctors’ free-text notes. Doctors prescribe different drugs in differing amounts depending on the patient and situation. He suggested that natural language processing techniques might bear fruit (he likened the problem to creating a translation mapping between two languages, given only a set of documents of unknown relation in each language), but this, too, remains an unsolved problem.
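The translation-style mapping Andrew described is an open research problem, but the simplest end of it — spotting known drug names in free-text notes — can be sketched in a few lines. The lexicon and note below are made up for illustration; a real system would draw on a terminology such as RxNorm and would have to handle misspellings, abbreviations, and negation:

```python
import re

# Toy drug lexicon; a real pipeline would use a medical terminology.
DRUG_LEXICON = {"metformin", "lisinopril", "atorvastatin", "prednisone"}

def extract_drugs(note: str) -> set:
    """Return known drug names mentioned in a free-text clinical note."""
    tokens = re.findall(r"[a-z]+", note.lower())
    return DRUG_LEXICON & set(tokens)

note = "Pt reports dizziness. Continue Lisinopril 10mg daily; stop prednisone taper."
print(extract_drugs(note))
```

Even this toy version hints at the gap between lexicon lookup and the kind of model Andrew envisions, which would need to learn dosing, context, and intent rather than just surface mentions.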

Regulation lengthens the time horizons to launch anything new

Echoing the challenges with accessing sensitive health data discussed above, a ‘move fast and break things’ approach doesn’t really fly in a heavily regulated world in which human lives are at stake. Regulation affects all parts of the funnel, from data access, to bringing devices and treatments to market, to introducing data-driven changes to patient-doctor relationships. Predictably, this significantly lengthens the time it takes to launch, or drive adoption of, anything new.

At one point, Nikhil asked the panelists to describe their five-to-ten-year predictions for the future of health tech. Their answers were…conservative. Five to ten years, as Andrew pointed out, is a very short time frame in medicine. Maybe, he hoped, there would be more sharing of data across hospitals, which could benefit patients and lead to new AI-driven product opportunities. Sasha echoed this, suggesting that genetic sequencing would become more prevalent, leading to a more expansive genetics dataset. Dr. Huang took a different tack and discussed incremental improvements to the current patient experience that could be driven by data: better appointment booking experiences, lower wait times to see a doctor, and smarter patient triage.

Looking fifty years out, several of the panelists saw a shift in the patient-doctor relationship: from one in which a patient interacts with multiple specialist service providers, to one in which a single clinician, aided by much more advanced predictive health technology, acts as a ‘coach’ through the patient’s health issues.

There is a disconnect between the promise of AI-driven medicine and the way doctors work today

Some of the most promising medical datasets don’t exist yet due to the lack of immediate business case. Take the drug prescription dataset mentioned earlier. For the panelists, the promise of this data is clear. Andrew spoke about the potential to recommend the most effective drug combinations based on the case specifics. However, generating these models requires either making sense of doctors’ free-text notes (a significant challenge, as discussed above) or getting doctors to log training data in a new, more structured format. The challenge with the structured data logging approach is that it requires doctors to change their patient workflow based on a promise of something that might improve their practice in the future, a hard sell given their already onerous charting requirements.

Take Intellichart, a charting application for ophthalmic patients that supports both detailed categorization and hand-drawn, annotated images. The application includes some automation — for example, if the doctor writes ‘retinal tear in superotemporal quadrant’, the software will automatically draw the tear in the correct location. However, in spite of the application’s initial promise to help improve charting via smart recommendations, the automation remains, in the words of one retina surgeon, ‘disappointingly rudimentary’.

The difficulty of driving adoption for AI-driven tools also extends beyond incentivizing the creation of training data. In order for doctors to use machine-created recommendations, they need to trust them. And the evidence on what drives doctor trust is decidedly mixed. Dr. Huang discussed interpretability — the ability of a human to understand why the model came up with a particular outcome — as a key driver of doctors’ trust. She referenced studies demonstrating that doctors trust recommendations more if they come with a text-based description (e.g. ‘5cm nodule on the left lobe of the lung’) rather than simply providing a highlighted image, even if the text provides no extra information. Doctors are also more likely to trust models, according to Dr. Huang, if the model creators are transparent about how the model has been trained and precisely when its recommendations are worth considering. She gave an example of a skin cancer detection model trained only on a dataset of patients with highly similar skin tones. Without this crucial piece of information, doctors are naturally (and rightly) more skeptical of employing model-based recommendations.

Andrew countered that trust is built by observed historical model performance. He referenced a different study that demonstrated that, if given the choice between a highly accurate black box treatment recommendation model and a less accurate but highly-understandable model, doctors would choose the former.

It’s clear that while doctors’ trust in the model is of fundamental importance in rolling out AI-driven recommendations, the exact modality of gaining that trust is not yet settled. And without settling this key question, there will continue to be a catch-22 in AI-driven products aimed at enhancing doctors’ practice: doctors need to believe in a model in order to help train it, but it’s difficult (and sometimes impossible) to train the model without doctors’ help upfront.

Wrap up

I came away from the panel with a better understanding of the challenges facing those seeking to innovate in the health space. In many ways, these issues are not specific to AI-driven innovation. Regulation, incentives, and data access are core drivers for potential change to the health system. Generating usable data and creating machine-driven recommendations that doctors (and patients) actually trust add extra hurdles to new products driven by machine learning.

This panel mainly focused on how the treatment experience would change with improvements in artificial intelligence. An interesting topic for a follow-up discussion would be how treatments themselves might change, e.g. with new/faster drug development driven by AI and better robotic surgical techniques.

More on the panelists

Susan Huang is Director of Innovation of the Palo Alto Medical Foundation Department of Dermatology, a practicing board certified dermatologist and healthcare consultant in AI at a large tech company. She sits on the American Academy of Dermatology’s Task Force on Augmented Intelligence. She was formerly a Quality Improvement Director at Beth Israel Deaconess Medical Center, a teaching hospital of Harvard Medical School. Dr. Huang’s interests are in healthcare delivery models and practical applications of technology including AI in healthcare.

Andrew Dai is a staff software engineer and researcher at Google, with Google Brain, where he leads a group researching applying deep learning to medical records. He completed an MA in Computer Science at the University of Cambridge before receiving a PhD at the University of Edinburgh in 2012 for text modeling with Bayesian nonparametrics. After graduation, he worked at Google in a range of teams including machine translation, Google Now and Google Ads. Five years ago, he joined the Google Brain team focusing on deep learning, where he has published on text representation, semi-supervised learning and deep learning on medical data.

Sasha Targ is an MD-PhD student at the University of California, San Francisco interested in applying computational approaches to solve problems in genomics and medicine. Sasha studied biology and physics at MIT and graduated Phi Beta Kappa in three years in order to pursue research full time. She previously conducted six years of basic immunology research into mechanisms of antibody development that could be used to create better vaccines and on methods that efficiently characterize patients with autoimmunity, resulting in Science and Nature Biotechnology coauthorships.

Nikhil Buduma is the author of O’Reilly’s Fundamentals of Deep Learning and the Co-Founder and Chief Scientist of Remedy, a San Francisco-based company building technology that enables ordinary people to diagnose, monitor, and manage disease with “super-physician” efficacy. Nikhil participated in the International Biology Olympiad twice, attended MIT and invests in hard technology and data companies through his venture fund, Q Venture Partners.

South Park Commons is a community that helps entrepreneurs and technologists freely learn and start ambitious projects. We bring together talented people to share ideas, explore directions, and realize opportunities that occur in an environment that helps people take risks. People don’t come to the commons to play it safe — we’re here to dive off the deep end.

As part of that journey, we bring in some of the best and brightest minds working on the cutting edge of tech to speak to our community. Follow us here to get updated when we put out new content.


