Reflections From AMIA Informatics Summit 2022

Published in Digital Rounds, Apr 22, 2022

In March 2022, the AMIA Informatics Summit took place in Chicago. There were many interesting discussions and presentations on topics at the forefront of informatics, with a particular focus on clinical informatics. In this post, we highlight a few interesting talks and share our reflections from the conference.

AMIA Translational Bioinformatics in Review

The highlight of this conference every year is the AMIA Translational Bioinformatics in Review by Nicholas Tatonetti, PhD. This year’s presentation reviewed the work of 2021: Link to PowerPoint.

Overall, I appreciated the presentation’s quick summaries of highlighted papers. Some of the papers I enjoyed include experimental methods to capture translational dynamics (a great way to bring the time axis into RNA-seq analysis, with some cool visualizations), the use of transfer learning as a small step towards addressing disparities in data representation for machine learning models, and creative ways to apply deep learning models to drug screening and to identifying transcriptomic-imaging relationships.

Cell trajectory visualizations, from “Sci-fate characterizes the dynamics of gene expression in single cells” (Figure 3)

The main critique I have of this review is its narrow definition of translational bioinformatics, which covers mainly genetic and molecular studies. In the past year, many clinical and digital health studies and tools have been used for health monitoring, and we have seen how they more immediately impacted the world’s ability to identify, contain, and manage COVID-19. In my opinion, informatics tools that use clinical measurements, digital biomarkers, and even environmental or population metrics, all of which are relevant to health and biology in a translational context, should be part of the definition of ‘translational bioinformatics’. We are also seeing great patient health management through digital apps and tools. It is worthwhile to acknowledge these efforts.

Phenotyping Approaches

One of the hardest parts of working with electronic medical record (EMR) data is defining a cohort. Even a disease term can mean so many things (What does it mean to identify a ‘diabetes’ cohort? How do you identify the patients using data extracted from the EMR? How can these identification algorithms be generalizable?). The AMIA conference included multiple talks that discussed frameworks to approach the goal of ‘phenotyping’.
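To make the ‘diabetes’ question concrete, here is a minimal Python sketch of a rule-based phenotype that combines a diagnosis code list with a lab threshold. It is purely illustrative and not a validated algorithm or anything presented at the conference: the function name `diabetes_cohort`, the input layout, and the `MIN_CODE_COUNT` rule are assumptions made for this example, and the cutoffs would need clinical review at any real site.

```python
from collections import defaultdict

# Illustrative assumptions only, not a validated phenotype definition.
DIABETES_CODE_PREFIXES = ("E10", "E11")  # ICD-10 categories for type 1 / type 2 diabetes
HBA1C_THRESHOLD = 6.5  # percent; assumed lab cutoff for this sketch
MIN_CODE_COUNT = 2     # assumed: require repeated codes to limit one-off miscoding


def diabetes_cohort(diagnoses, hba1c_results):
    """Return the set of patient IDs matching the sketch phenotype.

    diagnoses:     iterable of (patient_id, icd10_code) pairs
    hba1c_results: iterable of (patient_id, hba1c_percent) pairs
    """
    # Count diabetes-related diagnosis codes per patient.
    code_counts = defaultdict(int)
    for patient_id, code in diagnoses:
        if code.upper().startswith(DIABETES_CODE_PREFIXES):
            code_counts[patient_id] += 1

    # Rule 1: enough diabetes codes on record.
    by_codes = {p for p, n in code_counts.items() if n >= MIN_CODE_COUNT}
    # Rule 2: at least one HbA1c result at or above the threshold.
    by_labs = {p for p, value in hba1c_results if value >= HBA1C_THRESHOLD}

    return by_codes | by_labs


# Made-up toy records for demonstration.
diagnoses = [("p1", "E11.9"), ("p1", "E11.65"), ("p2", "E11.9"), ("p3", "I10")]
hba1c_results = [("p2", 5.9), ("p3", 7.1), ("p4", 5.4)]
cohort = diabetes_cohort(diagnoses, hba1c_results)
print(sorted(cohort))
```

Even in this toy version, the cohort depends on choices such as how many codes to require and which lab cutoff to use. Lowering `MIN_CODE_COUNT` to 1 would pull in patient `p2`, which is exactly the kind of decision that makes a phenotype hard to carry from one site to another.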

It helps to think of choosing a phenotyping algorithm as similar to selecting a model organism or animal in non-human studies. Since hospital systems differ in factors such as patient population, location, clinical staff, and more, it is very difficult to identify ‘standard’ phenotypes that can be applied to all clinical sites. This also means that research findings based on EMR data always have to be interpreted in the context of the cohort’s characteristics and, by extension, its limits on generalizability. Better still is when findings can be validated at external sites.

Phenotypes can be obtained in multiple ways: from a list of codes, from a workflow, or from more complex identification using machine learning models or probabilistic cohort assignment. Whichever approach is used, one great idea was to have a central repository for phenotype definitions, similar to bio-protocol.org or GEO, for easy sharing, and to remind researchers to think about generalizability and phenotype portability before approaching a research question using EMR data. Here is a referenced paper on considerations for phenotype libraries for future studies. Here is also a list of current phenotyping workflows or phenotyping libraries:

Some Talks

There were concurrent sessions with a variety of cool topics and talks (program here). I won’t be linking to any references due to uncertainty about whether these talks have been published, but I will mention some cool ideas that were covered.

These ideas include (1) automatic radiology report generation, (2) transfer learning for multi-site machine learning model portability, and (3) summarization of clinical notes for clinicians. There were also great talks on social determinants of health (SDoH) and considerations for health equity in machine learning models. We will expand on those topics in a future blog post.

Data Work Matters

An important lesson from the conference that I would like to emphasize is the importance of data quality and preprocessing, particularly for machine learning models used in high-stakes decision making. This is a paper from Google Research demonstrating the importance of good data.

Abstract for “Everyone wants to do the model work, not the data work”

Summary

The AMIA Informatics Summit was a great and informative conference, with talks covering topics ranging from negotiation skills and NLP workshops to data wrangling, health equity, and much more. Only a few talks were highlighted in this post, and many others were not mentioned at all. There will be more informatics-related conferences and events, so stay tuned as we cover more informatics-related conversations on this blog in the future!