Using Computer Vision to Identify Critical Anatomy in Surgical Images

Published in Hasty.ai, Aug 16, 2021

Julia Gong, F. Christopher Holsinger, Julia E. Noel, Sohei Mitani, Jeff Jopling, Nikita Bedi, Yoon Woo Koh, Lisa A. Orloff, Claudio R. Cernea, Serena Yeung

About a quarter of our users here at Hasty.ai are part of the research community. We reached out to them to learn more about their research. Some of the stories were so great that we asked the researchers to write a summary that we can share on our blog.

Today, we’re excited to share Julia Gong’s work. She has been a Hasty user since the early days and is studying at Stanford, where she is part of the Stanford Medical AI and Computer Vision Lab (MARVL). MARVL develops artificial intelligence and machine learning algorithms to enable new capabilities in biomedicine and healthcare.

You can find Julia on Twitter, LinkedIn, or on her personal homepage.

Surgeons performing an operation. Image from Pexels.

Why thyroidectomy?

Standardized imaging methods such as CT scans, MRIs, dermoscopic images, retinal scans, and whole-slide pathology images have enabled large-scale deep learning-powered image analysis in their respective fields. However, the implementation of computer vision methods lags behind in surgery. Standardized imaging methods are not currently integrated into the surgical workflow (especially not in open surgery), but there are a lot of opportunities for computer vision to assist the operating surgeon, which we explore in our work.

So, why the thyroid? Thyroidectomy is one of the most common operations performed in the United States, with about 150,000 cases annually [1]. Though frequently performed, thyroidectomy requires surgeons to navigate a complex microanatomy to preserve speech and swallowing and to ensure the health of the nearby parathyroid glands. For instance, the recurrent laryngeal nerve (RLN), located just under the thyroid gland, innervates the vocal cords and helps us speak, breathe, and eat safely. Given this complex anatomy, we ask two questions: Can we develop computer vision tools that help surgeons identify the RLN during thyroidectomy? And how do operating room image capture conditions affect such a method’s performance?

What’s already been done?

Earlier work has developed computer vision methods for anatomical identification (among other tasks) in endoscopic, laparoscopic, and robot-assisted surgery. While endoscopy is primarily diagnostic in nature, laparoscopic and robot-assisted surgery have a higher degree of complexity and variability, with surgical activity and instruments moving free-form in constrained fields of view. Open surgery has even greater scene complexity and variability, so very few groups have examined anatomical identification in that setting. Our work seeks not only to develop a deep learning-enabled computer vision system that identifies and measures critical soft tissue during thyroidectomy, but also to lay the foundation for future translational work that brings AI-based vision and tissue discrimination to open surgery. Please see our paper [2] for further literature and details.

How we used data to find the recurrent laryngeal nerve (RLN)

To train a computer vision model to accurately segment the RLN, we first needed to collect image data containing the RLN. We used retrospectively acquired, de-identified images taken during thyroidectomy and obtained the necessary ethical approval (see our paper for details). In total, we collected 277 color (RGB) photographs from 130 patients, which contained a diverse array of procedure types and perspectives on surgical anatomy. Due to the high diversity of images in our dataset, we also labeled images for two image conditions: lighting and distance to surgical anatomy. Analyzing our segmentation results using these image meta-tags helped us to answer our second research question: How do operating room image capture conditions affect our segmentation method’s performance?
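As a rough illustration of this kind of per-condition analysis (a sketch, not the code used in the paper), the snippet below scores each predicted mask against its ground truth and then averages the scores within each meta-tag value. The Dice coefficient is assumed here as a common overlap metric, and the tag values (`bright`, `dim`, `close`, `far`) and scores are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as sets of (row, col) pixels."""
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def mean_dice_by_tag(records, tag):
    """Average the Dice score over all records that share the same value of `tag`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[tag]].append(record["dice"])
    return {value: mean(scores) for value, scores in groups.items()}


# Hypothetical per-image results: tag names, tag values, and scores are illustrative only.
records = [
    {"lighting": "bright", "distance": "close", "dice": 0.80},
    {"lighting": "bright", "distance": "far", "dice": 0.60},
    {"lighting": "dim", "distance": "close", "dice": 0.70},
    {"lighting": "dim", "distance": "far", "dice": 0.50},
]

by_lighting = mean_dice_by_tag(records, "lighting")
by_distance = mean_dice_by_tag(records, "distance")
```

Comparing `by_lighting` and `by_distance` side by side gives a quick view of which capture condition hurts segmentation quality more. With a dataset this small, any such comparison should be read alongside the number of images in each group.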

Data annotation using Hasty.ai. Note that this image is the same as Supplementary Figure S3 in the original paper, which is published by Scientific Reports.

To obtain the ground-truth RLN segmentations for these images, each image was carefully and manually annotated by surgeons and reviewed by a senior surgeon. We used the Hasty.ai platform to collect these annotations from our clinical collaborators; in particular, they used the polygon and brush tools to create detailed annotations that remained faithful to the borders of the nerve tissue. Surgeons also created segmentations for retractors, which we used in the second stage of our pipeline, nerve width estimation. We also used the bounding box tool to annotate the wound region in a subset of 136 images, which enabled us to train our cropping model. The cropping model crops each input image to the wound region before nerve segmentation, which reduces clutter in images that are not centered on the area of interest. Exporting the data in a format suitable for model training took only a few clicks on the Hasty.ai platform.
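The cropping step can be pictured with a small sketch. This is an illustration under stated assumptions, not the pipeline from the paper: the image is a plain list of pixel rows, the box is `(x_min, y_min, x_max, y_max)` with exclusive upper bounds, and the `margin` parameter and both helper names are hypothetical. A real pipeline would typically work on image arrays and take the box from the trained cropping model.

```python
def crop_to_box(image, box, margin=0):
    """Crop a row-major image (list of rows) to box = (x_min, y_min, x_max, y_max).

    The box is widened by `margin` pixels on every side and clamped to the
    image bounds. Returns the crop and the (x, y) offset of its top-left corner.
    """
    height, width = len(image), len(image[0])
    x_min, y_min, x_max, y_max = box
    left = max(0, x_min - margin)
    top = max(0, y_min - margin)
    right = min(width, x_max + margin)
    bottom = min(height, y_max + margin)
    crop = [row[left:right] for row in image[top:bottom]]
    return crop, (left, top)


def shift_polygon(polygon, offset):
    """Translate polygon vertices from full-image coordinates to crop coordinates."""
    dx, dy = offset
    return [(x - dx, y - dy) for x, y in polygon]


# Toy 6x8 "image" whose pixel values record their own (row, col) position.
image = [[(r, c) for c in range(8)] for r in range(6)]
wound_box = (2, 1, 6, 4)                  # hypothetical wound-region box
nerve_polygon = [(3, 2), (5, 2), (4, 3)]  # hypothetical annotation vertices (x, y)

crop, offset = crop_to_box(image, wound_box, margin=1)
nerve_in_crop = shift_polygon(nerve_polygon, offset)
```

Returning the offset along with the crop keeps annotations and predictions convertible between the two coordinate frames, which matters if masks predicted on the crop are later mapped back onto the full photograph.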

This collection of images taken following thyroidectomy, along with their annotations, is, to our knowledge, the largest such dataset to date. We are excited to be pushing the frontiers of surgical vision by presenting this dataset along with our end-to-end method.

Illustration of our dataset, which is diverse across both brightness and picture distance image conditions. Note that this image is the same as Figure 2 in the original paper, which is published by Scientific Reports.

Takeaways

Open surgery is a challenging environment for computer vision algorithms. In this work, we investigate anatomical identification during thyroidectomy, one common type of open surgery. Our end-to-end recurrent laryngeal nerve (RLN) segmentation and measurement method demonstrates the potential of using computer vision algorithms to augment intraoperative decision-making. Please see our paper for details on our methods, results, and analysis, which we hope will spur further research on integrating AI methods into open surgery and surgical workflows.

About Hasty.ai

Hasty.ai is an AgileML platform for building Vision AI-powered products. Annotate your image and video data, then build, optimize, and deploy your computer vision models into your product — faster and more reliably.

References

[1] Al-Qurayshi, Z., Robins, R., Hauch, A., Randolph, G. W. & Kandil, E. Association of Surgeon Volume With Outcomes and Cost Savings Following Thyroidectomy: A National Forecast. JAMA Otolaryngol. Head Neck Surg. 142, 32–39 (2016).

[2] Gong, J., Holsinger, F. C., Noel, J. E. et al. Using deep learning to identify the recurrent laryngeal nerve during thyroidectomy. Sci. Rep. 11, 14306 (2021). https://doi.org/10.1038/s41598-021-93202-y