AutoSCOPE: The Assembly Line of AI Model Development at Lunit

Published in Lunit Team Blog

9 min read · May 13, 2024

What is AutoSCOPE?

Welcome to the future of AI model development, where building AI models is as systematic and efficient as Henry Ford’s pioneering assembly lines in the automotive industry. Just as Henry Ford’s assembly lines dramatically increased manufacturing efficiency, AutoSCOPE transforms AI model development within Lunit’s oncology department. It automates the complex and often repetitive tasks associated with developing AI models.

Assembly line for building deep learning models (Drawn with GPT-4o)

AutoSCOPE serves as an all-in-one AI development framework, guiding users through the entire lifecycle of AI model development, from managing datasets and training models to making predictions on whole slide images. Its user-friendly web interface simplifies AI model development, so a proof-of-concept model can be built with just a few clicks. AutoSCOPE ensures that every step is clearly defined and integrated, making the process highly efficient and user-centric.

Why Do We Need AutoSCOPE?

In the fast-paced world of medical research and pharmaceuticals, the traditional method of developing AI models is slow and filled with complex steps. For example, our internal guides for training AI models and for running inference each involve nearly ten manual steps, which shows just how cumbersome the old way can be. Moreover, there are not enough researchers to develop multiple models simultaneously. Additionally, the need for constant communication among various teams within the oncology department creates further delays and complications. This is where AutoSCOPE comes in, not just as a tool, but as a strategic asset designed to overcome these challenges.

Accelerated Development Cycle: AutoSCOPE streamlines the entire process, from data handling to whole slide image inference, significantly cutting down the time and steps involved.
Exploring More Product Opportunities: AutoSCOPE provides the scalability needed to explore more opportunities in oncology research, especially in collaboration with pharmaceutical companies. It simplifies technical processes, allowing our team to efficiently explore new products without being hindered by complexity.
Improving Team Coordination: AutoSCOPE serves as a central hub for all development activities, which helps different teams work together more effectively. It reduces the need for constant meetings and updates, which can slow down progress.
Allowing Domain Experts to Train Models: AutoSCOPE enables domain experts, including medical directors who might lack AI expertise, to directly train AI models and integrate their domain knowledge. This capability helps ensure that the AI models developed are both accurate and clinically relevant, enhancing the effectiveness of diagnostic tools.

Introduction to AutoSCOPE Workflow

Workflow in AutoSCOPE: Example scenario illustrating the step-by-step process. INCL is Lunit’s cloud training platform and AICP is an internal annotation tool.

AutoSCOPE simplifies AI model development into clear, manageable steps. From the initial setup of projects to the final inference on whole slide images, it provides a structured workflow that guides users seamlessly through each part of the process. Let’s dive into each step to see how AutoSCOPE streamlines these complex tasks!

Creating a Project

In AutoSCOPE, a project acts as the foundational element for managing and organizing the AI model development process. Each project is associated with a specific product and organ, setting the focus for the research or application at hand. The project encapsulates all essential components, from datasets to inference models, and serves as a centralized hub for navigating and managing these elements.

Image Ingestion

At this stage, whole slide images (WSIs) along with their metadata are uploaded. This process is enhanced by features for searching, filtering, and sorting within the table, allowing users to efficiently locate and organize images based on specific criteria. Each image’s detailed page provides further information, including file paths and associated datasets, facilitating a comprehensive understanding of each image’s role in various projects. This streamlined approach supports effective management and use of image data essential for AI model development in oncology.

Image management in AutoSCOPE: Efficient handling of WSIs and metadata for AI development
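To make the searching, filtering, and sorting described above concrete, here is a minimal Python sketch. The `SlideRecord` fields, the `find_slides` helper, and the sample paths are all hypothetical; the post does not describe the actual AutoSCOPE metadata schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlideRecord:
    # Hypothetical metadata fields chosen for illustration only.
    slide_id: str
    organ: str
    stain: str
    file_path: str


def find_slides(records, organ=None, stain=None, sort_key="slide_id"):
    """Keep records matching the given criteria, then sort by one field."""
    matches = [
        r for r in records
        if (organ is None or r.organ == organ)
        and (stain is None or r.stain == stain)
    ]
    return sorted(matches, key=lambda r: getattr(r, sort_key))


# Made-up example records.
records = [
    SlideRecord("S003", "lung", "H&E", "/data/wsi/S003.svs"),
    SlideRecord("S001", "lung", "IHC", "/data/wsi/S001.svs"),
    SlideRecord("S002", "breast", "H&E", "/data/wsi/S002.svs"),
]

lung_slides = find_slides(records, organ="lung")
print([r.slide_id for r in lung_slides])  # ['S001', 'S003']
```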

Dataset Ingestion

In AutoSCOPE, dataset ingestion involves organizing patches extracted from whole slide images (WSIs), annotated at various levels like cell, tissue, and organ, ready for model training. On the dataset detail page, users can view essential details. Visualization tools on this page allow users to view the actual patches and their annotations, enhancing the analysis process. Additionally, dataset statistics provide valuable insights into sample distributions, helping users understand data characteristics more deeply.

Dataset Visualizer in AutoSCOPE: Viewing and analyzing annotated patches from WSIs

Dataset Statistics in AutoSCOPE: Insights into sample distributions for informed model training
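As a rough illustration of what a sample-distribution statistic could look like, the sketch below counts annotations per class. The annotation format (a list of dicts with a `label` key) and the class names are assumptions made for this example, not the AutoSCOPE data model.

```python
from collections import Counter


def class_distribution(annotations):
    """Return {label: (count, fraction)} over all annotations."""
    counts = Counter(a["label"] for a in annotations)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Made-up cell-level annotations from two patches.
annotations = [
    {"patch_id": "p1", "label": "tumor_cell"},
    {"patch_id": "p1", "label": "tumor_cell"},
    {"patch_id": "p2", "label": "lymphocyte"},
    {"patch_id": "p2", "label": "tumor_cell"},
]

for label, (count, fraction) in class_distribution(annotations).items():
    print(f"{label}: {count} ({fraction:.0%})")
```

A skewed distribution like this one is the kind of signal that would suggest collecting more samples of the rare class before training.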

Running Experiments

AutoSCOPE automates the model tuning process using advanced AutoML algorithms. This step allows users to effortlessly train AI models while monitoring various performance metrics and visualizations.

Configuring the Experiment: To start a new experiment, users select the project and the dataset version on which the models will be trained. They also choose the type of model to train, such as a cell detection model. AutoSCOPE provides a pre-defined model architecture and learning procedures, where users only need to define the metrics for determining the optimal model. Users can enhance the starting point of the models by selecting a pretrained model from the Model Zoo, which can significantly improve performance and efficiency.

Setting up an experiment: Only a few clicks are needed for model training
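The choices a user makes when configuring an experiment can be pictured as a small configuration object. The sketch below is only an illustration: `ExperimentConfig`, its field names, and the example values are hypothetical, since the web interface hides the real representation.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ExperimentConfig:
    # Field names are assumptions that mirror the choices described above.
    project: str
    dataset_version: str
    model_type: str                          # e.g. "cell_detection"
    selection_metric: str                    # metric used to pick the best model
    pretrained_model: Optional[str] = None   # optional Model Zoo entry

    def validate(self):
        """Reject configurations with an empty required field."""
        required = ("project", "dataset_version", "model_type", "selection_metric")
        missing = [name for name in required if not getattr(self, name)]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        return self


config = ExperimentConfig(
    project="lung-poc",
    dataset_version="v3",
    model_type="cell_detection",
    selection_metric="f1",
).validate()
print(json.dumps(asdict(config), indent=2))
```

Everything else, such as the architecture and the learning procedure, would come from predefined defaults, which is why so few inputs are needed.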

Monitoring and Adjusting: Once the experiment is initiated, users can monitor its progress and status through the experiment management table. The detailed page for each experiment provides insights into the settings, status, and the outcomes of individual experiment jobs. Users can track the success of each job and analyze visual results to refine their models further.

Model monitoring: Tracking experiment progress and predictions

Model Output Visualizer: Displaying and analyzing AI model results

Advanced Customization: For advanced users, AutoSCOPE allows detailed customization of the experiment settings through the advanced settings interface, including modifying the hyperparameter optimization settings, using custom loss functions, and tweaking other experiment parameters. These customizations can be applied to adapt the training process to specific research needs or experimental goals.

Advanced Customization: Tailoring experiment settings to specific research needs

Model Archiving

After a model is trained, we conduct both metric comparisons and visual inspections to identify the most promising model. Once selected for its effectiveness, the model is archived using TorchServe, ensuring it is readily accessible for future inference tasks. To facilitate archiving, users simply choose the desired model from the experiment run. This process automatically captures all essential details, enhancing traceability.
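For readers unfamiliar with TorchServe, a model archive (`.mar` file) is typically produced with the `torch-model-archiver` command-line tool. The sketch below only builds such a command and prints it. How AutoSCOPE invokes the archiver internally is not described here, so the helper name, model name, and file paths are hypothetical.

```python
import shlex


def build_archive_command(model_name, version, weights_path, handler, export_dir):
    """Assemble a torch-model-archiver invocation as an argument list."""
    return [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", weights_path,   # trained weights from the chosen run
        "--handler", handler,                # built-in handler name or path to a handler script
        "--export-path", export_dir,         # directory where the .mar file is written
    ]


# Hypothetical names and paths, for illustration only.
command = build_archive_command(
    model_name="cell_detector",
    version="1.0",
    weights_path="runs/exp_042/best.pt",
    handler="handlers/cell_detection_handler.py",
    export_dir="model_store",
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, provided the `torch-model-archiver` package is installed and the referenced files exist.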

Making a Package

This step involves compiling a package that includes a set of model archives tailored for specific applications. These packages are designed to meet the diverse needs of the oncology department, allowing for flexible and targeted applications of the models developed. Each package’s details and history are accessible for review, and users can download the package or export it as a Docker image for deployment. This functionality facilitates the sharing of models, allowing them to be easily integrated and utilized in diverse operational environments.

Creating a Package: Compiling and customizing model archives for targeted applications
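One way to think about a package is as a manifest that lists the model archives it bundles. The following sketch assumes a simple JSON manifest; the `ArchiveRef` fields, the `build_manifest` helper, and the example names are hypothetical and do not reflect the actual AutoSCOPE package format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ArchiveRef:
    # Hypothetical description of one archived model inside a package.
    name: str
    version: str
    task: str


def build_manifest(package_name, archives):
    """Bundle archive references into one manifest, rejecting duplicate names."""
    names = [a.name for a in archives]
    if len(names) != len(set(names)):
        raise ValueError("duplicate archive name in package")
    return {"package": package_name, "archives": [asdict(a) for a in archives]}


manifest = build_manifest(
    "lung-io-package",
    [
        ArchiveRef("cell_detector", "1.0", "cell_detection"),
        ArchiveRef("tissue_segmenter", "2.1", "tissue_segmentation"),
    ],
)
print(json.dumps(manifest, indent=2))
```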

WSI Inference

The final step involves using the packages to perform inference on WSIs, generating detailed visualizations to support further research and clinical decisions. This process involves two key components: a package containing trained models, and a list of valid WSIs specified as an inference context. Advanced settings can be adjusted to optimize resource use and inference speed, such as the number of parallel instances and memory requirements.

Upon completion, users can access the inference results through links to raw output files or visual interfaces for detailed visualization, ensuring comprehensive review and analysis of the data processed.
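The "number of parallel instances" setting can be illustrated with a small worker pool that processes a list of WSIs concurrently. This is a sketch under stated assumptions: `run_inference` and `fake_predict` are hypothetical stand-ins, and a real deployment would call the packaged models instead of the placeholder function.

```python
from concurrent.futures import ThreadPoolExecutor


def run_inference(slide_paths, predict, num_workers=2):
    """Run `predict` on every slide with `num_workers` parallel workers."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        outputs = list(pool.map(predict, slide_paths))
    return dict(zip(slide_paths, outputs))


def fake_predict(path):
    # Placeholder for a real model call; returns a made-up summary.
    return {"slide": path, "num_cells": len(path)}


results = run_inference(["a.svs", "bb.svs", "ccc.svs"], fake_predict, num_workers=2)
for slide, output in results.items():
    print(slide, output)
```

Raising the worker count shortens wall-clock time only while memory and accelerator capacity allow it, which is why both settings are exposed together.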

WSI Visualizer: Displaying and analyzing AI model results

How AutoSCOPE Simplifies Complex AI Development

Building AI models used to involve a lot of manual work and constant back-and-forth between different teams, which could slow things down and make the process complicated. AutoSCOPE changes this by simplifying and speeding up how AI models are developed. If you’re wondering how this was possible, here’s a breakdown of the challenges we faced and the solutions that made it possible:

Developing a Robust Foundation: The effectiveness of AutoSCOPE starts with its core — advanced models and algorithms that have been refined over many years by many researchers at Lunit. Building this foundation required extensive expertise and persistent effort to ensure reliability and quality. These models provide a strong, consistent starting point, simplifying the initial steps of any project.
Process Standardization: Simplifying the AI development process involves more than just automation; it requires carefully designing the system to hide unnecessary complexities while retaining essential functionalities. Standardizing these processes was a major challenge that involved understanding the fine details of various development tasks and figuring out which parts could be streamlined without losing effectiveness. This effort enables AutoSCOPE to automate many steps, reducing manual work and the need for continuous communication between teams.
Advanced AutoML Algorithm: Integrating automated hyperparameter optimization into AutoSCOPE was another challenge. The optimizer tunes training hyperparameters to get the best performance on the chosen metric. Developing a system that consistently matches or exceeds expert performance involved a lot of testing and refinement to ensure reliability across different scenarios.
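To give a feel for automated hyperparameter optimization, here is a minimal random-search sketch. Random search is used only because it is the simplest baseline. The post does not say which AutoML algorithm AutoSCOPE uses, and the search space and toy objective below are made up.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Stand-in for "train a model and return its validation metric".
    # By construction it peaks at lr=1e-3 and batch_size=32.
    return -abs(params["lr"] - 1e-3) * 1000 - abs(params["batch_size"] - 32) / 32


space = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [16, 32, 64]}
best_params, best_score = random_search(toy_objective, space, n_trials=20)
print(best_params, best_score)
```

In practice each call to the objective is a full training job, so smarter strategies that reuse information from earlier trials are usually preferred over pure random sampling.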

By tackling these complex challenges, AutoSCOPE offers a powerful tool that lets users concentrate on using AI effectively, without getting overwhelmed by the detailed steps of the development process.

How AutoSCOPE Improved Model Development Process

AutoSCOPE has significantly improved how AI models are developed, as shown by its use in recent projects. By automating and simplifying the development process, AutoSCOPE has proven its effectiveness in real-world scenarios:

Efficient Development of New Products: Recently, AutoSCOPE was utilized in developing a new product. Traditionally, such development required manual coding and configuration, which not only extended the timeline but also increased the likelihood of errors. With AutoSCOPE, these steps were condensed into a few simple clicks.

Reduced Unnecessary Communications: Originally, developing AI models involved multiple teams, which could slow down the process with back-and-forth communications. AutoSCOPE allows users to run analyses and refine models without always needing to consult AI researchers. This reduction in unnecessary communications speeds up the overall workflow and makes the process more efficient.

Validated Performance in Real-World Applications: AutoSCOPE has also been tested on existing products to ensure it works well in real-world settings. The resulting models performed as well as or better than the released models and were developed much more efficiently. This shows that, despite simplifying the development process, AutoSCOPE does not compromise on the quality or effectiveness of the models.

Comparing Model Performance: AutoSCOPE vs. Released Models. AutoSCOPE matches or exceeds the released models in real-world applications

What’s Next for AutoSCOPE?

Looking ahead, our focus will be on enhancing AutoSCOPE’s capabilities through the development of a “closing the loop” system. This next phase is crucial as it incorporates active learning and label refinement to significantly improve the overall performance and accuracy of our models.

Active Learning: This approach will concentrate our training efforts on the problematic data points where the AI currently struggles. By identifying and prioritizing these challenging areas, we can make our AI models more robust and capable of handling a wider variety of scenarios.

Label Refinement: Annotating histology images is complex and prone to inaccuracies. Imperfect annotations can lead to a decrease in model performance because the training data may not accurately represent the true features of the images. By correcting and updating labels, we ensure our AI trains on accurate data, which could significantly boost accuracy.

Illustration of the closing-the-loop process: Green shows the current training scheme, blue highlights active learning on challenging data, and red indicates label refinement for improved accuracy
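The two ideas can be sketched in a few lines of Python. This is an illustrative assumption rather than the planned AutoSCOPE design: active learning is shown as uncertainty sampling by predictive entropy, and label refinement as flagging samples where the model confidently disagrees with the existing label. All function names, class names, and numbers are made up.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Active learning: pick the `budget` samples the model is least sure about."""
    ranked = sorted(predictions, key=lambda s: entropy(predictions[s]), reverse=True)
    return ranked[:budget]


def flag_suspect_labels(predictions, labels, class_names, threshold=0.9):
    """Label refinement: flag samples where a confident prediction contradicts the label."""
    suspects = []
    for sample_id, probs in predictions.items():
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold and class_names[top] != labels[sample_id]:
            suspects.append(sample_id)
    return suspects


class_names = ["tumor", "stroma"]
predictions = {            # made-up model outputs per patch
    "p1": [0.98, 0.02],
    "p2": [0.55, 0.45],
    "p3": [0.05, 0.95],
    "p4": [0.70, 0.30],
}
labels = {"p1": "stroma", "p2": "tumor", "p3": "stroma", "p4": "tumor"}

print(select_for_annotation(predictions, budget=2))           # ['p2', 'p4']
print(flag_suspect_labels(predictions, labels, class_names))  # ['p1']
```

Flagged samples would still go back to a human annotator for review; a confident model can be confidently wrong.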

Enhancing data quality is fundamental to improving model performance. Given the extensive range of repetitive tasks involved in refining data and labels, manual handling is not only inefficient but also prone to errors. Automating this pipeline within AutoSCOPE will not only streamline these processes but also add substantial value to the development of scope models.

As we continue to refine and expand AutoSCOPE’s capabilities, we are excited about the future possibilities it holds. If you’re passionate about building and optimizing deep learning systems and want to work on cutting-edge technology that’s making a big impact in the industry, consider joining Lunit’s team!

Acknowledgements

Many thanks to the developers Gihyeon Lee, Aaron Valero, Geonwoon Jang, and Keunhyung Chung for making AutoSCOPE possible. Also, thank you to the MCAI, DCAI, MPM, DM, BMRS, AIP, and PD teams for their helpful feedback in making our model development process better.