FuseMedML — a framework accelerating AI-based discovery and code reuse in the biomedical field

Moshe Raboh
Published in PyTorch
Sep 5, 2022

Research is very exciting, always at the edge between the known and unknown.

It seems pretty intuitive that speeding up research iteration cycles will result in faster and better progress on research projects, but how can we do that?

Motivation — “Oh, the pain!”

We analyzed many medical-related ML research projects — spanning multiple modalities (imaging, clinical data, genomics, and more) and tasks (classification, segmentation, clinical conditions prediction, and more).

Three key findings surfaced time after time:

  1. Launching a new project (baseline implementation) took too long, even when very similar projects had already been done by the same lab. It didn’t matter which “hot” framework we used (PyTorch Lightning, fast.ai, everything we could lay our hands on).
  2. Porting individual components across projects was painful — resulting in researchers “reinventing the wheel” time after time.
  3. Collaboration between projects across modalities, as well as across domains such as imaging and molecules, was very challenging.

So we created a solution, FuseMedML 😊

We will reference FuseMedML as fuse from now on for brevity.

fuse is a Python framework that accelerates ML-based discovery in the medical field by encouraging code reuse. Batteries included :)

Using fuse internally¹ ² ³ ⁴ ⁵ ⁶, we experienced significant improvements: launching new projects took days instead of weeks, code components were suddenly reused across projects, and collaboration between teams working on different domains became efficient.
We were also able to easily and correctly measure project progress and the statistical significance of our results with off-the-shelf tools such as confidence intervals and model comparison. These tools also enabled the ISBI challenges we⁷ recently organized (KNIGHT and BRIGHT).

It was obvious to us that we would open-source it, as we felt that everybody should enjoy it.

How the magic happens

Three key concepts make fuse so flexible and encourage code reuse.

  1. Flexible data structure
    It’s a key driver of flexibility and enables easy multi-modality fusion. Data is kept in a nested (hierarchical) dictionary.
    A single instance can represent a single sample (denoted sample_dict), a minibatch (denoted batch_dict), or an entire epoch’s results (denoted epoch_results).
  2. Decoupled components
    The components are kept decoupled from one another and from the rest of the project, including the structure of the data and the model.
    To achieve this, each component is implemented so that its user can define input and output keys, which are used to read the input and store the output.
    For example:
    MetricAUCROC(pred="model.output.scores", target="data.label")
    will compute the AUC score given the predictions and target labels extracted from batch_dict using the keys specified in pred and target.
  3. Batteries included
    fuse comes with a collection of pre-implemented components (aka the batteries) that can be used to quickly build your required ML pipeline.
    In our vision, since the components in fuse are generic, each project using fuse will contribute and enrich the collection of the pre-implemented components.
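To make the first two concepts concrete, here is a minimal, framework-free sketch in plain Python (this is illustrative only, not the actual fuse API): a nested sample_dict addressed by dot-separated keys, and a component whose user chooses the input/output keys, keeping it decoupled from any particular data layout.

```python
# Illustrative plain-Python sketch of fuse-style nested dictionaries
# and key-driven components (not the actual fuse API).

def get_by_path(d, path):
    """Read a value from a nested dict using a dot-separated key."""
    for part in path.split("."):
        d = d[part]
    return d

def set_by_path(d, path, value):
    """Write a value into a nested dict, creating levels as needed."""
    parts = path.split(".")
    for part in parts[:-1]:
        d = d.setdefault(part, {})
    d[parts[-1]] = value

class MetricAccuracy:
    """A component decoupled from the data layout: the caller picks the keys."""
    def __init__(self, pred, target):
        self.pred_key, self.target_key = pred, target

    def __call__(self, batch_dict):
        preds = get_by_path(batch_dict, self.pred_key)
        targets = get_by_path(batch_dict, self.target_key)
        correct = sum(int(p == t) for p, t in zip(preds, targets))
        return correct / len(targets)

batch_dict = {}
set_by_path(batch_dict, "model.output.scores", [1, 0, 1, 1])
set_by_path(batch_dict, "data.label", [1, 0, 0, 1])
metric = MetricAccuracy(pred="model.output.scores", target="data.label")
print(metric(batch_dict))  # 0.75
```

The same wiring style carries over to any component: swapping the model or the data only requires changing the keys, not the component's code.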
(Figure: evaluation output for the KNIGHT ISBI challenge, as produced by fuse)

Let’s look at a concrete example: training a skin lesion classifier on dermoscopic images. More details about the data and task can be found here; the complete source code can be found here.

Data Pipeline

To implement the data processing pipeline, we use the fuse data package. It allows us to build an extremely flexible pipeline from reusable building blocks.
The data processing pipeline is built from a sequence of op(eration)s — the building blocks. Each operation gets as an input a sample_dict, a dictionary that stores all the necessary information about a sample processed so far. An operation typically gets also the keys specifying the location in sample_dict of the input to consider and where to store the output.
We split the pipeline into two parts: the first is called static_pipeline and the second, dynamic_pipeline.
The output of static_pipeline is cached to optimize the running time and maximize GPU utilization.

The dynamic pipeline continues the processing and typically includes operations we want to experiment with or operations with random behavior such as augmentations.

Once we have the static and dynamic pipelines, we can create a Dataset.
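The op-sequence idea can be sketched in a few lines of plain Python. This is a simplified stand-in, not the actual fuse API; the op names and the sample id below are hypothetical, and for simplicity a flat dict with dot-style keys stands in for the nested sample_dict.

```python
# Illustrative sketch of a static/dynamic op pipeline
# (plain Python, not the actual fuse API; op names are hypothetical).

class OpPipeline:
    """Applies a sequence of (op, kwargs) steps to a sample_dict."""
    def __init__(self, ops):
        self.ops = ops

    def __call__(self, sample_dict):
        for op, kwargs in self.ops:
            sample_dict = op(sample_dict, **kwargs)
        return sample_dict

def op_load_image(sample_dict, key_out):
    # A real op would read pixels from disk; here we fake a tiny image.
    sample_dict[key_out] = [0.1, 0.9, 0.5]
    return sample_dict

def op_scale(sample_dict, key_in, key_out, factor):
    sample_dict[key_out] = [v * factor for v in sample_dict[key_in]]
    return sample_dict

# Heavy, deterministic steps: their output can be cached.
static_pipeline = OpPipeline([
    (op_load_image, {"key_out": "data.img"}),
])
# Light or random steps (e.g. augmentations) run every epoch.
dynamic_pipeline = OpPipeline([
    (op_scale, {"key_in": "data.img", "key_out": "data.img_scaled", "factor": 2.0}),
])

sample_dict = dynamic_pipeline(static_pipeline({"sample_id": "sample_0"}))
print(sample_dict["data.img_scaled"])  # [0.2, 1.8, 1.0]
```

Splitting the ops this way is what makes caching safe: everything before the cache boundary is deterministic, while randomness stays in the dynamic part.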

fuse comes with generic utilities such as batch balancing, splitting the data into folds according to some criteria, and more.
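As an illustration of what one such utility does, here is a small plain-Python sketch (not the fuse API) of splitting sample ids into folds stratified by label, so each fold keeps a similar class balance:

```python
# Plain-Python sketch of a stratified fold split (not the actual fuse API).
from collections import defaultdict

def stratified_folds(sample_ids, labels, num_folds):
    """Assign samples to folds round-robin within each label group,
    so every fold gets a similar class balance."""
    by_label = defaultdict(list)
    for sid, label in zip(sample_ids, labels):
        by_label[label].append(sid)
    folds = [[] for _ in range(num_folds)]
    for group in by_label.values():
        for i, sid in enumerate(group):
            folds[i % num_folds].append(sid)
    return folds

ids = [f"s{i}" for i in range(6)]
labels = [0, 0, 0, 0, 1, 1]
folds = stratified_folds(ids, labels, num_folds=2)
print(folds)  # [['s0', 's2', 's4'], ['s1', 's3', 's5']]
```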

Model
fuse works with any PyTorch model; the only difference is that the model is expected to extract its input from a batch_dict and save its output into the dictionary. You can either implement a PyTorch model that does this directly or wrap an existing model with ModelWrapSeqToDict.
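The wrapping idea can be sketched in plain Python. This is a simplified stand-in for the concept behind ModelWrapSeqToDict, not fuse's actual implementation, and the toy model below is hypothetical:

```python
# Simplified stand-in for the wrapping idea behind ModelWrapSeqToDict
# (plain Python, not the actual fuse implementation).

class WrapSeqToDict:
    """Adapts a plain callable model: reads its positional inputs from
    batch_dict and stores its outputs back under the given keys."""
    def __init__(self, model, model_inputs, model_outputs):
        self.model = model
        self.model_inputs = model_inputs
        self.model_outputs = model_outputs

    def __call__(self, batch_dict):
        args = [batch_dict[key] for key in self.model_inputs]
        outputs = self.model(*args)
        if len(self.model_outputs) == 1:
            outputs = (outputs,)
        for key, value in zip(self.model_outputs, outputs):
            batch_dict[key] = value
        return batch_dict

# A toy "model" that returns a score per input value.
def toy_model(images):
    return [x * 0.5 for x in images]

wrapped = WrapSeqToDict(
    model=toy_model,
    model_inputs=["data.input.img"],
    model_outputs=["model.output.scores"],
)
batch_dict = wrapped({"data.input.img": [0.2, 0.8]})
print(batch_dict["model.output.scores"])  # [0.1, 0.4]
```

Because both inputs and outputs go through caller-specified keys, the same wrapped model plugs into any pipeline regardless of how the rest of the batch_dict is organized.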

Train

Before you ask: PyTorch Lightning and fuse play along very nicely, and in practice their benefits are orthogonal and additive.

Using fuse, you’ll quickly get an up-and-running training pipeline based on PyTorch Lightning. fuse offers a LightningModule implementation that fits many common cases while paving the way for more unique ones.

Inference and Evaluation

The evaluation package of fuse (fuse.eval) is a standalone library for evaluating ML models with various performance metrics.
It comes with advanced utilities such as confidence intervals, model comparison, model calibration, and more.

Here is a simple example of computing AUC with confidence interval:
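The original example code is not reproduced here; instead, here is a dependency-free plain-Python sketch of what such a computation does under the hood (not the fuse.eval API): ROC AUC via the Mann-Whitney formulation, with a percentile bootstrap confidence interval.

```python
# Dependency-free sketch of AUC with a bootstrap confidence interval
# (illustrates the computation; not the actual fuse.eval API).
import random

def auc(preds, targets):
    """Probability that a random positive scores above a random negative
    (ties count half): the Mann-Whitney formulation of ROC AUC."""
    pos = [p for p, t in zip(preds, targets) if t == 1]
    neg = [p for p, t in zip(preds, targets) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_with_ci(preds, targets, num_boot=2000, alpha=0.05, seed=0):
    """Point estimate plus a (1 - alpha) percentile bootstrap interval."""
    rng = random.Random(seed)
    n = len(preds)
    scores = []
    for _ in range(num_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        p = [preds[i] for i in idx]
        t = [targets[i] for i in idx]
        if len(set(t)) < 2:  # AUC is undefined without both classes
            continue
        scores.append(auc(p, t))
    scores.sort()
    lo = scores[int(alpha / 2 * len(scores))]
    hi = scores[int((1 - alpha / 2) * len(scores)) - 1]
    return auc(preds, targets), (lo, hi)

preds = [0.1, 0.3, 0.4, 0.8, 0.9, 0.2, 0.7, 0.6]
targets = [0, 0, 0, 1, 1, 0, 1, 1]
point, (low, high) = auc_with_ci(preds, targets)
print(point)  # 1.0
```

fuse.eval packages this kind of machinery (and model comparison, calibration, etc.) behind the same key-driven component interface shown earlier.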

If you want to know more and get involved, take a look at the repository: https://github.com/BiomedSciAI/fuse-med-ml. There you can find quick start instructions and end-to-end examples over public medical datasets.
Contributions are most welcome. Also check out BiomedSciAI, a GitHub organization we’ve recently created that hosts biomedical science tools.

References

[1] Raboh M, Levanony D, et al. Context in medical imaging: the case of focal liver lesion classification. SPIE medical imaging 2022; https://doi.org/10.1117/12.2609385
[2] Rabinovici S, Tlusty T, et al. Early prediction of metastasis in women with locally advanced breast cancer. SPIE medical imaging 2022; https://doi.org/10.1117/12.2613169
[3] Rabinovici S, Fernandez M, et al. Multimodal Prediction of Five-Year Breast Cancer Recurrence in Women Who Receive Neoadjuvant Chemotherapy. Cancers, 2022; https://doi.org/10.3390/cancers14163848
[4] Jubran I, Raboh M, et al. A Glimpse into the Future: Disease Progression Simulation for Breast Cancer in Mammograms. SASHIMI (MICCAI) 2021; https://doi.org/10.1007/978-3-030-87592-3_4
[5] Tlusty T, Ozery M, et al. Pre-biopsy Multi-class Classification of Breast Lesion Pathology in Mammograms. MLMI (MICCAI) 2021; https://doi.org/10.1007/978-3-030-87589-3_29
[6] Golts A, Khapun D, et al. An Ensemble of 3D U-Net Based Models for Segmentation of Kidney and Masses in CT Scans. KITS challenge (MICCAI) 2021; https://doi.org/10.1007/978-3-030-98385-7_14
[7] IBM Research, University of Minnesota, Cleveland Clinic, HES-SO, ICAR-CNR, IRCCS.

Moshe Raboh is a computer vision and machine learning research scientist in the Multimodal AI for Healthcare group at IBM Research.