Catalyst — A PyTorch Framework for Accelerated Deep Learning R&D

Catalyst Team
May 11

Authors: Sergey Kolesnikov, Catalyst Team Lead.
Acknowledgments: Catalyst team and collaborators.

Over the last decade, progress in Deep Learning has produced a wide range of projects and frameworks. PyTorch has become one of the most popular among researchers. Thanks to its Pythonic execution model and well-designed low-level API, it has attracted a lot of attention from the research community. Nevertheless, with great power comes great responsibility: when working with such low-level functions, users are likely to introduce bugs during research. Moreover, with the rise of hardware accelerators, it has become crucial to have a simple API for working efficiently with different hardware setups.

For the last three years, the Catalyst Team has been working on Catalyst, a high-level PyTorch framework for Deep Learning Research and Development. It focuses on reproducibility, rapid experimentation, and codebase reuse so you can create something new rather than write yet another training loop. You get metrics, model checkpointing, advanced logging, and distributed training support without boilerplate code and low-level bugs.

“Write code with PyTorch, accelerate it with Catalyst!”

In this post, I would like to share our vision of a high-level Deep Learning framework API and show the current state of development through various examples.

Deep Learning recap

A typical experiment has predefined stages, epochs, and data sources. You iterate over them, feed the model batches of data, and run an SGD update on each training batch. It looks straightforward, but things become complicated as the project grows and requires more deep learning tricks, like advanced metrics or hardware accelerators.
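The nesting of stages, epochs, data sources, and batches can be sketched in plain Python. This is a minimal, framework-free illustration, not Catalyst or PyTorch code: the toy task (fitting y = 2x + 1), the helper names `make_loader` and `run_experiment`, and the per-stage learning rate are all assumptions made for the example.

```python
import random


def make_loader(n_samples, batch_size, seed):
    """Build a list of batches of (x, y) pairs for the toy task y = 2x + 1."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        samples.append((x, 2.0 * x + 1.0))
    return [samples[i:i + batch_size] for i in range(0, n_samples, batch_size)]


def run_experiment(stages, loaders):
    """Run nested loops: stages -> epochs -> data sources -> batches."""
    w, b = 0.0, 0.0  # parameters of the linear model y_hat = w * x + b
    history = []
    for stage_name, num_epochs, lr in stages:              # stages
        for epoch in range(num_epochs):                    # epochs
            epoch_metrics = {}
            for loader_name, loader in loaders.items():    # data sources
                total_loss, n_seen = 0.0, 0
                for batch in loader:                       # batches
                    grad_w = grad_b = 0.0
                    for x, y in batch:
                        err = (w * x + b) - y
                        total_loss += err * err
                        grad_w += 2.0 * err * x
                        grad_b += 2.0 * err
                    n_seen += len(batch)
                    if loader_name == "train":
                        # SGD update on the mean squared error of this batch
                        w -= lr * grad_w / len(batch)
                        b -= lr * grad_b / len(batch)
                epoch_metrics[loader_name] = total_loss / n_seen
            history.append((stage_name, epoch, epoch_metrics))
    return w, b, history


# Two stages: a short low-learning-rate warmup, then the main stage.
stages = [("warmup", 5, 0.01), ("main", 45, 0.1)]
loaders = {
    "train": make_loader(64, 8, seed=0),
    "valid": make_loader(16, 8, seed=1),
}
w, b, history = run_experiment(stages, loaders)
```

Even in this toy version, metric bookkeeping and the train/valid distinction are already interleaved with the update rule, which is the kind of coupling that grows as a project adds more features.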

Catalyst

Runner

Runner abstract code

The Runner plays the most crucial role: it connects all the other abstractions and defines the whole experiment logic in one place. Most importantly, it does not force you to use Catalyst-only primitives. Instead, it lets you choose how much of the high-level API you want to take from the framework.

For example, you could:

Finally, the Runner architecture does not depend on PyTorch, which leaves room for adapting it to TensorFlow 2 or JAX.
Supported Runners are listed under the Runner API section.
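As a rough illustration of the idea that the Runner owns the loops while user code supplies only the batch logic, here is a minimal sketch in plain Python. `MiniRunner`, `MeanEstimator`, `is_train`, and `batch_metrics` are hypothetical names invented for this example; they are not Catalyst classes and do not reproduce its API.

```python
class MiniRunner:
    """Hypothetical stand-in for a Runner: owns the loops, delegates the batch step."""

    def __init__(self, num_epochs):
        self.num_epochs = num_epochs
        self.is_train = False
        self.batch_metrics = {}
        self.epoch_metrics = []

    def handle_batch(self, batch):
        """User-defined logic for a single batch."""
        raise NotImplementedError

    def run(self, loaders):
        for _ in range(self.num_epochs):
            per_loader = {}
            for loader_name, loader in loaders.items():
                self.is_train = loader_name.startswith("train")
                totals, n_batches = {}, 0
                for batch in loader:
                    self.batch_metrics = {}
                    self.handle_batch(batch)
                    for key, value in self.batch_metrics.items():
                        totals[key] = totals.get(key, 0.0) + value
                    n_batches += 1
                # average the per-batch metrics over the loader
                per_loader[loader_name] = {
                    key: value / max(n_batches, 1) for key, value in totals.items()
                }
            self.epoch_metrics.append(per_loader)
        return self


class MeanEstimator(MiniRunner):
    """User code: estimate the mean of the data with SGD on squared error."""

    def __init__(self, num_epochs, lr=0.1):
        super().__init__(num_epochs)
        self.mu = 0.0
        self.lr = lr

    def handle_batch(self, batch):
        loss = sum((self.mu - x) ** 2 for x in batch) / len(batch)
        grad = sum(2.0 * (self.mu - x) for x in batch) / len(batch)
        if self.is_train:
            self.mu -= self.lr * grad
        self.batch_metrics["loss"] = loss


loaders = {"train": [[1.0, 3.0], [2.0, 2.0]], "valid": [[2.0]]}
runner = MeanEstimator(num_epochs=50).run(loaders)
```

The subclass never writes a for-loop; swapping the model or the update rule only touches `handle_batch`.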

Engine

Engine abstract code

Thanks to the Engines’ design, it’s straightforward to adapt your pipeline to different hardware accelerators. For example, you could easily support a PyTorch distributed setup, an Nvidia Apex setup, or a distributed AMP setup. We are also working on support for other accelerators and backends, such as DeepSpeed, Horovod, and TPU.
You can follow Engine development progress under the Engine API section.

Callback

The Callback API mirrors the main for-loops of our training-loop abstraction:

Callback abstract code

You can find all supported callbacks under the Callback API section.
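To make the "one hook per loop boundary" idea concrete, here is a minimal plain-Python sketch. The `Callback` base class, the hook names, the `state` dictionary, and `run_loops` are assumptions made for illustration; the real hook names and signatures are documented in the Callback API section.

```python
class Callback:
    """Hypothetical base class: a no-op hook for each loop boundary."""

    def on_epoch_start(self, state): pass
    def on_loader_start(self, state): pass
    def on_batch_start(self, state): pass
    def on_batch_end(self, state): pass
    def on_loader_end(self, state): pass
    def on_epoch_end(self, state): pass


class EventRecorder(Callback):
    """Example callback: records which hooks fired and in what order."""

    def __init__(self):
        self.events = []

    def on_epoch_start(self, state):
        self.events.append(f"epoch_start:{state['epoch']}")

    def on_batch_end(self, state):
        self.events.append(f"batch_end:{state['loader']}:{state['batch_idx']}")

    def on_epoch_end(self, state):
        self.events.append(f"epoch_end:{state['epoch']}")


def run_loops(num_epochs, loaders, callbacks):
    """Walk the nested loops and fire every callback at each boundary."""
    state = {}

    def fire(event):
        for callback in callbacks:
            getattr(callback, event)(state)

    for epoch in range(num_epochs):
        state["epoch"] = epoch
        fire("on_epoch_start")
        for loader_name, loader in loaders.items():
            state["loader"] = loader_name
            fire("on_loader_start")
            for batch_idx, batch in enumerate(loader):
                state["batch_idx"], state["batch"] = batch_idx, batch
                fire("on_batch_start")
                # the batch would be handled here (forward pass, loss, update)
                fire("on_batch_end")
            fire("on_loader_end")
        fire("on_epoch_end")


recorder = EventRecorder()
run_loops(1, {"train": [10, 20], "valid": [30]}, [recorder])
```

Because every callback sees the same shared state, features such as checkpointing or early stopping can be added as separate callbacks without editing the loop itself.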

Metric

Metric abstraction code

You can find all supported metrics under the Metric API section.

The Catalyst Metric API has default update and compute methods to support per-batch statistic accumulation and final computation during training. All metrics also support key-value extensions of update and compute for convenient usage during the run. This gives you the flexibility to store any number of metrics or aggregations you want, with a simple communication protocol for logging them.
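The update/compute protocol and its key-value variants can be sketched as follows. `AccuracyMetric`, its `prefix` argument, and the exact method signatures are hypothetical and chosen only to mirror the description above; consult the Metric API section for the real classes.

```python
class AccuracyMetric:
    """Hypothetical metric: accumulates per-batch statistics, computes at the end."""

    def __init__(self, prefix=""):
        self.prefix = prefix
        self.reset()

    def reset(self):
        """Clear accumulated statistics, e.g. at the start of each loader."""
        self.correct = 0
        self.total = 0

    def update(self, predictions, targets):
        """Accumulate statistics for one batch and return the batch accuracy."""
        hits = sum(int(p == t) for p, t in zip(predictions, targets))
        self.correct += hits
        self.total += len(targets)
        return hits / len(targets)

    def update_key_value(self, predictions, targets):
        """Same as update, but returns a named value that is ready for logging."""
        return {f"{self.prefix}accuracy": self.update(predictions, targets)}

    def compute(self):
        """Accuracy over everything seen since the last reset."""
        return self.correct / self.total if self.total else 0.0

    def compute_key_value(self):
        return {f"{self.prefix}accuracy": self.compute()}


metric = AccuracyMetric(prefix="valid_")
first_batch = metric.update_key_value([1, 0, 1, 1], [1, 1, 1, 1])
second_batch = metric.update([0, 0], [0, 1])
final = metric.compute_key_value()
```

Note that the final value is computed from the accumulated counts rather than by averaging batch accuracies, which matters when batches differ in size.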

Logger

Logger abstract code

With such a simple API, we already provide integrations for the TensorBoard and MLflow monitoring systems. More advanced loggers for Neptune and Wandb, with artifact and hyperparameter storage, are in development thanks to joint collaborations between our teams.
All currently supported loggers can be found under the Logger API section.
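A logger in this sense is just an object that accepts named metric values together with some context. The sketch below is a hypothetical, stdlib-only illustration: `InMemoryLogger`, `TextLogger`, `log_everywhere`, and the `scope`/`step` parameters are invented for the example and are not the Catalyst logger interface.

```python
import io


class InMemoryLogger:
    """Hypothetical logger that keeps every record in a Python list."""

    def __init__(self):
        self.records = []

    def log_metrics(self, metrics, scope, step):
        for name, value in metrics.items():
            self.records.append(
                {"scope": scope, "step": step, "name": name, "value": float(value)}
            )


class TextLogger:
    """Hypothetical logger that writes one line per metric to a text stream."""

    def __init__(self, stream):
        self.stream = stream

    def log_metrics(self, metrics, scope, step):
        for name in sorted(metrics):
            self.stream.write(f"{scope}/{name} step={step} value={metrics[name]:.4f}\n")


def log_everywhere(loggers, metrics, scope, step):
    """Send the same metrics to every configured backend."""
    for logger in loggers:
        logger.log_metrics(metrics, scope=scope, step=step)


buffer = io.StringIO()
memory_logger = InMemoryLogger()
log_everywhere([memory_logger, TextLogger(buffer)], {"loss": 0.5}, scope="epoch", step=3)
```

Since both backends share one method signature, adding a new monitoring system means writing one more small class rather than touching the training code.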

Examples

PyTorch way — for-loop decomposition with Catalyst

CustomRunner — PyTorch for-loop decomposition

Python API — user-friendly Deep Learning R&D

Linear Regression
Hyperparameter optimization with Optuna

All the above examples help you write fully compatible PyTorch code without any external mixins. No custom modules or datasets are required: everything works natively with a PyTorch codebase, while Catalyst ties it together in a more readable and reproducible way.

For more advanced examples, like GANs, VAE, or multistage runs (another unique feature of Catalyst), please see our minimal examples section.

The Catalyst Python API supports various user-friendly tricks, like overfit, fp16, ddp, and more, to make it easy for you to debug and speed up your R&D. To read more about these features, please see our .train documentation. A small example:

full-featured MNIST example in only 60 lines of code

Config API — from research to production

Config API

Config API examples can be found here. As you can see, the Config API fully mirrors the Runner specification in YAML, allowing you to change any part of your experiment without changing any code.
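The general mechanism behind a config-driven experiment, where a text file names the components and their parameters and the code builds them, can be sketched with a small registry. This is a hypothetical illustration only: it uses JSON instead of YAML to stay within the standard library, and the `_target_` key, the `REGISTRY` dictionary, and the toy `SGD` and `Adam` classes are assumptions for the example, not the Catalyst config schema.

```python
import json

REGISTRY = {}


def register(cls):
    """Make a class constructible by name from a config file."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class SGD:
    """Toy optimizer settings holder (not torch.optim.SGD)."""

    def __init__(self, lr=0.01, momentum=0.0):
        self.lr = lr
        self.momentum = momentum


@register
class Adam:
    """Toy optimizer settings holder (not torch.optim.Adam)."""

    def __init__(self, lr=0.001):
        self.lr = lr


def build(spec):
    """Instantiate the class named by "_target_" with the remaining keys as kwargs."""
    params = dict(spec)
    cls = REGISTRY[params.pop("_target_")]
    return cls(**params)


CONFIG_TEXT = """
{
  "num_epochs": 5,
  "optimizer": {"_target_": "SGD", "lr": 0.1, "momentum": 0.9}
}
"""

config = json.loads(CONFIG_TEXT)
optimizer = build(config["optimizer"])
```

Switching from `SGD` to `Adam` here is a one-line change in the config text, with no edits to the Python code.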

Thanks to this way of storing hyperparameters, it’s also very easy to run hyperparameter optimization with catalyst-dl tune. You can find an example for catalyst-dl tune under the Config API minimal example section. Once again, you can tune any part of your experiment by changing only a few lines in your YAML file. It’s that simple.

Over the last three years, we have done an enormous amount of work to accelerate Deep Learning R&D in a purely open-source way, thanks to our team and contributors. In this post, we have covered the current framework design principles and a few minimal examples, so you can speed up your Deep Learning work with Catalyst and make it fully reproducible.

If you are interested in Catalyst usage:

If you are interested in Catalyst development:

If you are motivated by our vision of an open-source Deep Learning R&D ecosystem around Catalyst, you can support our initiative here or write directly to team@catalyst-team.com for collaboration.
