Catalyst dev blog - 20.07 release
Hi, I am Sergey, the author of Catalyst, a PyTorch library for deep learning research and development. In our previous blog posts, we covered an introduction to Catalyst and our advanced NLP pipeline for BERT distillation. In this post, I would like to share our development progress over the last month. Let’s check what features we have added to the framework in such a short time.
tl;dr
- Training Flow improvements: BatchOverfitCallback, PeriodicLoaderCallback, ControlFlowCallback
- Metric Learning features: InBatchSampler, AllTripletsSampler, HardTripletsSampler, tutorial
- Fixes and acknowledgments
- New integrations: MONAI & Catalyst
- Ecosystem update — Alchemy
You can find all examples from this blog post in this Google Colab.
Training Flow improvements
BatchOverfitCallback
A good deep learning experience takes more than cool engineering features like distributed support, half-precision training, and metrics (we already have them). You also have to think about the common difficulties that come up during experimentation.
Imagine a typical research situation: you wrote your fancy pipeline, got the dataset, and tried to fit your model to the data. But something goes wrong, and you can’t get the desired results.
One potential cause is a problem with pipeline convergence. You could subsample your data and check that the model easily overfits this subset. But would you do that again and again across all your projects? It looks like we need a general solution to this problem, and here comes our BatchOverfitCallback (contributed by Scitator). The idea behind it is straightforward: take only the requested number of batches from your dataset and use only them for training.
So, let’s check a typical deep learning pipeline:
You can run it with:
Thanks to the update, you can check your pipeline convergence with only one extra line:
This way you can easily debug your experiment without extra code refactoring. You could also redefine the required number of batches per loader.
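To make the idea concrete, here is a minimal plain-Python sketch of "keep only the first N batches per loader". It is not Catalyst's implementation: the helper name `limit_batches`, its arguments, and the toy loaders are assumptions made for illustration, and any iterable of batches stands in for a real data loader.

```python
from itertools import islice


def limit_batches(loaders, limits):
    """Return a copy of `loaders` where each listed loader keeps only its first N batches.

    `loaders` maps a loader name to any iterable of batches.
    `limits` maps a loader name to the number of batches to keep.
    Loaders without a limit are passed through unchanged.
    """
    limited = {}
    for name, loader in loaders.items():
        if name in limits:
            # Materialize the first N batches so every epoch reuses the same ones.
            limited[name] = list(islice(loader, limits[name]))
        else:
            limited[name] = loader
    return limited


# Toy "loaders": each batch is just a list of sample indices.
loaders = {
    "train": [[i, i + 1] for i in range(0, 100, 2)],  # 50 batches
    "valid": [[i, i + 1] for i in range(0, 20, 2)],   # 10 batches
}
debug_loaders = limit_batches(loaders, {"train": 3})
print(len(debug_loaders["train"]), len(debug_loaders["valid"]))  # prints: 3 10
```

If the model cannot drive the loss close to zero on such a tiny fixed subset, the problem is more likely in the pipeline than in the amount of data.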
What is even cooler, we have integrated this feature into our Config API. You can use it with:
catalyst-dl run --config=/path/to/config --overfit
PeriodicLoaderCallback
During your research, you may find yourself with only a few train samples and a huge test set for checking model performance. Alternatively, you may have computationally heavy validation (for example, the NMS stage in anchor-box object detection) that takes up too much of your training time. You can increase the train set for each epoch with BalanceClassSampler, but what if you want to keep your training data unchanged? Try our new PeriodicLoaderCallback (contributed by Ditwoo).
For the example above you can set a validation run every 2 epochs:
Thanks to Catalyst’s design, this extends to any number of data sources:
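The scheduling logic behind periodic loaders can be sketched in a few lines of plain Python. This is an illustration, not the callback itself: the function `active_loaders` is hypothetical, and the conventions it uses (1-indexed epochs, a missing period meaning "every epoch", a period of 0 disabling a loader) are assumptions of this sketch.

```python
def active_loaders(epoch, loader_names, periods):
    """Return the loaders that should run at `epoch` (1-indexed).

    `periods` maps a loader name to how often it runs, in epochs.
    Assumed conventions for this sketch: a missing entry means "every epoch",
    and a period of 0 disables the loader entirely.
    """
    selected = []
    for name in loader_names:
        period = periods.get(name, 1)
        if period > 0 and epoch % period == 0:
            selected.append(name)
    return selected


# Validation every 2 epochs, training every epoch.
names = ["train", "valid"]
schedule = {epoch: active_loaders(epoch, names, {"valid": 2}) for epoch in range(1, 5)}
for epoch, selected in schedule.items():
    print(epoch, selected)

# The same rule covers any number of data sources.
many = ["train", "train_additional", "valid", "valid_additional"]
periods = {"train_additional": 2, "valid": 3, "valid_additional": 0}
print(active_loaders(6, many, periods))
```

Because the decision depends only on the epoch number and a per-loader period, adding another data source is just one more dictionary entry.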
ControlFlowCallback
After PeriodicLoaderCallback, we asked ourselves: “If you can enable or disable data sources, why can’t you do the same with metrics and entire Callbacks?” For example, you may have a metric that you don’t want to compute during the training or validation stage. With ControlFlowCallback (contributed by Ditwoo), this is easy to do:
Now you can define the loaders and epochs for which a Callback should run or be ignored.
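The wrapping pattern can be sketched in plain Python as follows. The class `ConditionalCallback`, its arguments, and the `record_metric` function are hypothetical names for this illustration and do not reproduce Catalyst's actual interface; the sketch only shows how a wrapper can decide, per loader and per epoch, whether to forward a call to the wrapped callback.

```python
class ConditionalCallback:
    """Hypothetical wrapper: forwards calls to `callback` only when allowed."""

    def __init__(self, callback, loaders=None, ignore_loaders=None, ignore_epochs=None):
        self.callback = callback
        # `loaders=None` means "no whitelist": every loader is allowed.
        self.loaders = set(loaders) if loaders is not None else None
        self.ignore_loaders = set(ignore_loaders or [])
        self.ignore_epochs = set(ignore_epochs or [])

    def is_active(self, epoch, loader):
        if epoch in self.ignore_epochs or loader in self.ignore_loaders:
            return False
        return self.loaders is None or loader in self.loaders

    def on_batch_end(self, epoch, loader, batch):
        # Skip the wrapped callback entirely when it is not active.
        if self.is_active(epoch, loader):
            self.callback(epoch, loader, batch)


calls = []


def record_metric(epoch, loader, batch):
    """Stand-in for a metric computation; it only records when it was called."""
    calls.append((epoch, loader, batch))


# Skip the metric on the train loader and during the first epoch.
wrapped = ConditionalCallback(record_metric, ignore_loaders=["train"], ignore_epochs=[1])
for epoch in (1, 2):
    for loader in ("train", "valid"):
        wrapped.on_batch_end(epoch, loader, batch=0)
print(calls)  # prints: [(2, 'valid', 0)]
```

Keeping the condition in a wrapper means the wrapped callback needs no changes and does not have to know about loaders or epochs at all.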
Metric Learning features
I also want to preview some extra updates in this release. For the last month, we have been working hard on a foundation for Metric Learning research. We have prepared several InBatchTripletsSamplers (contributed by AlekseySh), helper modules for online triplet mining during training:
- AllTripletsSampler to select all possible triplets for the anchors
- HardTripletsSampler to select the hardest triplets based on distances between samples
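The two mining strategies listed above can be sketched in plain Python. This is not the code of the samplers themselves: the functions `all_triplets` and `hard_triplets`, the toy points, and the use of Euclidean distance are assumptions of this sketch, and "hardest" is taken in the common batch-hard sense (farthest positive, closest negative for each anchor).

```python
from itertools import permutations
from math import dist


def all_triplets(labels):
    """Every (anchor, positive, negative) index triplet allowed by the labels."""
    indices = range(len(labels))
    return [
        (a, p, n)
        for a, p in permutations(indices, 2)
        if labels[a] == labels[p]
        for n in indices
        if labels[n] != labels[a]
    ]


def hard_triplets(points, labels):
    """For each anchor, pick the farthest positive and the closest negative."""
    triplets = []
    for a, anchor in enumerate(points):
        positives = [i for i, lbl in enumerate(labels) if lbl == labels[a] and i != a]
        negatives = [i for i, lbl in enumerate(labels) if lbl != labels[a]]
        if not positives or not negatives:
            continue  # no valid triplet for this anchor
        hardest_pos = max(positives, key=lambda i: dist(anchor, points[i]))
        hardest_neg = min(negatives, key=lambda i: dist(anchor, points[i]))
        triplets.append((a, hardest_pos, hardest_neg))
    return triplets


# Toy batch: five 2D embeddings with two classes.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (1.5, 0.0), (5.0, 0.0)]
labels = [0, 0, 0, 1, 1]

print(len(all_triplets(labels)))  # prints: 18
print(hard_triplets(points, labels))
```

Enumerating all triplets grows quickly with batch size, while hard mining keeps one triplet per anchor, which is the usual trade-off between coverage and cost.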
We hope these abstractions will help in your research. We are now working on a minimal Metric Learning example to create a starting benchmark for this case. Stay tuned for the upcoming tutorial.
Fixes
Last but not least, as with every release, this one comes with a few fixes:
- thanks to Oleksii Sliusarenko, we fixed our “first epoch” issue with EarlyStoppingCallback
- with Lokesh Nandanwar’s support, we made our OneCycleLRWithWarmup great again
- a number of GitHub and catalyst-codestyle improvements by Yauheni Kachan
Integrations — MONAI segmentation example
In collaboration with the MONAI team, we have prepared an introductory tutorial on 3D image segmentation with the MONAI and Catalyst frameworks.
Plans
We still have a lot of plans:
- TPU support: with current CPU, GPU, and Slurm support, we want to push the frontiers and bring Catalyst to the fancy TPU
- kornia integration: we already have a native integration with the famous albumentations library, but… why shouldn’t we make a fair comparison between the alternatives and take the best for our customers? Stay tuned for an upcoming benchmark of image augmentation libraries by Catalyst-Team
- model auto-pruning: since Catalyst is a framework for deep learning research and development, and we already support model auto-tracing, we want to introduce framework support for model auto-pruning.
Ecosystem release — Alchemy
Along with this Catalyst release, we have another piece of great news: we are moving our ecosystem-powered monitoring tools to a global MVP release. Feel free to use them and share your feedback with us.
We help researchers accelerate pipelines with Catalyst and find insights with Alchemy throughout the whole R&D process. These ecosystem tools are available for you to train, share, and collaborate more effectively.
Afterword
Our goal is to build a foundation for fundamental breakthroughs in deep learning and reinforcement learning. Nevertheless, it is really hard to build an Open Source Ecosystem with only a few motivated people. If you are a company that is deeply committed to using open source technologies in deep learning and want to support our initiative, feel free to write to us at catalyst.team.core@gmail.com. For details about the Ecosystem, check our vision and manifesto.