Continuum: A Data Loader for Continual Learning
Continual Learning, Incremental Learning, Lifelong Learning, and Online Learning are closely related research fields that aim at learning an ever-growing amount of knowledge while forgetting as little as possible (more details here).
One implementation difficulty in these fields is creating the stream of data that feeds new knowledge to an algorithm. **Continuum** aims to make this simple and to spare researchers data loader problems. The goal is to stop wasting time reproducing continual learning settings and to start working on the algorithm directly.
Continuum provides a range of existing scenarios. It is also designed to make it easy to create your own dream scenarios.
Here is a short presentation:
Continuum is available on PyPI and can be installed with:
pip3 install continuum
The Continuum project is also available here.
To create continual learning scenarios, Continuum decomposes data management into three levels of data structures: Datasets, Tasksets, and Scenarios.
- Datasets: Datasets are the raw data that will be used to create tasks and scenarios.
- Tasksets: A taskset contains the data specific to one task. The data are selected from the original dataset and possibly transformed.
- Scenarios: A scenario is a sequence of tasks. It makes up the curriculum of learning experiences fed to the algorithm.
A simple example helps illustrate how Continuum scenarios are organized. For more code snippets, see the Continuum documentation.
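The three levels can be illustrated without any dependency. The sketch below is a plain-Python toy, not Continuum's actual API: the helper names `make_toy_dataset`, `make_taskset`, and `class_incremental_scenario` are hypothetical and only mirror the Dataset, Taskset, and Scenario roles described above.

```python
# Toy illustration of the Dataset -> Taskset -> Scenario hierarchy.
# All names here are hypothetical stand-ins, NOT Continuum classes.

def make_toy_dataset():
    """Dataset level: raw (sample, label) pairs, 6 classes with 2 samples each."""
    return [(f"sample_{label}_{i}", label) for label in range(6) for i in range(2)]


def make_taskset(dataset, classes):
    """Taskset level: the subset of the dataset that belongs to one task."""
    wanted = set(classes)
    return [(x, y) for x, y in dataset if y in wanted]


def class_incremental_scenario(dataset, increment):
    """Scenario level: a sequence of tasksets, each adding `increment` new classes."""
    all_classes = sorted({y for _, y in dataset})
    return [
        make_taskset(dataset, all_classes[start:start + increment])
        for start in range(0, len(all_classes), increment)
    ]


dataset = make_toy_dataset()
scenario = class_incremental_scenario(dataset, increment=2)

# A continual learning algorithm would train on each taskset in turn.
for task_id, taskset in enumerate(scenario):
    classes = sorted({y for _, y in taskset})
    print(f"task {task_id}: {len(taskset)} samples, classes {classes}")
```

In the library itself the same roles are played by its own dataset, taskset, and scenario objects; the documentation shows the real calls.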
Main Supported Scenarios:
Continuum supports various types of scenarios and covers most of those found in the continual learning literature.
- Class Incremental scenarios (similar to the disjoint/new classes/split scenarios from the literature)
- Transformation Incremental scenarios, e.g. Permutation MNIST or Rotation MNIST
- More scenarios are described in the Continuum documentation
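To make the transformation-incremental idea concrete, here is a minimal sketch of a permutation scenario in plain Python. It is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, samples are flat lists of features, and the first task is assumed to keep the original feature order.

```python
import random


def make_permutation(n_features, seed):
    """Return a fixed, reproducible permutation of the feature indices."""
    indices = list(range(n_features))
    random.Random(seed).shuffle(indices)
    return indices


def permute(sample, permutation):
    """Reorder the features of one sample according to `permutation`."""
    return [sample[i] for i in permutation]


def permutation_scenario(dataset, nb_tasks, base_seed=0):
    """Build `nb_tasks` tasks over the same data, one fixed permutation per task."""
    n_features = len(dataset[0][0])
    # Assumption: task 0 keeps the identity ordering; later tasks are shuffled.
    permutations = [list(range(n_features))] + [
        make_permutation(n_features, base_seed + t) for t in range(1, nb_tasks)
    ]
    return [[(permute(x, perm), y) for x, y in dataset] for perm in permutations]


# Two toy "images" flattened to 4 features each, with labels 0 and 1.
toy_dataset = [([0, 1, 2, 3], 0), ([4, 5, 6, 7], 1)]
tasks = permutation_scenario(toy_dataset, nb_tasks=3)

for task_id, task in enumerate(tasks):
    print(f"task {task_id}: first sample = {task[0][0]}, label = {task[0][1]}")
```

Unlike the class-incremental case, every task here contains all classes; only the input transformation changes from one task to the next.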
Continuum supports all the basic datasets from torchvision.datasets (MNIST, CIFAR10, CIFAR100) as well as larger datasets such as ImageNet or CORe50. We also provide tools to create new datasets manually. For example, the Fellowship class makes it possible to concatenate several datasets into one for specific scenarios. You can find a complete list of supported datasets here.
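Concatenating datasets raises one practical question: what happens when two datasets reuse the same label values? The sketch below shows one plausible approach in plain Python, shifting the labels of each dataset so that classes do not collide. The function `concatenate_datasets` and its `update_labels` flag are hypothetical names for this illustration, and the library's own behavior may differ.

```python
def concatenate_datasets(datasets, update_labels=True):
    """Merge several (sample, label) datasets into one.

    With update_labels=True, labels of each dataset are shifted past the
    labels of the previous ones, so classes from different datasets stay
    distinct. This is an assumption made for the sketch.
    """
    merged = []
    offset = 0
    for dataset in datasets:
        for x, y in dataset:
            merged.append((x, y + offset if update_labels else y))
        if update_labels and dataset:
            # The next dataset starts after the highest label seen so far.
            offset += max(y for _, y in dataset) + 1
    return merged


digits = [("digit_a", 0), ("digit_b", 1)]
letters = [("letter_a", 0), ("letter_b", 1), ("letter_c", 2)]

merged = concatenate_datasets([digits, letters])
print([y for _, y in merged])
```

With `update_labels=False` the original labels are kept, which suits scenarios where the datasets share the same classes and only the domain changes.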
Continuum is an open-source project which aims at simplifying data management for continual learning algorithms. It is designed to be easily adaptable to specific needs. If you have an idea for a new scenario that should be added, don’t hesitate to open an issue or a pull request on the Continuum GitHub repository.
Continuum is made to save you time, reduce the amount of code in your project, and spare you development problems! We hope you will enjoy it :)
Arthur Douillard, PhD Student @ Sorbonne + Research Scientist @ Heuritech
Timothée LESORT, Postdoctoral Researcher @ MILA