Continuum: A Data Loader for Continual Learning

Published in ContinualAI, Sep 3, 2020

Continual Learning, Incremental Learning, Lifelong Learning, and Online Learning are closely related research fields that aim to learn an ever-growing amount of knowledge while forgetting as little as possible (more details here).

One implementation difficulty in these fields is creating the stream of data that feeds new knowledge to an algorithm. **Continuum** aims to make this simple and to spare researchers data loader problems. The goal is to stop wasting time reproducing continual learning settings and to start working on the algorithm directly.

Continuum provides several existing scenarios. It is also designed to make it easy to create your own dream scenarios.

Here is a short presentation:

Installation

Continuum is available on PyPI and can be installed with:

pip3 install continuum

The Continuum project is also available here.

Organization

To create continual learning scenarios, Continuum decomposes data management into three levels of data structures: Datasets, Tasksets, and Scenarios.

Datasets: The raw data used to create tasks and scenarios.
Tasksets: A taskset contains the data specific to one task. The data are selected from the original dataset and possibly transformed.
Scenarios: A scenario is a sequence of tasks. It makes up the learning curriculum fed to the algorithm.
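To make the three levels concrete, here is a minimal plain-Python sketch of how they could fit together. It is not Continuum's code: `ToyDataset`, `ToyTaskSet`, and `ToyScenario` are hypothetical names invented for this illustration, and the samples are made-up numbers rather than images.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Sample = Tuple[float, int]  # (input, label); a toy stand-in for (image, class)


@dataclass
class ToyDataset:
    """Level 1: the raw data, here a plain list of (input, label) pairs."""
    samples: List[Sample]


@dataclass
class ToyTaskSet:
    """Level 2: the samples selected for one task, with an optional transform."""
    samples: List[Sample]
    transform: Optional[Callable[[float], float]] = None

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int) -> Sample:
        x, y = self.samples[index]
        if self.transform is not None:
            x = self.transform(x)  # applied lazily, when a sample is read
        return x, y


class ToyScenario:
    """Level 3: an ordered sequence of tasksets built from one dataset."""

    def __init__(self, dataset: ToyDataset, task_labels: Sequence[Sequence[int]]):
        # One taskset per group of labels, in the order the groups are given.
        self.tasksets = [
            ToyTaskSet([s for s in dataset.samples if s[1] in labels])
            for labels in task_labels
        ]

    def __len__(self) -> int:
        return len(self.tasksets)

    def __iter__(self):
        return iter(self.tasksets)


dataset = ToyDataset([(0.1, 0), (0.2, 1), (0.3, 2), (0.4, 3), (0.5, 0)])
scenario = ToyScenario(dataset, task_labels=[[0, 1], [2, 3]])
task_sizes = [len(taskset) for taskset in scenario]
```

Iterating over the scenario yields one taskset at a time, which is the shape a continual learning training loop needs: train on the current taskset, then move on to the next.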

Example:

Here is a simple example to help understand how Continuum scenarios are organized. For more code snippets, see the Continuum documentation.

Example for Split MNIST: 5 tasks, 2 classes per task
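The arithmetic behind this setting can be sketched in a few lines of plain Python. The snippet below does not use Continuum's API: `split_classes` and `indices_per_task` are hypothetical helpers, and `toy_labels` is a handful of made-up labels standing in for the real MNIST targets.

```python
from typing import Dict, List


def split_classes(nb_classes: int, increment: int) -> List[List[int]]:
    """Group class ids 0..nb_classes-1 into consecutive chunks of `increment`."""
    return [
        list(range(start, min(start + increment, nb_classes)))
        for start in range(0, nb_classes, increment)
    ]


def indices_per_task(labels: List[int], tasks: List[List[int]]) -> List[List[int]]:
    """For each task, list the positions of the samples whose label belongs to it."""
    class_to_task: Dict[int, int] = {
        c: t for t, classes in enumerate(tasks) for c in classes
    }
    per_task: List[List[int]] = [[] for _ in tasks]
    for index, label in enumerate(labels):
        per_task[class_to_task[label]].append(index)
    return per_task


# MNIST has 10 digit classes; 2 classes per task gives 5 tasks.
tasks = split_classes(nb_classes=10, increment=2)

# Made-up labels for illustration only (not read from the real dataset).
toy_labels = [5, 0, 4, 1, 9, 2, 1, 3]
task_indices = indices_per_task(toy_labels, tasks)
```

With these inputs, `tasks` groups the digits as (0, 1), (2, 3), (4, 5), (6, 7), (8, 9), and each entry of `task_indices` lists which samples the model would see during the corresponding task.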

Main Supported Scenarios:

Continuum supports various types of scenarios and covers most of those found in the continual learning literature.

Class Incremental scenarios (similar to the disjoint, new classes, or split scenarios from the literature)
Transformation Incremental scenarios, e.g. Permutation MNIST, Rotation MNIST
More scenarios in the Continuum documentation
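In a transformation incremental scenario the classes stay the same and only the transformation applied to the inputs changes from task to task. As an illustration of the Permutation MNIST idea, here is a small plain-Python sketch. It is not Continuum's implementation: `make_permutations` and `permute` are hypothetical helpers, the "image" is a flattened list of four values, and keeping the first task unpermuted is a choice made for this example.

```python
import random
from typing import List


def make_permutations(nb_pixels: int, nb_tasks: int, seed: int = 0) -> List[List[int]]:
    """One fixed pixel ordering per task; task 0 keeps the identity ordering."""
    rng = random.Random(seed)  # seeded so every run produces the same tasks
    permutations = [list(range(nb_pixels))]
    for _ in range(nb_tasks - 1):
        order = list(range(nb_pixels))
        rng.shuffle(order)
        permutations.append(order)
    return permutations


def permute(image: List[float], order: List[int]) -> List[float]:
    """Reorder the pixels of a flattened image according to `order`."""
    return [image[i] for i in order]


permutations = make_permutations(nb_pixels=4, nb_tasks=3)
image = [0.0, 0.25, 0.5, 1.0]

# The same image as it would appear in each of the 3 tasks.
views = [permute(image, order) for order in permutations]
```

Every task sees the same pixel values in a different but fixed order, so the labels never change while the input distribution does.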

Samples from the code in the example section: Split MNIST 5 tasks.

Samples from Rotation MNIST with 5 tasks from 0 degrees to 90.

Samples from ImageNet100, class incremental, 50 + 50

Supported Datasets:

Continuum supports all the basic datasets from torchvision.datasets (MNIST, CIFAR10, CIFAR100) as well as larger datasets such as ImageNet or CORe50. We also provide tools to create new datasets manually. For example, the Fellowship class makes it possible to concatenate several datasets into one for specific scenarios. You can find a complete list of supported datasets here.
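To give an idea of what concatenating several datasets involves, here is a rough plain-Python sketch. It does not reproduce Continuum's code: `concatenate_datasets` and its `offset_labels` parameter are hypothetical, and shifting the labels of later datasets so that classes do not collide is an assumption made for this example.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[str, int]  # (input, label); strings stand in for images


def concatenate_datasets(
    datasets: Sequence[Sequence[Sample]], offset_labels: bool = True
) -> List[Sample]:
    """Merge several datasets into one list of samples.

    With `offset_labels=True`, the labels of each dataset are shifted past
    the labels of the datasets before it, so that class ids stay unique.
    """
    merged: List[Sample] = []
    offset = 0
    for samples in datasets:
        for x, y in samples:
            merged.append((x, y + offset if offset_labels else y))
        if offset_labels and samples:
            # The next dataset starts right after the largest label seen here.
            offset += max(y for _, y in samples) + 1
    return merged


# Two made-up datasets whose labels both start at 0.
digits = [("d0", 0), ("d1", 1)]
letters = [("a", 0), ("b", 1), ("c", 2)]

merged = concatenate_datasets([digits, letters])
merged_labels = [y for _, y in merged]
```

The merged data can then be split into tasks like any single dataset, for example one task per original dataset.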

Conclusion

Continuum is an open-source project that aims to simplify data management for continual learning algorithms. It is designed to be easily adaptable to specific needs. If you have an idea for a new scenario that should be added, don’t hesitate to open an issue or a pull request on the Continuum GitHub repository.

Continuum is made to save you time, reduce the amount of code in your project, and spare you development headaches! We hope you will enjoy it :)

Arthur Douillard, PhD Student @ Sorbonne + Research Scientist @ Heuritech
Timothée LESORT, Postdoctoral Researcher @ MILA

Written by Timothee L