Incremental learning refers to learning from streaming data, which arrive over time, with limited memory resources and, ideally, without sacrificing model accuracy.
This setting fits different application scenarios such as learning in changing environments, model customization, or lifelong learning, and it offers an elegant scheme for big data processing by means of its sequential treatment.
Classic Machine Learning vs Incremental Learning:
Classic machine learning methods offer particularly powerful technologies to infer structural information from given digital data; still, the majority of current applications are restricted to the classical batch setting: data are given prior to training, hence meta-parameter optimisation and model selection can be based on the full data set, and training can rely on the assumption that the data and its underlying structure are static.
Incremental learning, in contrast, refers to the situation of continuous model adaptation based on a constantly arriving data stream. This setting is present whenever systems act autonomously, such as in autonomous robotics or driving. Further, online learning becomes necessary in interactive scenarios where training examples are provided based on human feedback over time. Many digital data sets can become so big that they are de facto dealt with as a data stream, i.e. in one incremental pass over the full data set. Incremental learning investigates how to learn in such streaming settings.
Challenges in Incremental Learning:
Challenge 1: Online model parameter adaptation: In many applications, data sets are not available in advance but arrive over time, and the task is to infer a reliable model after each increment. Incremental learning algorithms use training samples one by one, without knowing their number in advance, to optimise their internal cost function.
Cost function optimization can be done by:
- Fully online approaches that adapt their internal model immediately upon processing of a single sample.
- Mini-batch techniques that accumulate a small number of samples before each update.
- Batch learning approaches, at the other end of the range, that store all samples internally.
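The difference between fully online and mini-batch updates can be sketched in a few lines. The following is a minimal illustration, not a method from the source: it assumes a one-dimensional linear model trained by gradient descent on the squared error, and the names fit_stream, sgd_step and make_stream, the learning rate, and the synthetic target y = 2x + 1 are all hypothetical choices made for the example.

```python
import random

def sgd_step(w, b, batch, lr):
    """One gradient step on the mean squared error of `batch` for y = w*x + b."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return w - lr * grad_w, b - lr * grad_b

def fit_stream(stream, batch_size=1, lr=0.05):
    """Consume `stream` once, updating after every `batch_size` samples.

    batch_size=1 is the fully online case; larger values give mini-batches.
    The length of the stream is never queried in advance.
    """
    w, b, buffer = 0.0, 0.0, []
    for sample in stream:
        buffer.append(sample)
        if len(buffer) == batch_size:
            w, b = sgd_step(w, b, buffer, lr)
            buffer = []  # samples are discarded once they have been used
    return w, b  # a final, incomplete buffer is simply dropped

def make_stream(n, seed=0):
    """Hypothetical noisy stream drawn from y = 2x + 1 (assumption for the demo)."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)

w_online, b_online = fit_stream(make_stream(5000), batch_size=1)
w_mini, b_mini = fit_stream(make_stream(5000), batch_size=16)
print(w_online, b_online, w_mini, b_mini)
```

Both runs see each sample exactly once and keep at most batch_size samples in memory, which is what separates them from a batch learner that stores the whole data set.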
Challenge 2: Concept drift: Changes in the data distribution over time are commonly referred to as concept drift.
Different types of concept drift can be distinguished:
- Changes in the input distribution only, referred to as virtual concept drift or covariate shift.
- Changes in the underlying functionality itself, p(y|x), referred to as real concept drift.
Real concept drift is problematic since it leads to conflicts in the classification, for example when a new but visually similar class appears in the data: this will in any event have an impact on classification performance until the model can be re-adapted accordingly.
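The two drift types can be contrasted on a toy stream. This is only a sketch under strong assumptions: inputs are one-dimensional, the labelling rules label_before and label_after_real_drift are invented for the example, and the fixed "model" happens to match the pre-drift rule exactly. With an imperfect model, a shift in p(x) alone can also change the measured accuracy.

```python
import random

def label_before(x):
    """Hypothetical labelling rule in force before any drift."""
    return int(x > 0.0)

def label_after_real_drift(x):
    """Same inputs, but the decision boundary moved: p(y|x) has changed."""
    return int(x > 0.5)

def accuracy(model, xs, label_fn):
    return sum(model(x) == label_fn(x) for x in xs) / len(xs)

rng = random.Random(0)
model = label_before  # stands in for a classifier trained before the drift

xs_before = [rng.gauss(0.0, 1.0) for _ in range(2000)]
xs_virtual = [rng.gauss(1.0, 1.0) for _ in range(2000)]  # only p(x) shifted

acc_before = accuracy(model, xs_before, label_before)
acc_virtual = accuracy(model, xs_virtual, label_before)            # virtual drift
acc_real = accuracy(model, xs_before, label_after_real_drift)      # real drift
print(acc_before, acc_virtual, acc_real)
```

Under real drift, every input between the old and the new boundary is now labelled differently, so the unchanged model makes errors there until it is re-adapted.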
Challenge 3: The stability-plasticity dilemma: The question is when and how to adapt the current model.
- A quick update enables a rapid adaptation according to new information, but old information is forgotten equally quickly.
- On the other hand, adaptation can be performed slowly, in which case old information is retained longer but the reactivity of the system is decreased.
The dilemma behind this trade-off is usually denoted the stability-plasticity dilemma.
One approach to deal with the stability-plasticity dilemma consists in enhancing the learning rules with explicit meta-strategies that determine when and how to learn.
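The trade-off is easy to see with a single adaptation rate. The sketch below assumes the simplest possible model, an exponential moving average that tracks the mean of a scalar stream; the function track, the two rates, and the abrupt change from 0 to 1 are illustrative assumptions.

```python
def track(stream, rate):
    """Exponential moving average: estimate += rate * (x - estimate)."""
    estimate, history = 0.0, []
    for x in stream:
        estimate += rate * (x - estimate)
        history.append(estimate)
    return history

# The stream value jumps from 0 to 1 after 100 samples (an abrupt change).
stream = [0.0] * 100 + [1.0] * 100

fast = track(stream, rate=0.5)    # plastic: adapts quickly, forgets quickly
slow = track(stream, rate=0.02)   # stable: retains old information longer

# Estimates ten samples after the change.
print(fast[109], slow[109])
```

After ten new samples, the old value still carries a weight of 0.5^10 in the fast tracker and 0.98^10 in the slow one, which is the same quantity seen from the plasticity side and from the stability side.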
Challenge 4: Adaptive model complexity and meta-parameters:
For incremental learning, model complexity must be variable, since it is impossible to estimate the model complexity in advance if the data are unknown.
- Depending on the occurrence of concept drift events, an increased model complexity might become necessary.
- On the other hand, the overall model complexity is usually bounded from above by the limitation of the available resources. This requires the intelligent reallocation of resources whenever this limit is reached.
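One way to picture growing complexity under a hard resource limit is a prototype set that adds a prototype for uncovered samples and merges prototypes once a budget is exceeded. The following is a sketch of that idea only, not an algorithm from the source: the function update_prototypes, the one-dimensional inputs, and the parameters radius and budget are assumptions, and merging the two closest prototypes is just one possible reallocation strategy.

```python
def update_prototypes(prototypes, x, radius=1.0, budget=3):
    """Insert-or-adapt step for a 1-D prototype set with a hard size limit.

    `prototypes` is a list of [position, count] pairs. `radius` and `budget`
    are illustrative parameters chosen for this example.
    """
    if prototypes:
        nearest = min(prototypes, key=lambda p: abs(p[0] - x))
        if abs(nearest[0] - x) <= radius:
            nearest[1] += 1
            nearest[0] += (x - nearest[0]) / nearest[1]  # running mean
            return prototypes
    prototypes.append([x, 1])  # grow: the sample is not covered yet
    if len(prototypes) > budget:
        # Reallocate: merge the two closest prototypes into their weighted mean.
        prototypes.sort(key=lambda p: p[0])
        i = min(range(len(prototypes) - 1),
                key=lambda k: prototypes[k + 1][0] - prototypes[k][0])
        a, b = prototypes[i], prototypes[i + 1]
        n = a[1] + b[1]
        prototypes[i:i + 2] = [[(a[0] * a[1] + b[0] * b[1]) / n, n]]
    return prototypes

protos = []
for x in [0.0, 0.2, 5.0, 5.1, 12.0, 30.0]:
    update_prototypes(protos, x)
print(protos)
```

The model grows from one to three prototypes as new regions of the input space appear; the sample at 30.0 would require a fourth, so the two closest prototypes are merged and the budget is respected.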
Challenge 5: Efficient memory models: Due to their limited resources, incremental learning models have to store the information provided by the observed data in compact form.
This can be done via suitable system invariants (such as the classification error for explicit drift detection models), via the model parameters in implicit form (such as prototypes for distance-based models), or via an explicit memory model.
- Some machine learning models offer a seamless transfer of model parameters and memory models, such as prototype- or exemplar-based models, which store the information in the form of typical examples.
- Explicit memory models can rely on a finite window of characteristic training examples, or represent the memory in the form of a parametric model.
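A finite window of recent examples is the simplest explicit memory model. The sketch below assumes one-dimensional inputs and a 1-nearest-neighbour lookup over the window, so that the stored exemplars are at the same time the memory and the model; the class name WindowMemory and the window size are hypothetical.

```python
from collections import deque

class WindowMemory:
    """Keeps only the `size` most recent samples; older ones are dropped."""

    def __init__(self, size):
        # deque(maxlen=...) discards the oldest item when a new one is appended
        self.samples = deque(maxlen=size)

    def add(self, x, y):
        self.samples.append((x, y))

    def predict(self, x):
        """1-nearest-neighbour lookup over the stored window (1-D inputs)."""
        nearest = min(self.samples, key=lambda s: abs(s[0] - x))
        return nearest[1]

memory = WindowMemory(size=3)
for x, y in [(0.0, "a"), (1.0, "a"), (5.0, "b"), (6.0, "b")]:
    memory.add(x, y)

print(list(memory.samples))
print(memory.predict(0.2), memory.predict(5.4))
```

Memory use is fixed by the window size regardless of how long the stream runs. A window of the most recent samples is only one choice; the text also allows windows of characteristic examples or a parametric memory.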
Challenge 6: Model benchmarking: There exist two fundamentally different possibilities to assess the performance of incremental learning algorithms:
- Incremental -vs- non-incremental: In particular in the absence of concept drift, the aim of learning is to infer the stationary distribution p(y|x) for typical data characterised by p(x). This setting occurs, e.g., whenever incremental algorithms are used for big data sets, where they compete with often parallelized batch algorithms. In such settings, the method of choice is to evaluate the classification accuracy of the final model on a test set, or within a cross-validation.
- Incremental -vs- incremental: When facing concept drift, different cost functions can be of interest. Under virtual concept drift, the aim is to infer a stationary model p(y|x) while the probability p(x) of the inputs drifts. In such settings, the robustness of the model when evaluated on test data that follow a possibly skewed distribution is of interest.
Note: It has been shown, as an example, that incremental clustering algorithms cannot reach the same accuracy as batch versions if restricted in terms of their resources.
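The first evaluation mode, comparing the final incremental model with a batch model on a held-out test set, can be sketched as follows. Everything here is an assumption made for illustration: the synthetic two-class data from make_data, the nearest-class-mean classifier, and the function names. This particular model is a special case in which a single pass with running means recovers the batch solution up to rounding, so it does not contradict the note above about resource-restricted clustering.

```python
import random

def make_data(n, seed):
    """Hypothetical stationary data: class 0 centred at 0, class 1 at 3."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append((rng.gauss(3.0 * y, 1.0), y))
    return data

def fit_batch(data):
    """Class means computed with the full data set in memory."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        means[label] = sum(xs) / len(xs)
    return means

def fit_incremental(stream):
    """Class means updated one sample at a time; no sample is stored."""
    means, counts = {}, {}
    for x, y in stream:
        counts[y] = counts.get(y, 0) + 1
        old = means.get(y, 0.0)
        means[y] = old + (x - old) / counts[y]
    return means

def accuracy(means, test):
    """Fraction of test samples assigned to the class with the nearest mean."""
    def predict(x):
        return min(means, key=lambda label: abs(x - means[label]))
    return sum(predict(x) == y for x, y in test) / len(test)

train, test = make_data(2000, seed=1), make_data(1000, seed=2)
acc_batch = accuracy(fit_batch(train), test)
acc_incremental = accuracy(fit_incremental(iter(train)), test)
print(acc_batch, acc_incremental)
```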
Applications of Incremental Learning:
Data analytics and big data processing: There is an increasing interest in single-pass limited-memory models which enable a treatment of big data within a streaming setting.
Robotics: Autonomous robotics and human-machine interaction are inherently incremental, since they are open-ended, and data arrive as a stream of signals with possibly strong drift. Incremental learning paradigms have been designed in the realm of autonomous control, service robotics, computer vision, and autonomous driving.
Image processing: Image and video data are often gathered in a streaming fashion, lending themselves to incremental learning.
Automated annotation: One important process consists in the automated annotation or tagging of digital data. This requires incremental learning approaches as soon as data arrive over time.
Outlier detection: Automated surveillance of technical systems equipped with sensors constitutes an important task in different domains, starting from process monitoring and fault diagnosis in technical systems up to cyber-security.
Source: "Incremental learning algorithms and applications" by Alexander Gepperth and Barbara Hammer.