These are the lesser-known ways to classify a Machine Learning algorithm! In previous posts I’ve covered how to state your problem and some ways to classify an ML system; one way still worth mentioning is to classify a system by whether or not it can learn incrementally from a stream of incoming data.
Online vs Batch Learning
In Batch Learning (BL) a system cannot learn incrementally: it must be fed all the training data at once in order to generate the best model. This is usually very time-consuming and demanding on computing resources (CPU, memory, disk space, etc.), so it is typically done offline. The system is first trained on all the available data, then launched into production, where it runs without learning anymore; it just applies what it has learned. This is why it is also called offline learning.
If the data is updated and you want your system to learn about these new examples, you need to train a new version of the system from scratch on the full dataset and then replace the old system with the new one.
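This retrain-and-replace workflow can be sketched in a few lines of Python. The "model" here is a deliberately trivial mean predictor, purely illustrative, just to show the shape of the workflow:

```python
# Batch-learning workflow sketch: when new data arrives, the model is
# retrained from scratch on the FULL dataset and the old model replaced.
# `train` is an illustrative placeholder, not a real library function.

def train(full_dataset):
    """Batch training: needs the entire dataset, returns a new model."""
    return sum(full_dataset) / len(full_dataset)

dataset = [10.0, 12.0, 14.0]
model = train(dataset)      # initial offline training

# New examples arrive: we cannot update `model` in place, so we
# retrain on old + new data and swap in the new version.
dataset += [20.0, 24.0]
model = train(dataset)      # full retrain from scratch
```

The cost is that every update pays for the whole dataset again, which is exactly why batch training is usually scheduled offline.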
On the other hand, Online Learning (OL) allows you to train your system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches. Since each learning step is fast and cheap, the system can learn about new data very quickly.
In contrast to BL, this type of system is great when data arrives in a continuous flow (e.g. stock prices) and the system needs to adapt to change rapidly and autonomously. It is also a great option if you are limited in computing resources: once the system has learned from the new data instances, it no longer needs them and you can discard them. Finally, online learning can handle huge datasets that will not fit in one computer’s main memory.
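A minimal sketch of this idea, assuming a linear model y ≈ w·x + b updated with stochastic gradient descent; the learning rate and mini-batch size are illustrative choices:

```python
# Online-learning sketch: the model is updated one mini-batch at a time,
# and each mini-batch can be discarded right after its update, so memory
# use stays constant no matter how long the stream runs.

def sgd_step(w, b, batch, lr=0.3):
    """One incremental update from a mini-batch of (x, y) pairs."""
    n = len(batch)
    errors = [w * x + b - y for x, y in batch]
    grad_w = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / n
    grad_b = sum(2 * e for e in errors) / n
    return w - lr * grad_w, b - lr * grad_b

# Simulated incoming stream following y = 2x, in pseudo-random order.
stream = [((i * 37 % 100) / 100, 2 * ((i * 37 % 100) / 100))
          for i in range(1000)]

w, b = 0.0, 0.0
for i in range(0, len(stream), 10):          # mini-batches of 10
    w, b = sgd_step(w, b, stream[i:i + 10])
    # the slice stream[i:i+10] is no longer needed after this update

# w should drift towards 2 and b towards 0 as the stream is consumed
```

The key property is in the loop: the model parameters carry all the accumulated knowledge, so old instances never need to be revisited.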
Instance-based vs Model-based Learning
Another way to categorise an ML system is by how it generalises. Most ML systems are expected to make predictions: given a number of training examples, the system should be able to generalise to examples it has never seen before.
There are two main approaches to generalisation: instance-based and model-based learning.
An instance-based system generalises by learning the examples by heart and then using a similarity measure: instead of performing explicit generalisation, it compares new problem instances with the instances seen during training. Such a system typically does what is known as lazy learning, absorbing the training data instances and using them directly for inference. Some examples of instance-based algorithms are:
- K-nearest neighbor algorithm;
- Kernel machines;
- Radial basis function network.
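To make the idea concrete, here is a toy 1-nearest-neighbour classifier in pure Python (the function names and data are illustrative). Notice that "training" is just storing the examples; all the work happens at prediction time via the similarity measure:

```python
# Instance-based (lazy) learning sketch: classify a query point by the
# label of its closest stored training instance, using squared
# Euclidean distance as the similarity measure.

def predict_1nn(training_set, query):
    """Return the label of the stored instance closest to `query`."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(training_set, key=lambda pair: sq_dist(pair[0], query))
    return label

training_set = [
    ((1.0, 1.0), "red"),
    ((1.2, 0.8), "red"),
    ((5.0, 5.0), "blue"),
    ((5.3, 4.7), "blue"),
]

print(predict_1nn(training_set, (1.1, 0.9)))  # → red
print(predict_1nn(training_set, (4.9, 5.2)))  # → blue
```

The price of this simplicity is paid at inference: every prediction scans the stored instances, so prediction gets slower as the training set grows.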
In model-based learning the system creates a model of the environment in which it exists (using the available training data). Then it can use that same model to make predictions.
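A minimal model-based sketch, assuming a simple linear model fitted with closed-form least squares (the helper names are illustrative): once the model's parameters are learned, the training examples themselves are no longer consulted at prediction time.

```python
# Model-based learning sketch: fit the parameters of a linear model
# y = slope * x + intercept from the training data, then predict unseen
# inputs from the fitted parameters alone.

def fit_line(points):
    """Closed-form least-squares fit over a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return slope, my - slope * mx

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

model = fit_line([(1, 3), (2, 5), (3, 7)])   # data follows y = 2x + 1
print(predict(model, 10))                    # generalises to unseen x → 21.0
```

This is the opposite trade-off from instance-based learning: training does the heavy lifting once, and predictions afterwards are cheap.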
If you liked this post, follow me for more publications and, please, don’t forget to give it some applause!