Online ML Algorithms

Chituyi
Sep 8, 2023


Why Not Online Algorithms?

Online Algorithms

Online machine learning is a type of machine learning where the model is updated incrementally as new data becomes available. This contrasts with batch machine learning, which is still the norm, where the model is trained once on a fixed dataset.

I always think about SGD whenever online or incremental learning comes up. Quite frankly, the industry is still shy about exploiting these algorithms because they are risky in production. However, I think the trick is to combine batch learning (a base model that keeps outputs close to what is expected) with online learning (learning from new streams while taking past outputs into account when calculating the final output). Doing so can potentially reduce the competitive ratio, that is, the gap between the online algorithm's performance and that of the best offline solution chosen with full hindsight. Keep a human in the loop as well, to monitor the model and penalize it when needed.
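One way to picture the "batch model regulates the online model" idea is a simple guard rail: the online prediction is accepted only within a band around the batch baseline. This is a minimal sketch of one possible reading of that idea, not a description of the tool linked below. The function name `guarded_prediction` and the `max_deviation` parameter are hypothetical.

```python
def guarded_prediction(base_pred, online_pred, max_deviation=0.2):
    """Clamp the online model's output to a band around the batch baseline.

    base_pred     -- prediction from the batch-trained base model
    online_pred   -- prediction from the incrementally updated model
    max_deviation -- hypothetical knob: allowed relative drift from the baseline
    """
    band = abs(base_pred) * max_deviation
    lower, upper = base_pred - band, base_pred + band
    # The online model may move the output, but only inside [lower, upper].
    return min(max(online_pred, lower), upper)


# The online model wants 150, but the baseline of 100 caps it at 120.
price = guarded_prediction(base_pred=100.0, online_pred=150.0)
```

Predictions that hit the edge of the band are a natural signal to surface to the human in the loop.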

Consider a price-discrimination strategy for a water rationing plan: given features such as day, time of day, area, and weather, how much should you charge? Intelligence of this kind, drawn from business data, can give your business a competitive edge. Timely access to resources improves customer loyalty and can also help with customer acquisition.

Play with the tool here! 🔛

https://psag.onrender.com/prediction

Assumption: The data is being streamed into the classifier!

Early online machine learning algorithms include the Winnow algorithm, which adjusts feature weights by a constant multiplicative factor and initializes all weights to 1, and the Adaline algorithm, which adjusts feature weights with the delta rule (the least-mean-squares update that backpropagation later generalized to multi-layer networks) and typically starts from small random weights.
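The two update rules can be sketched in a few lines each. This is a simplified illustration under stated assumptions: Winnow here uses binary features, a threshold of half the number of features, and a factor `alpha` of 2, while Adaline uses a fixed learning rate `lr`. These values are common textbook choices, not something specified in this article.

```python
def winnow_update(weights, x, y, alpha=2.0):
    """One Winnow step. x is a list of 0/1 features, y is 0 or 1."""
    threshold = len(weights) / 2  # assumed threshold; other choices work too
    score = sum(w for w, xi in zip(weights, x) if xi)
    y_hat = 1 if score >= threshold else 0
    if y_hat == y:
        return weights  # no mistake, no change
    # Promote active features on a false negative, demote on a false positive.
    factor = alpha if y == 1 else 1.0 / alpha
    return [w * factor if xi else w for w, xi in zip(weights, x)]


def adaline_update(weights, x, y, lr=0.1):
    """One delta-rule (LMS) step toward a real-valued target y."""
    output = sum(w * xi for w, xi in zip(weights, x))
    error = y - output
    # Additive update proportional to the error and to each input.
    return [w + lr * error * xi for w, xi in zip(weights, x)]


w_winnow = winnow_update([1.0, 1.0, 1.0, 1.0], x=[1, 0, 0, 0], y=1)
w_adaline = adaline_update([0.0, 0.0], x=[1.0, 2.0], y=1.0)
```

The contrast to notice is multiplicative versus additive: Winnow rescales only the weights of active features, while Adaline nudges every weight in proportion to the error.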

Some of the most common online machine learning algorithms include:

  • Stochastic gradient descent (SGD): SGD is a general-purpose optimization algorithm that can be used for both online and batch machine learning. It is a simple but effective algorithm that is often used for training neural networks.
  • Adaptive boosting (AdaBoost): AdaBoost is an ensemble learning algorithm that combines multiple weak learners into a strong learner. The classic formulation is a batch algorithm for classification problems, but online boosting variants exist for streaming data.
  • Passive-Aggressive: The algorithm maintains a linear hypothesis (or prediction model) and an aggressiveness parameter that caps how far a single example can move the weights.
  • Exponential weighted moving average (EWMA): EWMA is a smoothing technique that can be used to track the mean of a time series. It is a simple but effective online algorithm that is often used for anomaly detection.
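To make the EWMA bullet concrete, here is a minimal sketch of an EWMA-based anomaly detector that processes one value at a time. The class name `EwmaDetector`, the smoothing factor `alpha`, and the three-standard-deviation `threshold` are illustrative assumptions and would need tuning on real data.

```python
class EwmaDetector:
    """Tracks an exponentially weighted mean and variance of a stream."""

    def __init__(self, alpha=0.2, threshold=3.0):
        self.alpha = alpha          # weight given to the newest observation
        self.threshold = threshold  # how many std devs count as anomalous
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Return True if x looks anomalous, then fold x into the stats."""
        if self.mean is None:
            self.mean = x  # first observation seeds the mean
            return False
        diff = x - self.mean
        std = self.var ** 0.5
        is_anomaly = std > 0 and abs(diff) > self.threshold * std
        # Incremental exponentially weighted mean and variance updates.
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return is_anomaly


detector = EwmaDetector()
flags = [detector.update(v) for v in [10, 11, 9, 10, 11, 9, 10, 50]]
```

In this run the final reading of 50 is flagged, because it sits far outside the band learned from the earlier values. Note that the anomaly is still folded into the statistics afterwards; whether to do that is a design choice.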

Training a Passive-Aggressive Classifier:

Why the passive approach to learning? The model made the correct prediction with a sufficient margin, so there is no need to change any weights.

Why the aggressive approach to learning? The model made a wrong prediction, or a correct one with too small a margin, and needs to be penalized. The weights change by just enough to fix that example.
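The passive and aggressive cases fall out of a single update rule. Below is a minimal sketch of the PA-I variant for binary labels in {-1, +1}, assuming a linear model without a bias term. The function name `pa_update` is made up for this example, and `C` is the aggressiveness cap.

```python
def pa_update(weights, x, y, C=1.0):
    """One Passive-Aggressive (PA-I) step. y must be +1 or -1."""
    margin = y * sum(w * xi for w, xi in zip(weights, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss
    if loss == 0.0:
        return weights  # passive: correct with enough margin, no change
    norm_sq = sum(xi * xi for xi in x)
    if norm_sq == 0.0:
        return weights  # an all-zero input carries no direction to move in
    # Aggressive: step just far enough to fix this example, capped by C.
    tau = min(C, loss / norm_sq)
    return [w + tau * y * xi for w, xi in zip(weights, x)]


w = pa_update([0.0, 0.0], x=[1.0, 1.0], y=1)   # aggressive step
w_again = pa_update(w, x=[1.0, 1.0], y=1)      # now passive
```

After the first call the example is classified with a margin of exactly 1, so the second call leaves the weights untouched.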

Productionizing:

Online machine learning algorithms are used in a variety of production applications, including:

  • Recommendation systems: Used to recommend products, movies, and other items to users.
  • Spam filtering: Used to filter out spam emails.
  • Fraud detection: Used to detect fraudulent transactions.
  • Robotics: Used to control robots, by learning from their interactions with the environment.
  • Anomaly detection: Used to detect anomalies in data, such as intrusions or equipment failures.

The advantages of using online machine learning algorithms include:

  • They can learn from data streams, that is, data that is generated continuously.
  • They can update the model in real time, which allows it to adapt to changes in the data.
  • They can be less prone to overfitting than batch machine learning algorithms in some settings, since each example is typically seen only once, although this is not guaranteed.

The disadvantages of using online machine learning algorithms include:

  • They can take longer to converge than batch machine learning algorithms, especially on large datasets, even though each individual update is cheap.
  • They require close monitoring in production.
  • They can be more difficult to tune than batch machine learning algorithms.
  • They can be less accurate than batch machine learning algorithms for some problems.

I think F1 is the best metric for scoring an online model. You want to understand how it fares against validated data and, above all because of “catastrophic forgetting”, just how much it can still recall. 😊
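A common way to score a streaming model is to predict on each example first and only then learn from it, updating the metric as you go. The sketch below keeps a running F1 for binary labels. The class name `RunningF1` is hypothetical, and the label pairs at the bottom are made-up values for illustration.

```python
class RunningF1:
    """Running F1 for binary labels (1 = positive, 0 = negative)."""

    def __init__(self):
        self.tp = 0
        self.fp = 0
        self.fn = 0

    def update(self, y_true, y_pred):
        if y_pred == 1 and y_true == 1:
            self.tp += 1
        elif y_pred == 1 and y_true == 0:
            self.fp += 1
        elif y_pred == 0 and y_true == 1:
            self.fn += 1
        # True negatives do not enter the F1 formula.

    def score(self):
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


metric = RunningF1()
# (y_true, y_pred) pairs, as they might arrive from a stream.
for y_true, y_pred in [(1, 1), (0, 1), (1, 0), (1, 1), (0, 0)]:
    metric.update(y_true, y_pred)  # score the prediction before training on it
f1 = metric.score()
```

To watch for catastrophic forgetting specifically, it also helps to rescore the model periodically on a fixed, validated hold-out set and compare recall over time.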

Overall, online machine learning algorithms are a powerful tool that can be used to learn from data streams and adapt to changes in the data. However, they are not always the best choice, and it is important to consider the specific problem when choosing an algorithm.

Check out free ML projects with Code to get started here!

https://dallo7.github.io/

#MLDemocratizer!

😊🤗

