Anomaly detection with supervised learning

Sean Gahagan
2 min read · Oct 20, 2022


My last note looked at a specific type of unsupervised learning called K-means clustering. This week, we’ll look at how a unique type of supervised learning can be used for anomaly detection.

What can anomaly detection be used for?

Anomaly detection can be used for identifying faulty parts in manufacturing, flagging instances of fraud in financial services, or discerning inauthentic user behavior on websites. In other words, it can be used to determine when something is “not normal”, even when you may not know what “not normal” looks like yet.

How does anomaly detection work?

To develop an anomaly detection algorithm, you set up your data a bit differently than you would for typical supervised learning. Instead of mixing positive and negative examples in your training set, you'll put all of the positive examples (i.e., anomalies) in your cross-validation and test sets, while including only negative examples (i.e., normal examples) in your training set.
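As a sketch of that split (the dataset, feature count, and all sizes here are hypothetical, assuming NumPy arrays of feature vectors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 10,000 "normal" examples and 20 anomalies,
# each described by 2 features.
normal = rng.normal(loc=0.0, scale=1.0, size=(10_000, 2))
anomalies = rng.normal(loc=6.0, scale=1.0, size=(20, 2))

# Training set: normal examples only.
train = normal[:6_000]

# Cross-validation and test sets: the remaining normal examples,
# with the known anomalies split between them.
cv = np.vstack([normal[6_000:8_000], anomalies[:10]])
cv_labels = np.concatenate([np.zeros(2_000), np.ones(10)])

test = np.vstack([normal[8_000:], anomalies[10:]])
test_labels = np.concatenate([np.zeros(2_000), np.ones(10)])
```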

In building your data set, you’ll want to choose features that might indicate an anomaly (e.g., how long a user spends on each page of a website). It’s especially helpful if you can choose features that have very large or very small values in the event of an anomaly.

Then you'll train your model by fitting the parameters of a probability density function (e.g., a Gaussian distribution) to your training set of negative examples (i.e., non-anomalous "normal" examples).
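A minimal sketch of that fitting step, assuming each feature is modeled with its own independent Gaussian (the training data here is made up for illustration):

```python
import numpy as np

def fit_gaussian(X):
    """Estimate a per-feature Gaussian: the mean and variance of each column."""
    mu = X.mean(axis=0)
    var = X.var(axis=0)
    return mu, var

# Fit to hypothetical training data containing "normal" examples only.
rng = np.random.default_rng(1)
X_train = rng.normal(loc=5.0, scale=2.0, size=(5_000, 3))
mu, var = fit_gaussian(X_train)
```

With 5,000 samples, the estimated means should land near 5.0 and the variances near 4.0.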

Then, for a new example, you’ll use the model to calculate the probability of that example occurring within your distribution of “normal” examples.
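That probability calculation might look like this, continuing the independent per-feature Gaussian assumption (`mu` and `var` would come from the fitting step; the example points are hypothetical):

```python
import numpy as np

def gaussian_prob(X, mu, var):
    """Density of each example under independent per-feature Gaussians."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    exponent = -((X - mu) ** 2) / (2.0 * var)
    # Multiply the per-feature densities together.
    return np.prod(coef * np.exp(exponent), axis=-1)

# A point near the mean of the "normal" distribution gets a high
# density; a point far away gets a very low one.
mu = np.array([0.0, 0.0])
var = np.array([1.0, 1.0])
p_near = gaussian_prob(np.array([0.1, -0.1]), mu, var)
p_far = gaussian_prob(np.array([5.0, 5.0]), mu, var)
```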

If the probability is below a certain level (represented by the variable ε), then your model will classify the example as an anomaly. You can adjust ε based on how your model performs on the cross-validation set.
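One common way to pick ε, assuming you've scored each cross-validation example with its density p(x) and labeled anomalies as 1, is to scan candidate thresholds and keep the one that gives the best F1 score (this helper and its names are illustrative):

```python
import numpy as np

def best_epsilon(p_cv, y_cv):
    """Scan thresholds over the CV densities; return (eps, F1) maximizing F1.

    p_cv: density p(x) for each cross-validation example.
    y_cv: 1 for anomaly, 0 for normal.
    """
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_cv.min(), p_cv.max(), 1_000):
        preds = p_cv < eps  # flag low-probability examples as anomalies
        tp = np.sum(preds & (y_cv == 1))
        fp = np.sum(preds & (y_cv == 0))
        fn = np.sum(~preds & (y_cv == 1))
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1
```

F1 is a better yardstick than raw accuracy here because anomalies are rare: a model that flags nothing would still be "accurate" most of the time.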

You can also look at your false negatives to help you ideate new features that might help you better identify anomalies.

Up Next

My next note will look at an example of how machine learning can be used to make content-based recommendations for people.

Past Notes in this Series

  1. Towards a High-Level Understanding of Machine Learning
  2. Building Intuition around Supervised Machine Learning with Gradient Descent
  3. Helping Supervised Learning Models Learn Better & Faster
  4. The Sigmoid function as a conceptual introduction to activation and hypothesis functions
  5. An Introduction to Classification Models
  6. Overfitting, and avoiding it with regularization
  7. An Introduction to Neural Networks
  8. Classification Models using Neural Networks
  9. An Introduction to K-Means Clustering
