The CyberSift Packet Capture Parser: What do we mean by “Anomaly Analysis”?

David Vassallo
Aug 22, 2018

This article is part of a series on the CyberSift Packet Capture Parser. In this article we’ll introduce anomaly analysis in the context of cybersecurity and discuss which methods we did and did not integrate into the Packet Parser.

In essence, anomaly analysis is the ability to flag an event that deviates from pre-established normal behavior. In cybersecurity an “event” may be a Windows event log entry, a firewall connection, a user login, or an application error code. It’s very important to note that an event being “different” does not necessarily make it “bad” or malicious. Let’s take a concrete example:

Let’s say you have an algorithm monitoring the number of visits to your web server (the solid green line). Your algorithm is able to somehow “predict” what the future number of visits will be (the dotted line), within certain tolerance limits (the shaded green area). Any event that doesn’t fall within this prediction is flagged as an anomaly (the white circles). Those anomalies might be the result of a blog post going viral and hence receiving more visits than expected. Clearly, a “good” anomaly. However, they could also be the result of a disgruntled ex-employee launching a DDoS against your site. Two very plausible, but very different, explanations for the same anomaly.

Good threat hunting platforms such as CyberSift tackle this in a number of ways:

Monitor the right “metrics”.

In machine learning, the proper term is “feature engineering”, and it’s arguably the most important step in designing an algorithm. In other words, which features should we feed into our algorithms in order to get meaningful alerts? In our example above, our only feature was “number of visits”. But our model can arguably become more robust if we add “number of unique visiting IP addresses” and “size of packets”.

As we add more features, we need more advanced algorithms to detect outliers. Using a moving average or the 3-sigma rule might work for one feature, but quickly becomes unwieldy when you introduce multiple features. This is where algorithms like Support Vector Machines come into play.
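To make the single-feature case concrete, here is a minimal sketch of the 3-sigma rule in Python. It is an illustration only, not code from CyberSift: the function name `three_sigma_outliers` and the visit counts are made up, and it assumes you already have a baseline of “normal” observations to compare against.

```python
from statistics import mean, stdev


def three_sigma_outliers(baseline, new_values, k=3.0):
    """Return the new values lying more than k standard deviations
    away from the mean of the baseline observations."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    return [v for v in new_values if abs(v - mu) > k * sigma]


# Hypothetical hourly visit counts observed during "normal" operation
baseline_visits = [100, 104, 98, 101, 97, 103, 99, 102]

# Hypothetical new observations to check against the baseline
new_visits = [101, 96, 250, 105]

flagged = three_sigma_outliers(baseline_visits, new_visits)
print(flagged)
```

This works for one number per observation. Once each observation becomes a vector of features, a separate mean and standard deviation per feature no longer captures how the features relate to each other, which is where the multi-feature algorithms come in.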

Have multiple anomaly detection algorithms working together.

What one algorithm catches, another might miss. If the same anomaly is flagged by multiple algorithms, you have more confidence that this is not just your average alert. In machine learning, this idea is usually implemented through “ensemble learning”.
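A very simple way to picture this is majority voting between detectors. The sketch below is hypothetical and deliberately tiny: the three detectors (`sigma_detector`, `mad_detector`, `max_detector`) and their thresholds are our own illustrative choices, not the algorithms CyberSift actually runs.

```python
from statistics import mean, median, stdev


def sigma_detector(baseline, value, k=3.0):
    """Flag values more than k standard deviations from the baseline mean."""
    return abs(value - mean(baseline)) > k * stdev(baseline)


def mad_detector(baseline, value, k=5.0):
    """Flag values more than k median absolute deviations from the median."""
    med = median(baseline)
    mad = median([abs(x - med) for x in baseline])
    return abs(value - med) > k * mad


def max_detector(baseline, value, margin=1.5):
    """Flag values exceeding the largest baseline value by a margin."""
    return value > margin * max(baseline)


def vote(baseline, value, detectors):
    """Count how many detectors consider the value anomalous."""
    return sum(1 for detect in detectors if detect(baseline, value))


detectors = [sigma_detector, mad_detector, max_detector]
baseline_visits = [100, 104, 98, 101, 97, 103, 99, 102]  # made-up data

for visits in (101, 112, 250):
    print(visits, "flagged by", vote(baseline_visits, visits, detectors), "of 3")
```

An alert raised by all three detectors deserves more attention than one raised by a single detector, and the vote count gives the analyst a rough way to rank alerts.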

Leverage threat intelligence.

Threat intelligence helps an analyst decide whether an abnormality is actually a threat. If your system flags an anomalous event and this event involves an IP address which is included in a threat feed, then it’s very likely worth investigating.
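As a rough sketch of that enrichment step, the snippet below checks the source IP of each flagged event against a threat feed. Everything here is an assumption made for illustration: the feed is a hard-coded list of documentation-range addresses, and the event layout (`src_ip`, `reason`) and function names are hypothetical rather than a CyberSift format.

```python
import ipaddress


def load_feed(entries):
    """Parse feed entries (single IPs or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_listed(ip, feed):
    """Return True if the IP address falls inside any network in the feed."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in feed)


def enrich(anomalies, feed):
    """Add a 'threat_listed' flag to each anomalous event."""
    return [dict(event, threat_listed=is_listed(event["src_ip"], feed))
            for event in anomalies]


# Hypothetical threat feed, using reserved documentation address ranges
feed = load_feed(["203.0.113.0/24", "198.51.100.7"])

# Hypothetical events already flagged by an anomaly detector
anomalies = [
    {"src_ip": "203.0.113.45", "reason": "unusual SYN count"},
    {"src_ip": "192.0.2.10", "reason": "unusual byte count"},
]

for event in enrich(anomalies, feed):
    print(event["src_ip"], "listed in feed:", event["threat_listed"])
```

The anomaly detector says “this is unusual”; the feed lookup adds “and this address is already known to be bad”, which is what moves an event to the top of the analyst’s queue.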

Which anomaly detection algorithm does the packet capture parser use?

Unlike a full-blown CyberSift installation, the packet parser runs on very limited resources and hence contains a very basic algorithm to detect anomalies. While CyberSift relies on multiple algorithms such as Decision Trees, SVMs, and Neural Networks, the Packet Capture Parser relies on a simple “percentile” rule.

You may have seen “box plots” like the one on the left. Box plots show more information than a simple line chart because they include maximum, minimum and percentile ranges. A percentile is a threshold below which a given percentage of your data falls. For example, the 50th percentile is the value that is greater than 50% of your data points. The 90th percentile is the value that is just higher than 90% of your data points, and so on.

The Packet Capture Parser employs the 95th percentile rule. It gathers various features from the packet capture, such as the number of SYN requests sent or the number of bytes transferred, and then calculates the 95th percentile threshold for each. Anything that falls above this threshold is marked as an anomaly. Simple, fast and lightweight, but still useful.
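The rule can be sketched in a few lines of Python. This is not the Packet Capture Parser’s actual code. It assumes the feature is counted per source IP, uses the nearest-rank definition of a percentile (other definitions interpolate between values), and the function names and SYN counts are invented for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest observed value such that
    at least pct percent of the observations are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_above_percentile(feature_by_key, pct=95):
    """Return the threshold and the entries whose value exceeds it."""
    threshold = percentile(list(feature_by_key.values()), pct)
    flagged = {k: v for k, v in feature_by_key.items() if v > threshold}
    return threshold, flagged


# Hypothetical SYN counts per source IP: 19 quiet hosts and one noisy one
syn_counts = {f"10.0.0.{i}": i for i in range(1, 20)}
syn_counts["10.0.0.99"] = 500

threshold, anomalies = flag_above_percentile(syn_counts)
print("95th percentile threshold:", threshold)
print("anomalies:", anomalies)
```

One consequence of a pure percentile rule is that, ties aside, roughly the top 5% of observations are flagged by construction, even in a perfectly ordinary capture. That is why the flagged items are best treated as “worth a look” rather than as confirmed threats.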

We’ll discuss the features that are monitored by this anomaly detection algorithm in the rest of our article series.
