Coping with Insider Threats using Machine Learning

Moazzam Khan
Jan 14, 2022

Insider threats have always been a difficult issue to handle: even before computers, you could defend against enemies from the outside, but it was very hard to cope with enemies within. In recent years the insider threat problem has been compounded by the work-from-home trend, which has brought several additional variables into the mix, such as users bringing unauthorized devices onto the corporate network, use of insecure local networks, and weaker adherence to security policies. There are now many more avenues through which a user could inadvertently and unwittingly become an insider threat, on top of the intentional insider threats that have always existed.

A recent study by IBM reports 4,761 insider threat incidents across 204 organizations, costing more than $11 billion. The top culprits were employee negligence (63%), credential theft (23%), and criminal insiders (14%). Typically, it takes two months to contain an insider threat incident. Considering the cost, the time it takes to recover, and the harm done to an organization's reputation, it is best to take preemptive measures and detect an insider threat before it can do damage.

Several steps can be taken to reduce these threats, such as detecting suspicious insiders and investigating them further, strong Identity and Access Management (IAM), regular user training, and keeping policies up to date.

In this article we will focus on detecting insider threats. When it comes to detection, you can take a rule-based approach or an ML-based approach.

Rule Based

In rule-based detection you define rules such that if certain events match those rules, an alert is generated, and you can assign a risk score to the user associated with that event. For example: a user attempts to log in to a server he is not authorized for, a user attempts to access a network resource after hours, or a user changes geolocation frequently within a short time. This approach works fine as long as we can formulate a rule and match events against it. But there are situations where creating such a rule isn't possible, for example a user's browsing pattern, working hours, or geolocation. All of these behaviors vary from person to person, and even for the same person over time, and if a behavior shifts only slightly, the rules fail to detect it.
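
To make this concrete, here is a minimal sketch of rule-based risk scoring in Python. The event fields, rule thresholds, and risk scores are illustrative assumptions, not any product's actual schema.

```python
from datetime import datetime

# Each rule: (name, predicate over an event dict, risk score).
# The fields and scores below are hypothetical versions of the rules above.
RULES = [
    ("unauthorized_server_access",
     lambda e: e["server"] not in e["authorized_servers"], 40),
    ("after_hours_access",
     lambda e: not 9 <= datetime.fromisoformat(e["timestamp"]).hour < 18, 20),
    ("rapid_geo_change",  # implied travel speed over 800 km/h is implausible
     lambda e: e["km_from_last_login"] / max(e["hours_since_last_login"], 0.1) > 800, 50),
]

def score_event(event: dict) -> int:
    """Sum the risk scores of all rules the event matches."""
    risk = 0
    for name, matches, score in RULES:
        if matches(event):
            print(f"rule matched: {name} (+{score})")
            risk += score
    return risk

event = {
    "server": "db-prod-01", "authorized_servers": {"web-01", "web-02"},
    "timestamp": "2022-01-14T02:30:00",
    "km_from_last_login": 6000, "hours_since_last_login": 2.0,
}
print("total risk:", score_event(event))  # all three rules fire for this event
```

Note how every condition must be anticipated in advance; a behavior the rules don't encode simply never fires.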

ML Based

We can cope with this variance in user behavior by identifying what is considered normal behavior; anything that deviates from normal is treated as abnormal and therefore risky. ML helps us create such a model of normal behavior for a user and identify outliers as anomalies. There are several ML techniques for detecting anomalies, and they can be broadly classified into supervised and unsupervised.

Supervised

In supervised anomaly detection you have labels available for what is anomalous and what is normal, and you train a model on that labeled dataset. In other words, this technique works best for known anomaly types, and it falters as activity patterns drift away from what the labels capture. Examples of such techniques are support vector machines, k-nearest neighbors (KNN), decision trees, and Bayesian networks.
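
As a minimal illustration, here is a sketch using scikit-learn's k-nearest neighbors classifier. The features and the tiny hand-labeled dataset are invented for demonstration only.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical features per user-day: [events_per_hour, distinct_servers_accessed]
X_train = [
    [10, 2], [12, 3], [8, 2], [11, 2],  # labeled normal (0)
    [300, 40], [250, 35], [400, 60],    # labeled anomalous (1)
]
y_train = [0, 0, 0, 0, 1, 1, 1]

clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# The model only recognizes patterns resembling its labels; a genuinely novel
# attack pattern may still be classified as normal.
print(clf.predict([[9, 2], [280, 50]]))  # -> [0 1]
```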

Cons

Domain expertise is needed for labeling, which is a manual and time-consuming process.

If the labeled data is bad in the first place, the model produces too many false positives.

Can't capture variations in patterns beyond what is captured in the labels.

Pros

Captures well-defined and well-understood anomalies.

Unsupervised

Unsupervised detection is more challenging because you don't have any prior information about which events are anomalous; the algorithm builds a model as it analyzes the data. Data points are clustered using a similarity measure, and points that fall far from any cluster are considered anomalies.

Examples of such techniques are k-means, isolation forest, one-class SVM, and LSTMs.
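
As a minimal sketch, here is an isolation forest from scikit-learn run on synthetic, unlabeled data; the features and the contamination rate are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Simulated daily feature vectors per user: [logins, MB downloaded]
normal_days = rng.normal(loc=[20.0, 500.0], scale=[5.0, 100.0], size=(200, 2))
spike_day = np.array([[20.0, 9000.0]])  # one day with a massive download
X = np.vstack([normal_days, spike_day])

# No labels are given: the forest flags points that are easy to isolate.
model = IsolationForest(contamination=0.01, random_state=0).fit(X)
labels = model.predict(X)  # -1 = anomaly, 1 = inlier
print("anomalous rows:", np.where(labels == -1)[0])  # includes row 200
```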

A Case Study of ML for Insider Threat Detection

As a case study, we take a look at the machine learning techniques in the IBM QRadar UBA (User Behavior Analytics) application. Its models can be classified into two broad categories: one based on time series, the other on peer groups.

Time Series (Uses correlation)

As the name suggests, this approach uses a timed sequence of events and tries to detect whether a user's pattern of activity changes, even without a significant change in volume, when compared to the rest of the data in the time series. The pattern in the time series could show unexpected drops or spikes, or seasonal or cyclical changes; picture a normal time-series signal alongside an anomalous version of the same signal.

The following is an example of an analytic generated using this technique:

· Aggregated Activity

Shows actual and expected user activity behavior patterns. The actual values are the number of events for that user during the selected time period; the expected values are the predicted number of events for the same period. A red circle indicates that an anomaly was detected and a sense event was generated by machine learning.
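
A minimal sketch of this idea, comparing actual hourly event counts against an expected value derived from recent history; the rolling-window baseline and 3-sigma threshold are assumptions, not QRadar UBA's actual model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
counts = pd.Series(rng.poisson(lam=50, size=168))  # one week of hourly event counts
counts.iloc[100] = 400                             # injected spike

# The expected value for each hour comes from the previous 24 hours of history.
history = counts.shift(1).rolling(window=24)
expected, spread = history.mean(), history.std()

# Flag hours whose actual count sits far from the expected count, analogous to
# the red anomaly markers on the aggregated-activity chart.
anomalies = counts[(counts - expected).abs() > 3 * spread]
print(anomalies)  # the spike at hour 100 is flagged
```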

Peer Group (Uses LDA)

In this detection technique, clusters are first created representing the characteristic behavior of a group of users; then the activity of a user is compared with the cluster it should match. If it deviates significantly, it is considered an anomaly. For example, this can detect when a user starts accessing assets (servers, websites, tools) that his peer group does not access.

There are two steps involved:

1. Training Phase

This phase identifies groups of users working with a similar set of assets.

2. Online Phase

In this phase we examine asset accesses to identify deviations. A deviation is an access to an asset outside those used by the group to which the user belongs.
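
To illustrate the two phases, here is a minimal sketch using Latent Dirichlet Allocation from scikit-learn. Treating each user's asset-access counts as a "document" whose topics act as peer groups is an assumption made for illustration; it is not the actual QRadar UBA implementation.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Training phase: historical access counts per user (rows) across five assets
# (columns): git-server, build-farm, crm, billing-db, hr-portal.
X_train = np.array([
    [40, 35, 0, 0, 1],   # engineers
    [38, 30, 1, 0, 0],
    [0, 0, 45, 30, 2],   # sales
    [1, 0, 50, 28, 1],
])
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X_train)
baseline = lda.transform(X_train)  # each user's peer-group mixture

# Online phase: compare a user's new activity against their baseline mixture.
def deviation(user_idx: int, new_counts: np.ndarray) -> float:
    """L1 distance between the baseline and current peer-group mixtures."""
    current = lda.transform(new_counts.reshape(1, -1))[0]
    return float(np.abs(baseline[user_idx] - current).sum())

# Engineer 0 suddenly hammers the CRM and billing database.
print(deviation(0, np.array([5, 3, 30, 40, 0])))   # large deviation -> flag
print(deviation(0, np.array([42, 33, 0, 0, 1])))   # small deviation -> normal
```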

Some examples that use this technique:

· Defined peer group:

A defined group is a user grouping, such as users' job title, department, or city, chosen in the model settings. "Behavior detected as" shows the groups the user's behavior was similar to during the day. "Deviation from peer group" signifies the percentage by which a user has deviated from their defined peer group. "Confidence" is based on the amount of data gathered from users in the group to build the model and make accurate predictions. An alert is triggered if the deviation and the confidence both exceed their thresholds (see the sketch after this list).

· Activity distribution

Shows dynamic behavior clusters for all users that are monitored by machine learning. The clusters are inferred from the activity categories of all monitored users. The actual values are the percent match to a cluster; the expected values are the predicted percent match. Users typically have consistent activity patterns across time, but malicious behavior can manifest as changes in the pattern.
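
As a tiny sketch of the alert decision described for the defined peer group analytic, with hypothetical threshold values:

```python
# Hypothetical thresholds; both conditions must hold, so a large deviation
# backed by too little data does not raise an alert.
DEVIATION_THRESHOLD = 0.5   # fraction of activity outside the peer group
CONFIDENCE_THRESHOLD = 0.7  # model confidence, driven by data volume

def should_alert(deviation: float, confidence: float) -> bool:
    return deviation > DEVIATION_THRESHOLD and confidence > CONFIDENCE_THRESHOLD

print(should_alert(0.8, 0.9))  # True: strong, well-supported deviation
print(should_alert(0.8, 0.3))  # False: too little data to trust the model
```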

Insider threats are among the most challenging problems organizations face because they can do irreparable damage to reputation and finances. The problem is hard because a user's behavior can't be bound by a fixed set of rules: there are many legitimate reasons a user might not align with the rules while still not being malicious. This is where the ML approach succeeds, because we can look at a user's past behavior, develop a model from it, and detect when their activity pattern deviates.
