Azure Sentinel: advanced multistage attack detection — real machine learning for the real world

Maarten Goet
Wortell
Published Dec 6, 2019 · 6 min read

Microsoft touts machine learning as part of Azure Sentinel, under the name Azure Sentinel FUSION. I’ve written about it before here, and since the general availability of Azure Sentinel it has been enabled by default.

You could easily be tricked into thinking that FUSION is marketing bingo, but nothing could be further from the truth: there are real machine learning models that help you in real-world situations. One of the first to become available is named ‘Advanced Multistage Attack Detection’. It builds on more than six years of experience with building machine learning models for services such as Azure AD Identity Protection.

I got to meet the machine learning team that develops those models, part of Microsoft’s Threat Intelligence Center. Here’s what I learned.

Azure Sentinel FUSION

FUSION is described as “Unlock the power of AI for security professionals by leveraging MS cutting edge research and best practices in ML, regardless of your current investment level in ML.” Microsoft is basing their work on two pillars:

(1) Sifting through tons of alerts in a SIEM is not something security analysts love doing. Their skill set can also be better put to work to hunt for bad actors, based on pre-filtered signals.

(2) It is well known that security analysts are drowning in those alerts and sometimes miss the critical piece that would lead to the next step of an investigation.

Multi-stage attack detection

Going into the Analytics section of Azure Sentinel you’ll find a rule called ‘Advanced Multistage Attack Detection’. It has the following description:

By using Fusion technology that’s based on machine learning, Azure Sentinel can automatically detect multistage attacks by combining anomalous behaviors and suspicious activities that are observed at various stages of the kill-chain. Azure Sentinel then generates incidents that would otherwise be very difficult to catch.

By design, these incidents are low volume, high fidelity, and high severity. This is also why this detection is turned ON by default in Azure Sentinel. It will literally be the first active rule in your new Azure Sentinel environment, making the machine learning promise real.

What does it exactly detect?

The question I get most is: “But what exactly does it detect?” Microsoft just updated the documentation page, which can be found here.

The detections can be categorized in the following buckets:

- Impossible travel to atypical location followed by anomalous Office 365 activity
- Sign-in activity for unfamiliar location followed by anomalous Office 365 activity
- Sign-in activity from infected device followed by anomalous Office 365 activity
- Sign-in activity from anonymous IP address followed by anomalous Office 365 activity
- Sign-in activity from user with leaked credentials followed by anomalous Office 365 activity

Here’s a specific example of a detection that the machine learning model would trigger on:

An alert gets raised that is an indication of a sign-in event by <account name> from an anonymous proxy IP address <IP address>, followed by a suspicious inbox forwarding rule being set on the user’s inbox. This may indicate that the account is compromised, and that the mailbox is being used to exfiltrate information from your organization. Shortly after the sign-in, the user <account name> created or updated an inbox forwarding rule that forwards all incoming email to the external address <email address>.
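To make the "first signal, then follow-up signal" pattern concrete, here is a minimal Python sketch that pairs two kinds of alerts for the same account within a time window. It is a plain rule-based correlation, not the machine learning model Fusion uses, and everything in it is an assumption for illustration: the `Alert` fields, the alert kind names, the example accounts, and the one-hour window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    """A simplified alert record; the field names are illustrative assumptions."""
    account: str
    kind: str  # hypothetical labels, e.g. "anonymous_ip_sign_in"
    time: datetime


def correlate(alerts, first_kind, second_kind, window):
    """Return (first, second) pairs for the same account where the second
    alert follows the first in time order and lies within `window` of it."""
    pairs = []
    ordered = sorted(alerts, key=lambda a: a.time)
    for i, first in enumerate(ordered):
        if first.kind != first_kind:
            continue
        for second in ordered[i + 1:]:
            if second.time - first.time > window:
                break  # alerts are sorted, so later ones are out of range too
            if second.kind == second_kind and second.account == first.account:
                pairs.append((first, second))
    return pairs


# Made-up sample data: only the second alice alert falls inside the window.
alerts = [
    Alert("alice@contoso.example", "anonymous_ip_sign_in", datetime(2019, 12, 6, 9, 0)),
    Alert("bob@contoso.example", "inbox_forwarding_rule", datetime(2019, 12, 6, 9, 5)),
    Alert("alice@contoso.example", "inbox_forwarding_rule", datetime(2019, 12, 6, 9, 20)),
    Alert("alice@contoso.example", "inbox_forwarding_rule", datetime(2019, 12, 6, 15, 0)),
]

incidents = correlate(
    alerts,
    first_kind="anonymous_ip_sign_in",
    second_kind="inbox_forwarding_rule",
    window=timedelta(hours=1),
)
for first, second in incidents:
    print(f"{first.account}: {first.kind} at {first.time:%H:%M}, "
          f"then {second.kind} at {second.time:%H:%M}")
```

A fixed window like this is exactly the kind of hand-tuned rule that tends to be either too noisy or too strict, which is the gap a learned model is meant to close.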

Connectors

To get the machine learning model to work, it needs to be fed with the ‘right’ data. In the case of the ‘advanced multistage detection’ model it needs data from both Azure Active Directory Identity Protection and Microsoft Cloud App Security.

PRO TIP: Make sure that you have the Azure Active Directory Identity Protection and Microsoft Cloud App Security connectors enabled!

MITRE ATT&CK

The MITRE ATT&CK framework is a comprehensive matrix of tactics and techniques used by threat hunters, red teamers, and defenders to better classify attacks and assess an organization’s risk. Azure Sentinel and many other security products have adopted this framework and map alerts to its tactics and techniques.

The activities that the ‘advanced multistage detection’ machine learning model detects map to techniques in the following MITRE tactics:

- Persistence
- Lateral Movement
- Exfiltration
- Command and Control

The team behind Azure Sentinel FUSION

I was very privileged to meet the team behind Azure Sentinel FUSION in their building in Redmond, USA, during a customer security event. The team is led by Ram Shankar Siva Kumar, who is also an affiliate of the Berkman Klein Center at Harvard, together with lead data scientists such as Lily Ma. Ram is also the founder of the Security Data Science Colloquium, the only venue where security data scientists from every major cloud provider congregate.

Having previously worked on the models behind, for instance, Azure Active Directory Identity Protection, the team now also develops the models for Azure Sentinel FUSION. Sometimes they can reuse years of model development and tuning, as with the model for anomalous logons (based on their work in the Azure and Microsoft 365 platforms), but for most detections they will develop new models, starting with the ones in highest demand.

A whopping 35+ security-focused data scientists are working on machine learning models for core Microsoft platforms and Azure Sentinel! EPIC.

What’s next

While having real machine learning models in Azure Sentinel already helps up your defenses significantly, Microsoft is working on bring-your-own-ML functionality for Azure Sentinel. It’s currently in private preview and is expected to be released in early 2020. It allows you to bring your own machine learning model to the party using Azure Databricks. I’ve had the opportunity to test it, and it is very promising!

Even more powerful is a third option that Microsoft is pursuing: extending the existing models that you get in the box. A sort of hybrid between developing your own from scratch, and just using what is there. A great example would be to extend the existing anomalous logons ML model with data from your badge / key access system. This could even further reduce false positives by understanding if somebody is physically present at a certain location.
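As a thought experiment only, the badge idea could look something like the Python sketch below: an anomaly score from an upstream logon model is discounted when the same user badged in at the same location shortly before. This is not how Microsoft implements or plans to implement model extension; the `LogonAnomaly` and `BadgeEvent` types, the 0 to 1 score scale, the eight-hour window, and the 0.5 discount are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogonAnomaly:
    user: str
    location: str
    time: datetime
    score: float  # assumed scale: 0.0 (benign) to 1.0 (highly anomalous)


@dataclass(frozen=True)
class BadgeEvent:
    user: str
    location: str
    time: datetime


def adjust_score(anomaly, badge_events, window=timedelta(hours=8), discount=0.5):
    """Lower the anomaly score when the user badged in at the same location
    within `window` before the logon; otherwise return the score unchanged."""
    for event in badge_events:
        same_place = event.user == anomaly.user and event.location == anomaly.location
        recent = timedelta(0) <= anomaly.time - event.time <= window
        if same_place and recent:
            return anomaly.score * discount
    return anomaly.score


# Made-up example: alice badged in at the Amsterdam office at 08:30.
badges = [BadgeEvent("alice", "Amsterdam", datetime(2019, 12, 6, 8, 30))]
onsite = LogonAnomaly("alice", "Amsterdam", datetime(2019, 12, 6, 9, 0), 0.8)
remote = LogonAnomaly("alice", "Singapore", datetime(2019, 12, 6, 9, 0), 0.8)

onsite_score = adjust_score(onsite, badges)  # discounted: physical presence confirmed
remote_score = adjust_score(remote, badges)  # unchanged: no matching badge event
print(onsite_score, remote_score)
```

In a real extension the badge signal would more likely become an extra feature for the model to learn from, rather than a fixed discount applied after the fact.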

Conclusion

Microsoft is investing a lot of money, time and energy into building real machine learning. With the first machine learning models coming into Azure Sentinel FUSION it is clear that it is solving real problems.

Happy hunting!

— Maarten Goet, MVP & RD

Microsoft MVP and Microsoft Regional Director.