Analyzing your Microsoft Defender ATP data in real-time in ELK using the new streaming API

Maarten Goet
Jul 23, 2019


Microsoft Defender ATP has a ton of information about users, their endpoints, their applications and processes, and network events that threat hunters can use in their investigations. There is Advanced Hunting functionality in MDATP that they can leverage to find information.

This is great, but often threat hunters are searching through multiple sources and combining information from all of them to aggregate signals and ‘paint a picture’ of what is going on. Microsoft already provides support for Jupyter notebooks today, where threat hunters can work with KQL, Microsoft’s new query language, to dig up data.

But what if you are using Elasticsearch, Logstash and Kibana (ELK)? Is there a way to source information from Microsoft Defender ATP to ELK and work with the data there? The MDATP team just released (a preview of) the Streaming API which allows you to do just that.

Threat hunting in MDATP

Microsoft Defender ATP has functionality for threat hunting called Advanced Hunting built in. It can be easily consumed through the web UI, but it is also available through the MDATP API. When you’re in the web UI, it takes KQL queries to search for data.

I wrote a blog earlier about hunting that you can use to get started on this.

Jupyter notebooks and MDATP

Jupyter Notebook, formerly called IPython Notebook, is an open-source web application that allows you to create and share documents that contain live code, data transformations, visualizations and narrative text through markdown. It is broadly used in infosec by threat hunters, and supports many programming languages, such as R and Python. See my previous blog for an introduction to Jupyter in the context of Microsoft security.

John Lambert wrote Python code late last year and published it in a sample notebook to hunt Microsoft Defender ATP data in Jupyter. John also wrote a very detailed tutorial.

The ELK stack

So, what is ELK? “ELK” is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a “stash” like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch.

With a search engine at heart, many security professionals started using Elasticsearch for logs and events in recent years and found that ELK easily ingests and visualizes them. And because the base functionality is free (yes, there are premium paid features like machine learning) it grew quite quickly in popularity and usage.

And while most ELK deployments are used in blue teams (the defenders), it’s also growing in popularity for red teams (the attackers). Marc Smeets and Mark Bergman from Outflank developed RedELK, a portable ELK stack, that helps spot the blue teamers and centralize operational logs for red operations. Read more about it here.

MDATP Streaming API

The Microsoft Defender ATP team just announced the (preview) release of their Streaming API. This API can continuously stream events to external storage, using the same schema that MDATP supports in Advanced Hunting.

At a high level, you need to either create an Azure Event Hub or configure Azure Storage, then configure MDATP to stream the data to that event hub or storage account, and finally consume the data from there.

Typical use cases would be an MSP secret sauce implementation, correlation with external signals, long term retention (for regulatory reasons), etcetera.

Connecting MDATP to ELK

To connect the Defender ATP data to ELK we’ll be going the event hubs way. In preparation, do the following:

Then run the following script:

param (
    [Parameter(Mandatory=$true)][string]$dataExportSettingsName,
    [Parameter(Mandatory=$true)][string]$eventHubResourceId,
    [Parameter(Mandatory=$true)][string]$eventHubName
)

# Install the ADAL.PS package if it's not installed.
if (!(Get-Package adal.ps)) { Install-Package -Name adal.ps }

$authority = "https://login.windows.net/common/oauth2/authorize"
$resourceUrl = "https://securitycenter.onmicrosoft.com/windowsatpservice"
$clientId = "88cfeabb-510d-4c0d-8358-3d1929c8d828"
$redirectUri = "https://portal.azure.com"

$response = Get-ADALToken -Resource $resourceUrl -ClientId $clientId -RedirectUri $redirectUri -Authority $authority -PromptBehavior:Always
$token = $response.AccessToken

$url = "https://api.securitycenter.windows.com/api/dataExportSettings"
$body = @{
    id = $dataExportSettingsName
    eventHubProperties = @{ eventHubResourceId = $eventHubResourceId; name = $eventHubName }
    logs = @(
        @{ category = "AdvancedHunting-AlertEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-MachineInfo"; enabled = "true" },
        @{ category = "AdvancedHunting-MachineNetworkInfo"; enabled = "true" },
        @{ category = "AdvancedHunting-ProcessCreationEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-NetworkCommunicationEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-FileCreationEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-RegistryEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-LogonEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-ImageLoadEvents"; enabled = "true" },
        @{ category = "AdvancedHunting-MiscEvents"; enabled = "true" }
    )
}

$headers = @{
    'Content-Type' = 'application/json'
    Accept = 'application/json'
    Authorization = "Bearer $token"
}

$response2 = Invoke-WebRequest -Method Post -Uri $url -Body ($body | ConvertTo-Json) -Headers $headers -ErrorAction Stop
return $response2

The script will prompt you for three inputs:

  • dataExportSettingsName: a name you choose for this Streaming API instance
  • eventHubResourceId: the identifier of your event hub (go to the event hub namespace page, open the Properties tab and copy the resource ID)
  • eventHubName: the name of your event hub
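If you would rather make this call from another language, the request body that the script sends can be sketched in Python. This is only a minimal sketch of the payload: the helper name `build_export_settings` and the sample argument values are made up for illustration, and acquiring the token and sending the POST are deliberately left out.

```python
import json

# Advanced Hunting tables that the PowerShell script enables.
TABLES = [
    "AlertEvents", "MachineInfo", "MachineNetworkInfo", "ProcessCreationEvents",
    "NetworkCommunicationEvents", "FileCreationEvents", "RegistryEvents",
    "LogonEvents", "ImageLoadEvents", "MiscEvents",
]


def build_export_settings(settings_name, event_hub_resource_id, event_hub_name):
    """Return the dataExportSettings payload as a dict (hypothetical helper)."""
    return {
        "id": settings_name,
        "eventHubProperties": {
            "eventHubResourceId": event_hub_resource_id,
            "name": event_hub_name,
        },
        # The script sends "enabled" as the string "true", so the sketch does too.
        "logs": [
            {"category": f"AdvancedHunting-{table}", "enabled": "true"}
            for table in TABLES
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own settings name and event hub details.
    body = build_export_settings(
        "my-export-settings", "<event-hub-namespace-resource-id>", "my-event-hub"
    )
    print(json.dumps(body, indent=2))
```

Printing the payload first is a cheap way to check which tables you are about to enable. The rest of this walkthrough continues with the PowerShell script.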

You’ll get an HTTP 200 OK response if everything succeeds, and data from MDATP should now be flowing into the Event Hub in real time. Each message in the Event Hub has the following JSON format:

{
    "records": [
        {
            "time": "<The time WDATP received the event>",
            "tenantId": "<Your tenant ID>",
            "category": "<The Advanced Hunting table name with 'AdvancedHunting-' prefix>",
            "properties": { <WDATP Advanced Hunting event as JSON> }
        }
    ]
}
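To make the envelope concrete, here is a minimal Python sketch that takes one message body and groups its records by Advanced Hunting table. The function name `group_by_table` and the sample record are illustrative assumptions; only the field names `records`, `category` and `properties` come from the format above.

```python
import json
from collections import defaultdict

PREFIX = "AdvancedHunting-"


def group_by_table(message):
    """Map table name -> list of event property dicts for one message body."""
    envelope = json.loads(message)
    tables = defaultdict(list)
    for record in envelope.get("records", []):
        category = record.get("category", "")
        # Strip the 'AdvancedHunting-' prefix to recover the table name.
        table = category[len(PREFIX):] if category.startswith(PREFIX) else category
        tables[table].append(record.get("properties", {}))
    return dict(tables)


if __name__ == "__main__":
    # Made-up record, used only to show the shape of the result.
    sample = json.dumps({"records": [{
        "time": "2019-07-23T10:00:00Z",
        "tenantId": "<tenant-id>",
        "category": "AdvancedHunting-LogonEvents",
        "properties": {"example": "value"},
    }]})
    print(group_by_table(sample))
```

Note that a single message can carry several records, which is why the result holds a list per table.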

Next up is configuring Logstash to pull the information from the Event Hub, parse it, and write it into Elasticsearch. You will need to have the Azure Event Hubs plugin for Logstash installed, which will require you to also configure an Azure storage account. Then use this configuration:

logstash -e 'input {
  azure_event_hubs {
    event_hub_connections => ["the_same_connecting_string_you_used_earlier_for_the_event_hub"]
    threads => 8
    decorate_events => true
    consumer_group => "$Default"
    storage_connection => "connection_string_to_your_storage_account"
  }
}
filter {}
output {
  elasticsearch {
    hosts => ["ip_address_of_your_elasticsearch_cluster:9200"]
  }
}'

PRO TIP: Use the following URL to check whether Logstash is up and running and sending data: http://your_elasticsearch_cluster:9200/_cat/indices?v
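If you want to script that check, the plain-text table returned by `_cat/indices?v` is easy to parse. The Python sketch below assumes an unauthenticated cluster and the default column headers (including `index` and `docs.count`). The helper names and the sample output in the demo are made up for illustration, and the index name in it is only an example.

```python
from urllib.request import urlopen


def fetch_cat_indices(base_url):
    """Download the _cat/indices table (assumes no authentication is needed)."""
    with urlopen(f"{base_url}/_cat/indices?v") as response:
        return response.read().decode("utf-8")


def parse_cat_indices(text):
    """Turn the whitespace-separated table into a list of row dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def indices_with_documents(text):
    """Return {index name: document count} for indices that hold documents."""
    result = {}
    for row in parse_cat_indices(text):
        count = row.get("docs.count", "")
        # Rows with missing columns (for example closed indices) are skipped.
        if count.isdigit() and int(count) > 0:
            result[row["index"]] = int(count)
    return result


if __name__ == "__main__":
    # Illustrative output only, not captured from a real cluster.
    sample = (
        "health status index               uuid   pri rep docs.count docs.deleted store.size pri.store.size\n"
        "yellow open   logstash-2019.07.23 abc123   1   1       1234            0      1.2mb          1.2mb\n"
    )
    print(indices_with_documents(sample))
```

Against a live cluster you would call `indices_with_documents(fetch_cat_indices("http://your_elasticsearch_cluster:9200"))` and watch the document count grow between runs.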

Update: Twitter user sentry_23 suggests that you add this filter for better searchability and data ingestion in Elasticsearch:

filter {
  json {
    source => "message"
  }
  split {
    field => "records"
  }
}
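To see why this filter helps, here is a rough Python emulation of what the two steps do to one Event Hub message: the `json` step parses the `message` string into fields on the event, and the `split` step emits one event per element of `records`. It is a simplified sketch, not Logstash's implementation, and the function names are invented for illustration.

```python
import copy
import json


def split_records(event, field="records"):
    """Yield one copy of the event per array element, with the field replaced by it."""
    items = event.get(field)
    if not isinstance(items, list):
        # Simplification: events without an array in `field` pass through unchanged.
        yield event
        return
    for item in items:
        clone = copy.deepcopy(event)
        clone[field] = item
        yield clone


def process(raw_message):
    """Emulate json { source => "message" } followed by split { field => "records" }."""
    event = {"message": raw_message}
    event.update(json.loads(raw_message))  # parsed keys land at the root of the event
    return list(split_records(event))


if __name__ == "__main__":
    # Made-up message with two records.
    raw = json.dumps({"records": [
        {"category": "AdvancedHunting-LogonEvents", "properties": {}},
        {"category": "AdvancedHunting-MiscEvents", "properties": {}},
    ]})
    for event in process(raw):
        print(event["records"]["category"])
```

The practical effect is that each MDATP event becomes its own Elasticsearch document, with its fields under `records`, instead of one document per batch.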

Hunting with MDATP data in ELK

Now that we have the data inside our ELK cluster, we can go hunting. If you’re not familiar with ELK yet, I suggest you check out their tutorial. Create some indexes, dashboards and all the other good stuff that Kibana has to offer.

PRO TIP: Want to understand what to look for in Defender ATP data in ELK? Microsoft published this article that lists all the available tables, along with their data types and descriptions that correlate to the Advanced hunting schema in MDATP.
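As a starting point for a first hunt, the Python sketch below builds an Elasticsearch query body that looks for process creation events with a given file name. It rests on several assumptions: that the `json` and `split` filter shown earlier is in place (so each event's fields live under `records`), that string fields have a `.keyword` subfield as with the default dynamic mapping, and that the ProcessCreationEvents table exposes a `FileName` column. The function name is hypothetical.

```python
import json


def process_hunt_query(file_name, since="now-24h", size=50):
    """Build a query body for recent process creation events with this file name."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    # Assumed field layout: records.category / records.properties.*
                    {"term": {"records.category.keyword": "AdvancedHunting-ProcessCreationEvents"}},
                    # term queries on keyword fields are exact and case-sensitive
                    {"term": {"records.properties.FileName.keyword": file_name}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(process_hunt_query("powershell.exe"), indent=2))
```

You can paste the printed JSON into the Kibana Dev Tools console as the body of a `_search` request, then adjust the field names to whatever your index mapping actually contains.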

Happy hunting!

— Maarten Goet, MVP & RD
