Light Roast 111: XDR — Hype vs. Reality

An overview of security controls for endpoints and infrastructure, and a comparison between SIEM and XDR technologies.

Published in Dark Roast Security · 12 min read · Feb 2, 2022

Every now and then, I am approached by coworkers or clients asking questions about the ever-expanding list of technical security controls: EPP, EDR, NTA, HIPS, HIDS, UBA, UEBA, SIEM, SASE, etc.

I usually manage to navigate my way through the maze fairly easily, giving them a rundown of each category: delving deeper into the evolution and convergence of EPP and EDR tools, explaining where SIEM and MDR fit in, and speculating about the potential impact of SASE technology on the future of firewalls and Internet security. But when it gets to XDR, things get a bit weird.

Is it really a groundbreaking approach to detection and response, big enough to have its own acronym? Or is it where log aggregation and SIEM technology have already been going for more than a decade? Do the alerts from different log sources auto-magically correlate and bundle themselves into incidents? Or does it still require a team of human operators doing the bulk of that work manually? Is it really the best thing since sliced bread, a natural evolution of EPP/EDR, or is it just a smart marketing move?

To answer these questions, we should start with a quick review of the evolution of detection and response tools across endpoints and IT infrastructure.

Endpoint Security: From Antivirus to EPP and EDR

I will spare you the story of self-replicating programs and the origins of computer viruses going all the way back to the ’70s; you can find a great overview of that on Wikipedia [1].

Let’s fast forward to the ’80s when the first Antivirus programs were created. These were the good old Dr. Solomon’s Antivirus Toolkit, Norton Antivirus, McAfee Antivirus, and all the other security programs that some of us grew up with back in the late ’80s and early ’90s.

Their main job was to scan your whopping 10GB hard drive for malware, and they would do this by comparing files against a local database of known-bad hashes, or by looking for certain patterns and strings inside these files, similar to the way we search for IOCs in SIEM logs nowadays. Later on, some vendors added heuristic analysis as a way to detect new variants of viruses without relying on exact signatures and patterns [2].
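The two classic checks described above, a known-bad hash lookup and a byte-pattern search, can be sketched in a few lines. This is a minimal illustration only: the signature set, the pattern list, and the function names are made up for the example and do not come from any real antivirus product.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-bad files.
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"demo-malware-sample").hexdigest(),
}

# Hypothetical byte patterns that would mark a file as suspicious.
BAD_PATTERNS = [b"EVIL_PAYLOAD_MARKER"]


def scan_bytes(data: bytes) -> list[str]:
    """Return the reasons (if any) why this content looks malicious."""
    findings = []
    # Check 1: exact match against the local database of known-bad hashes.
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_HASHES:
        findings.append("known-bad hash")
    # Check 2: look for known strings or byte sequences inside the file.
    for pattern in BAD_PATTERNS:
        if pattern in data:
            findings.append(f"pattern match: {pattern.decode()}")
    return findings


def scan_file(path: str) -> list[str]:
    """Read a file from disk and scan its contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read())
```

Changing a single byte of a file changes its hash, so the first check misses even trivially modified variants. That weakness is what pushed vendors toward pattern matching and heuristics.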

Over time, the word malware replaced virus, becoming an umbrella term covering different categories of malicious programs: viruses, rootkits, adware, spyware, potentially unwanted programs (PUP), and worms.

Around the same time, Antivirus vendors started adding some additional features such as Host Firewall, Device Control, and Web Traffic Filtering. They moved away from the rather narrow and specific term Antivirus, to the broader and more aptly named label Endpoint Protection (EPP).

Then, sometime around 2016, a company called Cylance shook the industry by releasing a fully AI-driven antivirus solution. There is still some debate about Cylance’s testing methodology [3]. Nevertheless, this move propelled the industry toward NextGen Endpoint Protection, and most legacy EPP vendors quickly followed suit by adding ML functionality of their own. So from a business and consumer perspective, “behold, it was very good”.

Then came startups like Carbon Black and CrowdStrike, with the valid claim that many malware-less attacks can go right past your EPP solution if you’re not looking at process-related suspicious activity. Examples of these are Living Off the Land techniques such as malicious PowerShell scripts, strange process chains, executables running from the Recycle Bin, etc. This led to the birth of Endpoint Detection and Response (EDR) technology.
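To make the idea of process-level detection concrete, here is a rough sketch of three rules of the kind mentioned above. The `ProcessEvent` shape, the process lists, and the rule wording are assumptions made for illustration; real EDR products use far richer telemetry and logic.

```python
from dataclasses import dataclass


@dataclass
class ProcessEvent:
    parent: str        # parent process name, e.g. "winword.exe"
    image: str         # full path of the newly started process
    command_line: str  # full command line of the new process


# Hypothetical rule data chosen for the example.
OFFICE_APPS = {"winword.exe", "excel.exe", "outlook.exe"}
SCRIPT_HOSTS = {"powershell.exe", "cmd.exe", "wscript.exe"}


def suspicious_reasons(event: ProcessEvent) -> list[str]:
    """Return a list of reasons why this process start looks suspicious."""
    reasons = []
    image = event.image.lower()
    name = image.rsplit("\\", 1)[-1]
    # Strange process chain: a document editor starting a shell or script host.
    if event.parent.lower() in OFFICE_APPS and name in SCRIPT_HOSTS:
        reasons.append("office application spawned a script host")
    # Executable launched from an unusual location.
    if "\\$recycle.bin\\" in image:
        reasons.append("executable running from the Recycle Bin")
    # Only the full flag name is checked here; abbreviated forms are ignored.
    if name == "powershell.exe" and "-encodedcommand" in event.command_line.lower():
        reasons.append("PowerShell launched with an encoded command")
    return reasons
```

None of these events involves a malicious file on disk, which is why a scanner that only inspects files would stay silent.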

The core idea behind EDR was that there is a grey area where categorically blocking suspicious activity could wreak havoc on your systems, and you’d end up with very unhappy developers and sysadmins. EDR’s main goal was to detect the suspicious activity and leave the investigation, isolation, and remediation piece to your engineering or SOC team.

Over the next few years, EDR makers added more EPP features such as static analysis, signature scans, device control, and firewall management, while EPP vendors began to build more EDR-like tools into their products. This further blurred the lines between EPP and EDR [4]. Today, most endpoint protection startups design products with the full EPP and EDR spectrum from the get-go.

Network Security: Firewalls, IPS, IDS, and NTA

Now let’s shift our focus away from endpoints and look at network infrastructure.

Until recently, most employees worked in brick-and-mortar offices, using workstations plugged into the corporate network, with all their traffic traversing the perimeter firewalls.

Traditional firewalls couldn’t see past layer four, which means that you had no control over ingress or egress traffic beyond IP addresses and port numbers. Technologies like deep packet inspection and SSL decryption were expensive, both computationally and financially, so larger organizations would rely on dedicated inline Network Intrusion Prevention Systems (NIPS) or out-of-band Network Intrusion Detection Systems (NIDS) hardware appliances.
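The layer-four limitation is easy to see in a toy rule evaluator. The sketch below assumes a simplified rule format (`Rule`, `evaluate`, and `RULES` are names invented for this example) and first-match-wins semantics with a default deny.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    action: str    # "allow" or "deny"
    src: str       # source network in CIDR notation
    dst: str       # destination network in CIDR notation
    dst_port: int  # destination port; 0 means any port


def evaluate(rules: list[Rule], src_ip: str, dst_ip: str, dst_port: int) -> str:
    """First matching rule wins; anything unmatched is denied."""
    for rule in rules:
        if (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(rule.dst)
            and rule.dst_port in (0, dst_port)
        ):
            return rule.action
    return "deny"


# Example policy: internal hosts may reach anything on TCP 443, all else is denied.
RULES = [
    Rule("allow", "10.0.0.0/8", "0.0.0.0/0", 443),
    Rule("deny", "0.0.0.0/0", "0.0.0.0/0", 0),
]
```

The decision uses nothing but addresses and a port number. A connection to port 443 is allowed whether it carries a normal web session or something else entirely, because the payload is never inspected.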

The problem with most of these technologies was that they required physical connectivity to every firewall or core switch in order to capture all the traffic. There were also other problems associated with port mirroring: oversaturation resulting in dropped frames [5], and inherent limitations of TCP RESET as a way of stopping traffic flow [6].

Over time, emerging firewall vendors like Palo Alto managed to successfully converge the NIPS feature set into their devices, creating what we know as NextGen Firewalls. Most other firewall vendors eventually caught up either by developing the technology in-house or acquiring NIPS/NIDS companies and integrating them into their products. The advent of NextGen Firewalls and the increase in remote work and cloud applications essentially brought an end to the dedicated, hardware-based NIDS/NIPS solutions.

While firewall makers were busy making improvements to their line of products, a number of security vendors came up with Network Traffic Analysis (NTA) solutions to address the challenges associated with SPAN ports and inline IPS tools.

Instead of looking at real frames and packets, NTA technology would rely on traffic metadata to detect anomalous activity. This was mainly achieved using NetFlow, ERSPAN, and Syslog collected remotely from network equipment. The product would then apply static rules, IOCs, or anomaly detection ML jobs to alert the SOC of suspicious activity.
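As a rough illustration of metadata-only detection, the sketch below flags hosts whose outbound volume in the current window is far above their own history. The flow tuple layout, the function names, and the "mean plus k standard deviations" threshold are assumptions chosen for simplicity, not a description of any vendor's algorithm.

```python
from collections import defaultdict
from statistics import mean, pstdev

# A flow record carries metadata only: (src_ip, dst_ip, dst_port, bytes_sent).
Flow = tuple[str, str, int, int]


def total_bytes_by_source(flows: list[Flow]) -> dict[str, int]:
    """Sum the bytes sent by each source address in the current window."""
    totals: dict[str, int] = defaultdict(int)
    for src_ip, _dst_ip, _dst_port, bytes_sent in flows:
        totals[src_ip] += bytes_sent
    return dict(totals)


def find_outliers(
    history: dict[str, list[int]], flows: list[Flow], k: float = 3.0
) -> list[str]:
    """Return hosts whose current volume exceeds mean + k * stdev of their history."""
    outliers = []
    for host, sent in total_bytes_by_source(flows).items():
        past = history.get(host)
        if not past:
            continue  # no baseline yet; a real tool would treat new hosts separately
        limit = mean(past) + k * pstdev(past)
        if sent > limit:
            outliers.append(host)
    return sorted(outliers)
```

No packet payload is touched at any point, which is what lets this approach work from flow exports instead of a physical tap.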

SIEM: The Elusive Single Pane of Glass

The concept of log aggregation, parsing, and analysis is not new in IT. Numerous Unix/Linux utilities and command-line tools could be jerry-rigged to achieve this goal (e.g. rsyslog, syslog-ng, logstash, swatch, grep and zgrep, logrotate, etc.). However, a reliable, scalable, and elegant implementation of these concepts didn’t become a reality until the late 2000s, when the first SIEM products were released to the market [7]. It is worth mentioning that those *nix tools are still alive and heavily used for ingesting logs into modern SIEM platforms.

The idea here was that you could ship logs from operating systems, applications, appliances, and all the tools we discussed above (EPP, EDR, NTA, etc.), pass them through a set of ingest pipelines that parse, enrich, and de-normalize the data, then store them in a central repository where they can be searched against and retained for a specified period of time.
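The parse, enrich, and store steps can be sketched as a tiny pipeline. Everything here is illustrative: the regular expression only handles one simplified sshd message, the `ASSET_OWNERS` lookup table is hypothetical, and a plain Python list stands in for the central repository.

```python
import re
from typing import Optional

# Matches one simplified sshd failure message; real syslog formats vary widely.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) sshd\[\d+\]: "
    r"Failed password for (?P<user>\S+) from (?P<src_ip>\S+)"
)

# Hypothetical enrichment table mapping hosts to asset owners.
ASSET_OWNERS = {"web01": "platform-team"}


def parse(line: str) -> Optional[dict]:
    """Turn a raw log line into named fields, or None if it is not recognized."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


def enrich(event: dict) -> dict:
    """Add context that the raw log line does not carry."""
    return {**event, "owner": ASSET_OWNERS.get(event["host"], "unknown")}


def ingest(lines: list[str], store: list[dict]) -> None:
    """Parse, enrich, and append each recognized line to the central store."""
    for line in lines:
        event = parse(line)
        if event is not None:
            store.append(enrich(event))
```

Once events are stored as structured fields rather than raw text, a search such as "all failed logins for root in the last 90 days" becomes a simple filter.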

Once the data got into your SIEM, you could use a combination of static rules or ML jobs to look for anomalous activity on your endpoints, applications, network devices, e-mail systems, and so on. Consequently, your SOC team was no longer bombarded by e-mail alerts from a plethora of different technologies or having to log in to dozens of different dashboards to conduct investigations. They would have only one place to go: the almighty SIEM!

First-generation SIEM solutions were more focused on log retention and satisfying compliance requirements. Anomaly detection was almost an afterthought, falling behind some of the more sophisticated methods used by EDR vendors. However, times have changed, and modern SIEM solutions offer hundreds of pre-built detection rules, dozens of dashboards, correlation functionality, and machine learning jobs.

Furthermore, you can install EDR-like agents on your endpoints to capture not only event logs, but also process and file information using tools such as Sysmon or Auditd. You can also leverage NTA functionality by ingesting Syslog and NetFlow, or even mirroring your switch ports to a host running Zeek (formerly Bro).

For hosted applications and cloud platforms, you can simply make API calls to pull the latest data, with authentication and pagination and all. My area of expertise is mostly in BELK (Beats, Elasticsearch, Logstash, Kibana), but I know these are all possible on Splunk and other major SIEM solutions, so it’s important to highlight this when talking about the first generations of SIEM versus where we are now in the 2020s.
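A pull-based integration usually boils down to a pagination loop. The sketch below assumes a cursor-style API and takes the page fetcher as a parameter so it can run without a network; `pull_events`, `PageFetcher`, and `fake_fetch_page` are names invented for the example. In a real integration the fetcher would make an authenticated HTTP request.

```python
from typing import Callable, Iterator, Optional

# A page fetcher takes a cursor (None for the first page) and returns
# (events, next_cursor); next_cursor is None when nothing is left.
PageFetcher = Callable[[Optional[str]], tuple[list[dict], Optional[str]]]


def pull_events(fetch_page: PageFetcher, max_pages: int = 1000) -> Iterator[dict]:
    """Yield every event from a cursor-paginated source, one page at a time."""
    cursor = None
    for _ in range(max_pages):  # hard cap so a misbehaving API cannot loop forever
        events, cursor = fetch_page(cursor)
        yield from events
        if cursor is None:
            return


def fake_fetch_page(cursor: Optional[str]) -> tuple[list[dict], Optional[str]]:
    """Stand-in for a real API call; serves two canned pages."""
    pages = {
        None: ([{"id": 1}, {"id": 2}], "p2"),
        "p2": ([{"id": 3}], None),
    }
    return pages[cursor]
```

Calling `list(pull_events(fake_fetch_page))` walks both pages and returns the three events in order.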

Some organizations have taken this a step further and automated their incident response by integrating their SIEM into a SOAR (Security Orchestration, Automation, and Response) tool. These tools simplify and automate certain aspects of SOC response, such as using SIEM alerts to blacklist an attacker’s IP address on the perimeter firewall, create a DNS sinkhole, or isolate an endpoint.
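At its simplest, a SOAR playbook is a mapping from alert types to ordered response steps. The sketch below is a toy version under that assumption: the rule names, the alert fields, and the actions (which only return a description instead of calling a firewall or EDR API) are all hypothetical.

```python
from typing import Callable


def block_ip(alert: dict) -> str:
    # A real action would call the firewall's management API here.
    return f"block {alert['source_ip']} on the perimeter firewall"


def isolate_host(alert: dict) -> str:
    # A real action would call the endpoint agent's isolation API here.
    return f"isolate endpoint {alert['host']}"


# Hypothetical playbooks: which steps run, and in what order, per alert rule.
PLAYBOOKS: dict[str, list[Callable[[dict], str]]] = {
    "brute_force": [block_ip],
    "ransomware_behavior": [isolate_host, block_ip],
}


def respond(alert: dict) -> list[str]:
    """Run the playbook for this alert's rule and return the actions taken."""
    return [step(alert) for step in PLAYBOOKS.get(alert["rule"], [])]
```

Alerts with no matching playbook produce no automated action and would fall back to a human analyst.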

So What Is XDR?

The term eXtended Detection and Response (XDR) is relatively new. It was coined by Nir Zuk of Palo Alto Networks in a keynote speech back in 2018 [8]. The easiest way for me to explain this is by describing a hypothetical dialogue between our favorite security experts, Bob and Alice.

Alice — So what is an XDR?
Bob — It’s like an EDR but it also monitors your network gear and cloud platforms.
Alice — So it’s kind of like a SIEM?
Bob — Kind of… but SIEM doesn’t really focus on endpoints, it doesn’t care about running processes or file paths, it just ingests event logs, syslog, etc.
Alice — Depends on the SIEM though… Most of the modern ones like Splunk, Elastic, and AlienVault have endpoint agents that ingest event logs, syslog, process info, network traffic, and file integrity data, either using built-in modules or by leveraging things like Sysmon, Auditd, osquery, etc.
Bob — True, but they don’t have the endpoint protection component, so it’s all detection only. With XDR, it’s detection AND prevention.
Alice — Yeah but SIEM products have evolved in that area too… so it’s just a matter of time…
Bob — Yes, but they’re not quite there yet. Also, SIEM is better suited for compliance, not detection and response, whereas XDR is better at detecting suspicious activity using ML rules.
Alice — That’s really not the case anymore. Most modern SIEM tools have an extensive library of static detection rules and ML jobs these days.
Bob — What about event correlation? How can you connect suspicious activity by a user or IP address across multiple log sources?
Alice — You can use things like Splunk Common Information Model or Elastic Common Schema (ECS). That way you can map vendor-specific fields to generic ones like user.name, host.domain, etc., and run correlation across all of them. It takes a bit of work, but it’s doable.
Bob — Yeah, but that takes time. Also, SIEM is mostly on-prem, while XDR is SaaS.
Alice — AlienVault is SaaS, then there’s Splunk Cloud and Elastic Cloud, QRadar has a cloud offering too.
Bob — …
Alice — So what really is the difference between SIEM and XDR?
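Alice's point about common schemas is worth a quick illustration before moving on. The sketch below maps vendor-specific field names onto ECS-style names and then looks for users who show up in more than one log source. The vendor field names and the `FIELD_MAPS` table are made up for the example; only the target names (`user.name`, `source.ip`, `host.name`, `event.module`) follow ECS naming.

```python
from collections import defaultdict

# Hypothetical vendor field names mapped to ECS-style common names.
FIELD_MAPS = {
    "firewall": {"usr": "user.name", "src": "source.ip"},
    "vpn": {"login": "user.name", "client_addr": "source.ip"},
    "edr": {"account": "user.name", "hostname": "host.name"},
}


def normalize(source: str, raw: dict) -> dict:
    """Rename vendor-specific fields to common ones and tag the log source."""
    mapping = FIELD_MAPS[source]
    event = {mapping[key]: value for key, value in raw.items() if key in mapping}
    event["event.module"] = source
    return event


def correlate_by_user(events: list[dict], min_sources: int = 2) -> dict[str, set[str]]:
    """Return users seen in at least min_sources distinct log sources."""
    seen: dict[str, set[str]] = defaultdict(set)
    for event in events:
        if "user.name" in event:
            seen[event["user.name"]].add(event["event.module"])
    return {user: srcs for user, srcs in seen.items() if len(srcs) >= min_sources}
```

The correlation step is trivial once every source agrees on what a user is called. The "bit of work" Alice mentions is building and maintaining the mapping table for each log source.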

To answer that question, let’s start with Gartner’s definition of XDR:

“a SaaS-based, vendor-specific, security threat detection and incident response tool that natively integrates multiple security products into a cohesive security operations system that unifies all licensed components.”

Sounds familiar? If you remove “vendor-specific” and “licensed” from the description above, the rest of that statement describes a subset of what any SIEM tool already provides out of the box.

Does any of that really warrant coming up with yet another acronym, making everyone feel even more confused and miserable?

To further clarify things for myself, I started breaking down the main components of each toolset in order to see how much overlap we’re dealing with. I have used a bold font for core components, and italic for the additional features I’ve seen across different products.

A quick glance at the table above shows that SIEM and XDR do in fact have a good amount of overlap. Does that make XDR and SIEM the same? No, at least not yet.

Even though the two categories are very similar, XDR is still closer to the endpoint since it has its roots in EPP and EDR technology. If you look at the brochures and marketing information of XDR products, almost all of them have a mature antivirus/EDR component, but they can integrate with only a handful of infrastructure and cloud platforms. That’s nothing compared to the plethora of integration modules offered by the likes of Splunk, QRadar, and Elastic.

That being said, can you really shame EDR vendors for having a dream? Can you blame them for coming up with a new name that is sexier than SIEM, a term that has sadly become synonymous with “complex” and “costly”?

Probably not.

But do you think an EPP/EDR company can build an open, multi-tenant platform that can scale to hundreds of nodes, store terabytes of data in hot, cold, and frozen tiers, and run a search for an IOC in your Apache logs going back a year and tell you, within seconds, whether you were ever hit by a zero-day web shell attack that was discovered just today?

Can they build something like Logstash that can ingest from Syslog and JDBC to IMAP and Twitter, and parse anything from CEF and CSV to JSON and Java stack traces, even Joe’s loosey-goosey, multiline, custom-written application log that uses a weird timestamp and is generously seasoned with double quotes and special characters?

I doubt it.

And it’s not just the EPP/EDR makers trying to take things to the next level. SIEM vendors have been busy too! They have been moving in the opposite direction, trying to penetrate deeper into the Endpoint market. A prime example of this is Elastic.

They started with detect-only Beats for things like logs, processes, and network connections, but have recently entered the endpoint protection market by acquiring Endgame and releasing Elastic Agent [9]. But would a SIEM vendor be able to detect a hypervisor rootkit or something that has injected itself into your MBR or UEFI partition? Probably not…but who knows!

My takeaway from this research was the following:

The question of whether XDR is a marketing trick or a unique category in the cybersecurity toolset is irrelevant at this point. It is being quickly adopted by the industry as the next best thing, regardless of all the overlaps with SIEM technology.
At least for now, SIEM is still the best bet for larger enterprises or mature MSSPs that have hooks into hundreds of different technologies, manage many heterogeneous environments, and require heavily customized ingestion pipelines, alerting, and dashboards. For small to mid-size organizations with a more limited set of security tools and no dedicated or outsourced security staff, XDR would be the way to go.

Like it or not, the term XDR is here to stay, and it could potentially replace SIEM in the future. In fact, I am already coming across it in the marketing materials of some SIEM vendors. What remains to be seen is whether the XDR products can actually deliver on the promise of painless integration, low noise, better correlation, and more accurate ML jobs.

The industry has talked the talk and invented a new name. Now let’s see if it can walk the walk.

References

[1] https://en.wikipedia.org/wiki/Computer_virus, ‘Computer Virus’
[2] https://cs.stanford.edu/people/eroberts/cs181/projects/2000-01/viruses/anti-virus.html, ‘How Anti-Virus Software Works’
[3] https://arstechnica.com/information-technology/2017/04/the-mystery-of-the-malware-that-wasnt/, ‘Lawyers, malware, and money: The antivirus market’s nasty fight over Cylance’
[4] https://www.gartner.com/imagesrv/media-products/pdf/symantec/symantec-1-4SNI36O.pdf?es_p=6816496, ‘The Evolution of Endpoint Protection’
[5] https://www.garlandtechnology.com/blog/stop-misusing-span-ports-or-risk-losing-network-traffic-data, ‘Stop Misusing SPAN Ports Or Risk Losing Network Traffic Data’
[6] https://www.computerworld.com/article/2521502/tcp-reset--pros-and-cons.html, ‘TCP RESET: Pros and Cons’
[7] https://cybersecurity-magazine.com/a-brief-history-of-siem/, ‘A Brief History of SIEM’
[8] https://www.stratospherenetworks.com/blog/what-is-xdr-your-guide-to-extended-detection-and-response/, ‘What Is XDR? Your Guide to Extended Detection and Response’
[9] https://www.elastic.co/blog/whats-new-elastic-security-7-16-0, ‘Elastic Security 7.16: Accelerate SecOps with the most powerful Elastic Security yet’