Building a Detection Engine Part 1 — What is a Detection Engine?

Nathan Burns
7 min read · Jul 12, 2024


During my relatively short time as a Detection Engineer, I’ve interacted with numerous Detection Engines and I’ve always wondered how these systems work. Outside of a few open-source examples, it’s a rarely discussed topic, which surprised me given their importance.

Well, I’m a firm believer that in order to truly understand something you need to get your hands dirty and try to build it yourself. Welcome to the first part of an ongoing series where I attempt to build my own custom Detection Engine.

What is a Detection Engine?

While the term “Detection Engine” has no widely-accepted definition, I’ve come to see it as a system that inputs data, analyzes it, makes a decision, and then outputs an artifact, say an alert or notification.

Figure 1: Detection Engine Overview

To expand further, a Detection Engine should also be able to:

  • Load internal and external rulesets that drive the Rule Engine.
  • Be highly performant, given how important time is when detecting a potential threat.
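To make that definition a bit more concrete, here is a minimal Python sketch of the input, analyze, decide, output loop from Figure 1. The `Event`, `Rule`, and `Alert` names and the sample rule are placeholders I made up for illustration; this is not how any particular EDR tool structures its engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

# Hypothetical event shape, e.g. {"process": "reg.exe", "command_line": "..."}
Event = dict


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]  # the "decision" step


@dataclass
class Alert:
    rule_name: str
    event: Event


def run_engine(events: Iterable[Event], rules: list[Rule]) -> Iterator[Alert]:
    """Take events as input, check each against every rule, output alerts."""
    for event in events:
        for rule in rules:
            if rule.matches(event):
                yield Alert(rule_name=rule.name, event=event)


# A single illustrative rule: flag attempts to save the SAM hive with reg.exe.
rules = [
    Rule(
        name="SAM hive dump via reg.exe",
        matches=lambda e: "reg save hklm\\sam" in e.get("command_line", "").lower(),
    )
]

events = [
    {"process": "notepad.exe", "command_line": "notepad.exe notes.txt"},
    {"process": "reg.exe", "command_line": "reg save HKLM\\SAM C:\\Temp\\sam.save"},
]

alerts = list(run_engine(events, rules))
for alert in alerts:
    print(f"ALERT [{alert.rule_name}]: {alert.event['command_line']}")
```

Writing the engine as a generator means alerts are produced as events arrive rather than after the whole input has been read, which matters given how important time is here.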

Every single Endpoint Detection and Response (EDR) tool uses Detection Engines to detect potential threats. Most use a mix of Rule-based engines and more complex Machine Learning, Heuristics, or Reputation-based engines to alert on everything from commodity malware families to more advanced threats leveraging LOLBins and attempted EDR bypasses.

Heuristic-based Engine

Let’s take a look at how EDR tools leverage Detection Engines. Unfortunately, the internal workings of EDR tools are a closely guarded secret. As such, the following section is based solely on my working experience and my interpretation of how they may use Detection Engines.

One day you get tired of Adobe’s licensing scheme and decide to download a pirated version of Adobe Photoshop. Uh oh! It was infected with RedLine, a common information-stealing malware strain. When you downloaded this executable, the EDR agent analyzed the file as it was written to disk to determine whether action should be taken. Depending on its risk level (was it downloaded from an unusual website, does it contain suspicious strings, etc.), it may be executed in a sandbox to glean insights into its runtime behavior. During this analysis, it was discovered that this pirated version of Photoshop attempts to dump the Security Account Manager (SAM) registry hive, something that is extremely suspicious. As such, the EDR tool removes the pirated version of Photoshop and issues an alert to notify the SOC.

What just happened was an example of a Heuristics-based Detection Engine. The suspicious program was the input; a Heuristics Engine, rather than the Rule Engine depicted in Figure 1, processed that input; and the artifact was the alert issued to the SOC.

There are three different types of heuristic engines: Static, Dynamic, and Hybrid, which is a combination of the two. Static engines analyze software without executing it. This means analyzing file headers, strings, or disassembling the program to identify suspicious content. Dynamic engines actually execute the file, typically in a separate environment, and analyze its behavior to determine if it is suspicious or not. VirusTotal is an excellent example of a hybrid heuristic engine: it provides information on static features (located under the Details tab) such as portable executable information, headers, memory sections, imports, and more, while also providing information on dynamic features (under the Behavior tab) such as network information, file system actions, registry actions, process interactions, and more.
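As a rough illustration of the static side, the sketch below pulls printable strings out of a file's bytes and adds up a score for each suspicious one it finds. The indicator strings, weights, and threshold are values I picked purely for the example; real engines are far more involved and their internals aren't public.

```python
import re

# Hypothetical indicators and weights, chosen only for illustration.
SUSPICIOUS_STRINGS = {
    b"hklm\\sam": 50,
    b"sekurlsa": 40,
    b"virtualallocex": 20,
}
ALERT_THRESHOLD = 50  # assumed cut-off for "suspicious enough"

# Runs of 6 or more printable ASCII bytes, similar in spirit to `strings`.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def static_score(data: bytes) -> int:
    """Score a file without executing it, based on the strings it contains."""
    score = 0
    for found in PRINTABLE_RUN.findall(data):
        lowered = found.lower()
        for needle, weight in SUSPICIOUS_STRINGS.items():
            if needle in lowered:
                score += weight
    return score


# Fake "file contents": a few binary bytes with readable strings embedded.
sample = (
    b"\x00\x01MZ\x90\x00"
    + b"reg save HKLM\\SAM out.hiv"
    + b"\x00\xff"
    + b"VirtualAllocEx\x00"
)
benign = b"\x00\x01MZ\x90\x00" + b"Hello, world!\x00"

for name, blob in (("sample", sample), ("benign", benign)):
    verdict = "suspicious" if static_score(blob) >= ALERT_THRESHOLD else "clean"
    print(name, static_score(blob), verdict)
```

A dynamic engine would instead run the file in an isolated environment and score what it does, and a hybrid engine would combine both signals.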

Rule-based Engine

What happens if the EDR tool failed to detect the malicious executable when it was written to disk? When the executable was launched, other Detection Engines, such as a Rule-based or ML-based engine, would have been used. The latter is typically very complex, so I’ll be leaving it out of this initial blog post. Let’s jump into what a Rule-based engine does.

Every EDR tool contains Out Of The Box (OOTB) rules. Traditionally known as signatures, these are rules that the developer of the EDR tool has written to detect known threats. These rules are built into the tool and are available to all customers. Most tools also allow customers to create custom rules to aid in detecting scenarios that the OOTB rules failed to detect. For example, CrowdStrike Falcon enables the creation of RegEx-based Indicator of Attack (IOA) rules while SentinelOne allows the creation of SentinelOne Query Language (S1QL) based Storyline Active Response (STAR) rules.

If we re-examine our pirated Photoshop example from earlier, we’d hope the EDR tool contained a signature that was able to successfully detect this stealer. Given how expensive these tools are, it should be able to pick up commodity malware no problem, right? This isn’t always the case, and if our EDR tool of choice can’t detect this stealer, the SOC (or Detection Engineering team, if you have one) would be called to create a custom detection to alert on its behavior.

Note: If your over-six-figure EDR tool can’t detect RedLine with OOTB rules, it might be time to consider moving platforms, but that’s neither here nor there.
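To give a feel for what a custom regex-based rule could look like, here is a small Python sketch that loads a ruleset from JSON and evaluates an event against it. The JSON schema, field names, and pattern are entirely hypothetical; this is not the format used by CrowdStrike, SentinelOne, or any other vendor.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class RegexRule:
    name: str
    field: str           # which event field to inspect
    pattern: re.Pattern  # compiled once at load time, not per event


def load_rules(ruleset_json: str) -> list[RegexRule]:
    """Parse a JSON ruleset (hypothetical schema) into compiled rules."""
    return [
        RegexRule(r["name"], r["field"], re.compile(r["pattern"], re.IGNORECASE))
        for r in json.loads(ruleset_json)
    ]


def evaluate(event: dict, rules: list[RegexRule]) -> list[str]:
    """Return the names of all rules whose pattern matches the event."""
    return [
        rule.name
        for rule in rules
        if rule.pattern.search(str(event.get(rule.field, "")))
    ]


# An example "external" ruleset. Backslashes are escaped once for JSON,
# so the regex the engine sees is: reg(\.exe)?\s+save\s+hklm\\sam\b
RULESET = r"""
[
  {
    "name": "Registry hive dump of SAM",
    "field": "command_line",
    "pattern": "reg(\\.exe)?\\s+save\\s+hklm\\\\sam\\b"
  }
]
"""

rules = load_rules(RULESET)
event = {"command_line": "reg.exe save HKLM\\SAM C:\\Users\\Public\\sam.hiv"}
print(evaluate(event, rules))
```

Keeping the rules as data rather than code is what lets a team add or change detections without touching the engine itself.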

Rule-based Engines, while simple to get up and running, do come with some important limitations:

  • Potential for False Positives: A Rule-based approach means the Detection Engine has no underlying context behind what just occurred. A legitimate security or backup related program may have been reading the SAM registry hive and our rule would have triggered, leading to a false positive. We’d need to either narrow the scope of our detection and raise the possibility of missing important events, or keep it broad while maintaining an extensive list of known-good programs to exclude from our rule.
  • Difficult to Maintain: As a team continues to create more and more rules, it becomes increasingly difficult to manage a growing ruleset. How do you effectively review each rule for quality issues? What about testing the rule to ensure it works? This issue has become so prominent that an entire “as Code” phrase has been coined to describe a system that focuses on maintaining detections in a scalable way: “Detections as Code”. If you’re interested in how a Detections as Code pipeline is created, I highly recommend checking out David French’s series “From Soup to Nuts: Building a Detections-as-Code Pipeline”.
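Both limitations can be sketched in a few lines of Python: a rule that carries an allowlist of known-good programs, plus small tests that could run in a pipeline every time the rule changes. The event fields, the allowlisted path, and the test names are assumptions for this example only.

```python
# Hypothetical allowlist of programs that are expected to read the SAM hive.
KNOWN_GOOD_PROCESSES = {
    r"c:\program files\examplebackup\backup.exe",
}


def sam_access_rule(event: dict) -> bool:
    """Alert on SAM hive access unless the process is on the allowlist."""
    target = event.get("registry_path", "").lower()
    process = event.get("process_path", "").lower()
    touches_sam = target == r"hklm\sam" or target.startswith("hklm\\sam\\")
    return touches_sam and process not in KNOWN_GOOD_PROCESSES


# Tests in the style a Detections as Code pipeline might run on each change.
def test_unknown_process_alerts():
    assert sam_access_rule({
        "registry_path": r"HKLM\SAM",
        "process_path": r"C:\Users\bob\Downloads\photoshop.exe",
    })


def test_allowlisted_backup_is_ignored():
    assert not sam_access_rule({
        "registry_path": r"HKLM\SAM\SAM\Domains",
        "process_path": r"C:\Program Files\ExampleBackup\backup.exe",
    })


def test_unrelated_key_is_ignored():
    assert not sam_access_rule({
        "registry_path": r"HKLM\SOFTWARE\Microsoft",
        "process_path": r"C:\Users\bob\Downloads\photoshop.exe",
    })
```

The trade-off is visible in the allowlist: every entry removes a false positive but is also one more thing an attacker could try to impersonate, so the list needs the same review and testing as the rule.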

On Host vs Off Host Engines

All of the examples up to this point have revolved around using an EDR tool. While this is a common means of implementing a Detection Engine, it isn’t the only way. Security Information and Event Management (SIEM) platforms can also be considered Detection Engines, and in some cases may be more robust than EDR-based ones.

Let’s take Splunk as an example. For those not familiar with Splunk, in the most basic of terms it’s a platform that allows you to send and index data for searching. It enables the creation of Scheduled Searches (and also more advanced Correlation Searches, if you use Splunk Enterprise Security) that run on a recurring basis and perform actions, such as sending an email or posting a Slack message, when the search finds a match. This is another form of a Rule-based Detection Engine where the rule is the Scheduled Search, the input is the data being ingested into Splunk, and the artifact is the action configured to take place once the search finds a result.
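The scheduled search idea can be sketched in plain Python without touching Splunk at all. In the example below, the `index` list stands in for indexed data, the predicate stands in for the search, and the `action` callback stands in for the email or Slack message; all of these names are my own and none of them are Splunk APIs.

```python
from datetime import datetime, timedelta
from typing import Callable


def run_scheduled_search(
    index: list[dict],
    predicate: Callable[[dict], bool],
    action: Callable[[list[dict]], None],
    now: datetime,
    lookback: timedelta,
) -> int:
    """Search events from the last `lookback` period and fire `action` on hits."""
    earliest = now - lookback
    hits = [e for e in index if earliest <= e["time"] <= now and predicate(e)]
    if hits:
        action(hits)
    return len(hits)


# Stand-in for indexed data.
now = datetime(2024, 7, 12, 12, 0)
index = [
    {"time": datetime(2024, 7, 12, 9, 0), "command_line": "reg save HKLM\\SAM old.hiv"},
    {"time": datetime(2024, 7, 12, 11, 50), "command_line": "reg save HKLM\\SAM new.hiv"},
    {"time": datetime(2024, 7, 12, 11, 55), "command_line": "notepad.exe notes.txt"},
]

# Stand-in for an alert action such as an email or chat message.
sent_messages: list[str] = []


def notify(hits: list[dict]) -> None:
    sent_messages.append(f"{len(hits)} event(s) matched the SAM dump search")


# One "run" of the schedule; a real scheduler would call this every interval.
hit_count = run_scheduled_search(
    index,
    predicate=lambda e: "hklm\\sam" in e["command_line"].lower(),
    action=notify,
    now=now,
    lookback=timedelta(minutes=15),
)
print(hit_count, sent_messages)
```

The lookback window is the design choice to watch: if it is shorter than the schedule interval, events fall through the gap, and if it is longer, the same event can alert twice.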

Let’s call this type of Detection Engine “Off Host” as the processing is done off of the host machine and on the Splunk platform. Off Host Detection Engines come with a few notable drawbacks compared to their “On Host” counterparts:

  • Time Penalty: As the processing is happening Off Host, there will naturally be some latency involved. Off Host engines will never be as fast as their On Host counterparts.
  • Lack of Prevention: As On Host engines are hooked into the Operating System, they can perform advanced actions that their Off Host counterparts can’t, such as preventing the execution of a program. Off Host engines can only perform retroactive actions such as disabling an account or network isolating a device.

Regardless of these drawbacks, Off Host engines do provide some improvements over On Host engines:

  • Reduced Resource Utilization: While On Host engines try to keep resource utilization to a minimum, even an agent that consumes only 5% CPU adds up quickly when it is deployed across thousands of devices. As the processing for Off Host engines is handled by an external server and not the client, the performance impact on the client is vastly reduced. For a worst-case scenario, see a recent incident where the CrowdStrike Falcon sensor had an issue that caused it to reach 100% CPU utilization (don’t test in prod, folks!).
  • Advanced Correlation: As Off Host engines have access to data from multiple sources they can correlate activities from multiple environments. Let’s say you wanted to automatically check the sign in logs for an identity after they executed a low-prevalence executable. An Off Host engine could correlate the SaaS-based sign in logs with the On Host device based logs.
  • Historical Analysis: Off Host engines store and process historical data. This allows for retrospective analysis of a potential threat and the detection of long-term adversaries that may not take quick, successive actions.
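The correlation example above can be sketched as a simple join between two log sources. The field names, the prevalence threshold, and the one-hour window (applied on either side of the execution) are assumptions I chose for illustration.

```python
from datetime import datetime, timedelta


def correlate(process_events, signin_events, max_prevalence=5,
              window=timedelta(hours=1)):
    """Pair each low-prevalence execution with the same user's nearby sign-ins."""
    findings = []
    for proc in process_events:
        if proc["prevalence"] > max_prevalence:
            continue  # widely seen executable, not interesting here
        related = [
            s for s in signin_events
            if s["user"] == proc["user"]
            and abs(s["time"] - proc["time"]) <= window
        ]
        findings.append(
            {"user": proc["user"], "file": proc["file"], "signins": related}
        )
    return findings


t0 = datetime(2024, 7, 12, 9, 0)

# On Host source: process executions with a prevalence count per file.
process_events = [
    {"user": "alice", "file": "photoshop_crack.exe", "prevalence": 1, "time": t0},
    {"user": "bob", "file": "chrome.exe", "prevalence": 9000, "time": t0},
]

# SaaS source: identity provider sign-in logs.
signin_events = [
    {"user": "alice", "country": "US", "time": t0 - timedelta(minutes=30)},
    {"user": "alice", "country": "RO", "time": t0 + timedelta(minutes=20)},
    {"user": "alice", "country": "US", "time": t0 + timedelta(hours=5)},
    {"user": "bob", "country": "US", "time": t0},
]

findings = correlate(process_events, signin_events)
for f in findings:
    countries = [s["country"] for s in f["signins"]]
    print(f["user"], f["file"], countries)
```

An On Host agent only sees the first list, which is why this kind of rule naturally lives Off Host where both sources are already collected.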

Off Host and On Host engines, while different, complement each other nicely. On Host engines can be reserved for the most critical rules that must be able to prevent matched activity, while Off Host engines can run more resource-intensive rules that correlate activity between multiple sources.

Final Thoughts

To sum it all up, a Detection Engine is a system that takes data as input, processes the data, makes a decision, and then outputs an artifact if the processing finds a match or a threshold is reached. We covered two types of Detection Engines, Rule-based and Heuristics-based. Rule-based engines use signatures to detect known threats, while Heuristics-based engines use the characteristics and behavior of a program to detect suspicious events. Detection Engines can either be On Host, meaning the processing is done on the client, or Off Host, where the processing happens on an external server.

In the next part we’ll start diving into creating our Detection Engine. We’ll be outlining what tools we’ll use to get it up and running, what type of engine we’ll initially create, and diagram out what we’ll be building. If we’re lucky we may even start building it!

As always, thanks for reading! If you have any questions, comments, or concerns feel free to reach out to me on Twitter, LinkedIn, or leave a comment on this post!

References

  1. Koret, J., & Bachaalany, E. (n.d.). The Antivirus Hacker’s Handbook. O’Reilly Online Learning. https://learning.oreilly.com/library/view/the-antivirus-hackers/9781119028758/c09.xhtml#:~:text=Heuristic%20Engine%20Types

Nathan Burns

Detection Engineer interested in analyzing cloud-based attacks, sharing knowledge, and developing custom solutions to hard problems.