Tracking Detection Drift

Series: Measuring the effectiveness of a detection engineering program

Gary Katz
10 min read · Jul 19, 2023

This blog series is based upon excerpts from a book I am writing with Megan Roddie and Jason Deyalsingh.

This blog post will provide a way to automatically identify a subset of false negatives by introducing a concept called detection drift.

Detection drift occurs when a set of detections designed to identify an adversary or TTP becomes less effective as the adversary adapts their procedures. False negatives occur when detections should have created an alert but did not. Identifying when this happens, and by how much, allows an organization to track the quality of its detections for a specific adversary or TTP. It also allows the organization to identify where those detections are now failing and patch the holes in its detection strategy. The approach below lets an organization create metrics that track this drift and proactively patch its detections as adversaries begin to evade them.

To track detection drift, we need a way to identify that the adversary has evaded a subset of our detections. In general, cyber security uses a defense-in-depth approach to stopping an adversary. Multiple layers of detections and other defenses (such as restricted access control, a reduced attack surface, patch management, etc.) force the adversary to evade each layer to achieve their objective. By employing multiple detection strategies, we raise the cost of an attack: an adversary must change all parts of their tooling and infrastructure or risk being identified. This was first widely documented in Lockheed Martin's Intelligence-Driven Defense paper, which outlined the Cyber Kill Chain. The paper also demonstrated how to use the kill chain to forensically analyze an attack, identifying earlier stages of the kill chain and artifacts that could be used to identify later stages of activity, as shown below. By creating or updating detections, a SOC could continue to detect an adversary who had only partially changed their tools and infrastructure.

Kill chain analysis can be used to evaluate defensive performance as the adversary moves horizontally across the MITRE ATT&CK matrix. We can also evaluate our drift vertically across multiple layers of detections for the same technique. A single technique could be covered by multiple detections at one or more levels, as shown in the Pyramid of Pain. For example, we may have network signatures identifying the C2 protocol of a malware family, file signatures identifying the tooling, and behavioral rules identifying the activity performed with those tools.

Whether analyzing the procedures used by an adversary across the phases of a kill chain or the abstraction layers of a single capability, it is important to remember that this analysis occurred at a point in time. As an adversary evolves their procedures to evade defenders, not all detections continue to be effective. Ideally, we'd like a way to track not just what TTP a rule is detecting, but also how many alerts should fire together. This would track the continued effectiveness of our detections and prioritize future work for the detection engineering team. Detection drift is the difference between the detections that should have detected the activity and those that actually did.

This is not as simple as it seems. Despite creating multiple detections based upon our understanding of the adversary or technique, we would not expect all detections to hit each time. For example, if we identified two tools used by adversaries when performing a particular technique, we could write YARA signatures to identify those tools in the future. We would expect only one of these tools to be used in an attack, and thus only one signature to hit if the adversary deployed this technique. If the adversary used a new tool, though, which did not hit against either signature, we would have a gap in coverage and our detections would have drifted from current adversary methods. This coverage gap might be difficult to detect. Suppose, for example, that the adversary's new tool continued to use a shared library employed by tool number two, for which we had also built a detection. The SOC analyst would still receive an alert for the shared library but might not recognize that a second expected detection at the tool level did not occur, requiring an update by the detection engineer.

Figure 1: In the above example, Detection 1 should only alert on its own, while Detections 2 and 3 should always alert together. Detection 3 alerting without Detection 2 indicates detection drift.
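As a minimal sketch of the idea behind Figure 1 (the detection names and the "should fire together" mapping below are illustrative, not taken from a real rule set), we can record which detections are expected to co-fire and flag any missing companions:

```python
# Mirroring Figure 1: Detection 1 alerts on its own, while Detections 2 and 3
# are expected to always alert together.
EXPECTED_TOGETHER = {
    "Detection 1": set(),
    "Detection 2": {"Detection 3"},
    "Detection 3": {"Detection 2"},
}

def missing_companions(fired: set[str]) -> dict[str, set[str]]:
    """For each detection that fired, list expected companions that did not."""
    return {
        det: missing
        for det in fired
        if (missing := EXPECTED_TOGETHER[det] - fired)
    }

# Detection 3 fires without Detection 2 -> evidence of detection drift.
print(missing_companions({"Detection 3"}))
# {'Detection 3': {'Detection 2'}}
```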

These variations can be broken into three categories, visualized in the figure below: a group has its own detection for that category, a detection spans multiple groups, or no detection exists for a subset of groups. A group can be defined based upon your needs. Each procedure for a technique can be its own group, or a group can represent a common or necessary attack path taken by an adversary, identified using kill chain analysis. A group is simply a representation of the set of detections that should fire together, wherever you want to track detection drift.

During the Investigate phase of the detection engineering lifecycle, we can map capabilities to their associated groups. An abstraction map, introduced in SpecterOps' Capability Abstraction blog post, is a tool that can be used to track these relationships and will help demonstrate how we can track detection drift. The post is linked at the bottom of this article and is an excellent way to understand abstraction maps. We provide a short explanation of the concept below, but the post walks through a detailed example if you are interested.

The abstraction map, an example of which is shown in Figure 2 below, works by breaking the capability into a subset of groups and peeling back the capability to identify different layers of abstraction. As each layer is investigated, the associated artifacts are captured and mapped across the groups that they apply to. These artifacts can then be used to create the associated detections. An artifact detail can be missing either because it is not applicable or because the investigation was incomplete and did not uncover it.

Figure 2: Abstraction map with 5 layers

A generalized version of the abstraction map is shown below in Figure 3.

Figure 3: A generalized abstraction map

An abstraction map can be useful in the Investigate phase of our detection engineering lifecycle. For identifying detection drift, i.e. false negatives, we use it to easily determine the minimum number of detections that should fire under different circumstances. In the example above, if a detection was created for each artifact detail, the abstraction map can be broken into 12 detections, with anywhere from three to five detections alerting depending on the attacker's approach. Moreover, for the same detection, such as Network Protocol 1, in some circumstances four other detections would alert, while in other cases only two would.

To calculate the detection drift, we can add a tag to each of these detections with the minimum number of detections (MD) that should occur with that detection for the associated capability grouping (in this example, the capability grouping is Technique T1234). When any new alerting occurs related to one or more of these detections, we can use these tags to calculate the detection drift. An example of an abstraction map with tags applied is shown in the figure below.

Figure 4: Example of detection tags
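How the MD tag is physically attached to a rule will depend on your tooling. As one illustrative possibility (the tag format and values here are assumptions, not taken from Figure 4), it could be encoded alongside a rule's existing tags and parsed when the alert arrives:

```python
# Hypothetical encoding: "<capability_group>:MD=<n>" added to a rule's tag list.
rule_tags = ["attack.t1234", "T1234:MD=5"]

def parse_md_tag(tags: list[str], capability_group: str) -> int | None:
    """Extract the MD value for a capability grouping from a rule's tag list."""
    prefix = f"{capability_group}:MD="
    for tag in tags:
        if tag.startswith(prefix):
            return int(tag[len(prefix):])
    return None

print(parse_md_tag(rule_tags, "T1234"))  # 5
```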

To calculate the minimum detection drift, we take the maximum of the minimum detection tags across the alerts that fired and subtract the number of distinct detections that actually alerted. The formula for this is shown below.

Formula for calculating the minimum detection drift
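The formula image from the original post is not reproduced here; a reconstruction consistent with the worked examples that follow is:

$$\text{Minimum Detection Drift} \;=\; \max_{d \in A} MD_d \;-\; |A|$$

where A is the set of distinct detections in the capability grouping that produced alerts, and MD_d is the minimum detection tag on detection d.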

Example 1:

This first example assumes that all detections triggered at all abstraction layers for Group 1.

Table 1: Group 1 without detection drift

Taking the minimum detection count in Table 1, we can calculate detection drift using the previously-defined formula.

In this most basic example, there are a maximum of 5 potential detections and all 5 occurred, resulting in zero detection drift.

Example 1 with Drift:

To observe what detection drift looks like, let’s assume that the RPC 1 and Win API Function 1 detections did not fire, resulting in the modified table shown in Table 2.

Table 2: Group 1 with detection drift

Re-performing the calculation identifies that a minimum of two detections did not result in alerts, revealing how detection drift shows potential gaps in detection.
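As a minimal sketch of this calculation (the detection names and MD tags below mirror the Group 1 walkthrough, but the exact values from the figures are not reproduced here, so treat them as illustrative):

```python
def minimum_detection_drift(fired: dict[str, int]) -> int:
    """Minimum detection drift for one capability grouping.

    `fired` maps each distinct detection that produced an alert to its
    minimum detection (MD) tag.
    """
    if not fired:
        return 0  # nothing alerted, so no drift can be inferred from alerts alone
    return max(fired.values()) - len(fired)

# Example 1, no drift: all five Group 1 detections fire, each tagged MD=5.
group1 = {
    "Network Protocol 1": 5,
    "RPC 1": 5,
    "Win API Function 1": 5,
    "Registry 1": 5,
    "Tool 1": 5,
}
print(minimum_detection_drift(group1))  # 0

# Example 1 with drift: RPC 1 and Win API Function 1 do not fire.
drifted = {k: v for k, v in group1.items()
           if k not in ("RPC 1", "Win API Function 1")}
print(minimum_detection_drift(drifted))  # 2
```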

Example 2:

In Example 2, we will use Group 4 from the abstraction map, which contains a gap in coverage in the Registry layer.

Example 2 without Drift:

Table 3 shows the detections that did fire and the associated tags in Group 4 without coverage for the Registry Layer.

Table 3: Group 4 without detection drift

Once again, we observe no detection drift: every detection we expected to fire for this capability group did fire. Next, let's see what happens if one of the above detections doesn't fire.

Example 2 with Drift:

Table 4 assumes that the same Group 4 capabilities are being analyzed, but this time the Network Protocol 2 detection does not trigger.

Table 4: Group 4 with detection drift

In this second example, we don't know whether two detections failed to fire or more, but we do know that something changed in the adversary's behavior, causing at least a subset of the detections that should have hit not to fire.

In summary, these examples demonstrate how the concept of detection drift can highlight where and how changes in adversary behavior affect the coverage of your detections.

Adapting to account for automated response

One complexity with this approach is that many detection devices now have the ability to respond to an attack in addition to detecting it. In these circumstances, detections that would occur after the response must be accounted for.

This is similar to the modified approach that would support kill chain analysis. In both circumstances, detections that would occur after the point where the attack is expected to be stopped are not included in the minimum detection count.

In the abstraction map below, we assume that an EDR would stop the attack prior to seeing network activity. Notice the updated MD values for detections occurring prior to the EDR response and that the value remains unchanged for the network protocol detection.

Figure 5: Tagged abstraction map with EDR Response
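One way this adjustment could be automated is sketched below. The layer ordering, detection names, and the rule that pre-response detections count only the detections expected up to the response point are assumptions based on the description above, not values taken from Figure 5:

```python
# Illustrative order in which the attack progresses through the abstraction
# layers for one group; the EDR is assumed to respond before network activity.
LAYER_ORDER = ["Tool", "Win API Function", "RPC", "Registry", "Network Protocol"]

def adjusted_md_tags(detections_by_layer: dict[str, list[str]],
                     response_layer: str) -> dict[str, int]:
    """Recompute MD tags when an automated response is expected to stop the
    attack after `response_layer`.

    Detections at or before the response layer only count detections expected
    up to that point; later detections keep the full count, because seeing
    them implies the response did not actually stop the attack.
    """
    cutoff = LAYER_ORDER.index(response_layer)
    pre_response = sum(len(detections_by_layer.get(layer, []))
                       for layer in LAYER_ORDER[:cutoff + 1])
    total = sum(len(dets) for dets in detections_by_layer.values())

    tags = {}
    for idx, layer in enumerate(LAYER_ORDER):
        for det in detections_by_layer.get(layer, []):
            tags[det] = pre_response if idx <= cutoff else total
    return tags

# Group 1 with the EDR responding after the Registry layer: host-level
# detections receive a reduced MD of 4, while Network Protocol 1 keeps MD=5.
group1_layers = {
    "Tool": ["Tool 1"],
    "Win API Function": ["Win API Function 1"],
    "RPC": ["RPC 1"],
    "Registry": ["Registry 1"],
    "Network Protocol": ["Network Protocol 1"],
}
print(adjusted_md_tags(group1_layers, "Registry"))
```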

Tracking detection drift is one approach for identifying when an organization's detections are becoming less effective. These calculations give an organization the ability to repair its defenses by automatically identifying changes in adversary behavior that are no longer detected. There are several limitations to implementing detection drift that should be recognized.

1. Many detection engineers will include multiple artifacts within the same detection rule. If those artifacts have different minimum detection scores, it becomes difficult to automatically output the MD value from within the alert.

2. Mapping detections to an abstraction map or similar matrix is a time-consuming effort. This limits the ability to calculate detection drift to high-impact techniques or attackers where such analysis is justified.

3. Not all alerts are sent to the SOC at the same instant, so this calculation cannot be performed accurately in real time. Instead, it must be performed once all alerts have arrived, either as a scheduled or on-demand query against the alert data (a sketch of such a query follows this list).

4. A security device may send multiple alerts for the same detection. When building your query, make sure that the detection IDs are distinct.

5. The calculation does not distinguish between detections that are easy for the adversary to adjust around (such as a hash) and detections that are more difficult to avoid. If the detections are based upon artifacts lower in the Pyramid of Pain that are easy for the adversary to change, detection drift may occur too frequently to be actionable, and you may decide to purposefully exclude those artifacts from the calculation.
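As a minimal sketch of such a scheduled (or on-demand) query addressing limitations 3 and 4, assuming a flat alert table with hypothetical columns incident_id, capability_group, detection_id, and md_tag (none of these names come from the original post), using pandas:

```python
import pandas as pd

def drift_per_group(alerts: pd.DataFrame) -> pd.Series:
    """Compute minimum detection drift per incident and capability group.

    Expects columns: incident_id, capability_group, detection_id, md_tag.
    Duplicate alerts for the same detection are collapsed so each distinct
    detection is counted only once.
    """
    deduped = alerts.drop_duplicates(
        subset=["incident_id", "capability_group", "detection_id"]
    )
    grouped = deduped.groupby(["incident_id", "capability_group"])
    return grouped["md_tag"].max() - grouped["detection_id"].nunique()

# Example run once all alerts for an incident have arrived.
alerts = pd.DataFrame([
    {"incident_id": 1, "capability_group": "T1234",
     "detection_id": "Network Protocol 1", "md_tag": 5},
    {"incident_id": 1, "capability_group": "T1234",
     "detection_id": "Registry 1", "md_tag": 5},
    {"incident_id": 1, "capability_group": "T1234",
     "detection_id": "Registry 1", "md_tag": 5},  # duplicate alert, collapsed
])
print(drift_per_group(alerts))  # drift of 3 for incident 1, T1234
```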

Conclusion:

Tracking and calculating detection drift allows an organization to identify when its detections are becoming less effective. These calculations give an organization the ability to repair its defenses by automatically identifying changes in adversary behavior that evade existing detections. While the calculations can be automated, there are limitations to this approach. The analysis to tag the detections can be time consuming, but most of that time is spent on the investigative process that the detection engineer should already be performing. I am interested in other ways to perform similar analysis or improve this method, so if you have ideas, please reach out.

Further Reading:

If you want to truly understand abstraction maps, the article below provides a full walk-through example.
