ATT&CK Evaluations Site Update: Round 2 Methodology and Technique Comparison Tool

Frank Duff
May 30, 2019

In our continued effort to evolve ATT&CK Evaluations, we are happy to share details about Round 2 as well as a new technique comparison tool that we added in our latest site update.

Round 2 Methodology

First, we have released additional information about how we are performing Round 2 (APT29 emulation), including new detection categories and technique scope. These details will ensure all vendors are equipped with the same information as they make decisions on whether or not to participate. Additionally, they provide end-users a preview of what to expect of the Round 2 results.

Detection Categories

Detection categories are a mechanism to normalize detections across vendors despite differences in detection approaches and user interfaces. While we hope the Round 1 detection categories enabled easier discussion across vendors, we understand they caused some confusion, including about the different types of delays, “None” categories that contained notes, Tainted, and Specific Behavior vs. Enrichment. To clarify these points, we drew on community feedback about the Round 1 categories, focused on the core intention of each category, and revised the categories to articulate the differences between detections more clearly.

We have defined a new hierarchy of detection categories, where detections provide more detailed information about actions performed as you move across the categories from left to right.

There are several key points we want to highlight about these categories (full definitions are available here):

  • Tactic and Technique: To better convey the types of behavior-based detection beyond the General Behavior or Specific Behavior categories used in Round 1, we have created the Tactic and Technique categories. These categories relate the type and level of information provided to an analyst in a way that’s more relatable to ATT&CK. Tactic detections address the adversary’s potential intent, while Technique detections address how that behavior was performed.
  • General: This category will cover non-specific behavior detections, such as “suspicious action performed,” that do not provide additional detail. It is defined differently from the Round 1 category known as General Behavior.
  • Telemetry: We have maintained this category from our Round 1 categories.
  • MSSP: We have created a distinct category for managed security service providers (MSSP) given the unique way a detection is performed with humans in the loop.
  • None: The None category now has two modifiers: one for residual artifacts, and another for host interrogation. We added these modifiers because, while they don’t qualify as detections in the context of our evaluations, both provide useful information about capabilities that normally require additional end-user analysis to determine maliciousness.
  • Delayed and Configuration Change: We have further broken out Delayed and Configuration Change to create greater distinction between the types.
  • Correlated: We have renamed the Round 1 modifier known as Tainted to Correlated to better reflect the intent of the detection type.
  • Alert: We have created an Alert modifier category to better separate the context of a detection from how the information is presented to a user.
  • Innovative: We have created a new modifier category called Innovative to highlight accurate and robust approaches that bring value and deeper insight to consumers based on the Evaluation Team’s judgment. Not all techniques will have an Innovative designation, but we added this because we felt it was important for us to have the ability to highlight detections we feel go “above and beyond.”

With these changes, we hope our results will better capture the unique abilities each of these vendor solutions provides. You will note in our graphic that we have ordered detections so that the level of information increases from left to right. We feel this is important to capture because some detection types contain additional context, and that matters to end-users. Though we are ordering detection categories, we still do not plan to provide scores, since we still believe that what is “most valuable” depends on each organization’s individual needs.
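As a rough illustration only (not part of the official methodology), the main categories and modifiers described above could be modeled as a small data structure. The left-to-right order used here is an assumption, since the graphic is not reproduced in this text, and the class and function names are hypothetical. The position is a place in the ordering, not a score.

```python
from dataclasses import dataclass
from enum import Enum


class MainCategory(Enum):
    # ASSUMPTION: definition order mirrors the left-to-right order of the
    # graphic (least to most detail); the text does not spell this order out.
    NONE = "None"
    TELEMETRY = "Telemetry"
    MSSP = "MSSP"
    GENERAL = "General"
    TACTIC = "Tactic"
    TECHNIQUE = "Technique"


class Modifier(Enum):
    ALERT = "Alert"
    CORRELATED = "Correlated"  # named Tainted in Round 1
    DELAYED = "Delayed"
    CONFIGURATION_CHANGE = "Configuration Change"
    INNOVATIVE = "Innovative"
    RESIDUAL_ARTIFACT = "Residual Artifact"
    HOST_INTERROGATION = "Host Interrogation"


# The two modifiers the post attaches specifically to the None category.
NONE_MODIFIERS = frozenset(
    {Modifier.RESIDUAL_ARTIFACT, Modifier.HOST_INTERROGATION}
)

# Enum iteration follows definition order.
LEFT_TO_RIGHT = list(MainCategory)


@dataclass(frozen=True)
class Detection:
    category: MainCategory
    modifiers: frozenset = frozenset()

    def __post_init__(self):
        # Reject None-only modifiers on any other main category.
        misplaced = self.modifiers & NONE_MODIFIERS
        if misplaced and self.category is not MainCategory.NONE:
            names = sorted(m.value for m in misplaced)
            raise ValueError(f"{names} only apply to the None category")


def position(category: MainCategory) -> int:
    """Return the category's place in the ordering (an index, not a score)."""
    return LEFT_TO_RIGHT.index(category)
```

For example, `Detection(MainCategory.TECHNIQUE, frozenset({Modifier.ALERT}))` is accepted, while attaching `Modifier.HOST_INTERROGATION` to a `TACTIC` detection raises a `ValueError`.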

Technique Scope

As we move from an emulation based on APT3 in Round 1 to an emulation based on APT29 in Round 2, we have a new scope of techniques for the evaluations. The APT29 emulation will cover 58 Enterprise Windows ATT&CK techniques across 10 ATT&CK tactics, with an emphasis on custom code and alternate execution methods such as PowerShell and WMI. We are excited to see how tools stack up against these changes in emulated behaviors. Round 2 will use custom implementations of both common and less-common technique procedures to test the flexibility and depth of detection capabilities, all “in the style of” APT29. For more information about the Round 2 (APT29) emulation please see the Round 2 Methodology Overview.

Technique Comparison Tool

In addition to the Round 2 updates, we have also released our first tool to help users better explore our data, the Technique Comparison Tool. Since the release of our initial results in November 2018, we have received feedback that our Evaluations provide a useful starting point to enable users to make decisions on what tools to buy, but they require substantial expertise to make use of the data. We maintain our belief that there is no single solution that fits all use cases, and each of these tools deserves consideration against your needs. In response to this feedback, we wanted to release a tool that will empower users to more effectively explore our data.

The Technique Comparison Tool allows you to select a procedure in our Operational Flow and see the results for all vendors for a given technique. This will allow the user to look across all the vendor evaluations to see how they compare across detection types, but also obtain the critical details the notes and screenshots provide. We see this as a way to allow end-users to do a comparison based on their specific needs and requirements.
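The kind of lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how the site is implemented: the `Result` record, the `compare` function, the vendor names, the procedure identifiers, and the notes are all hypothetical placeholders rather than real evaluation data.

```python
from collections import defaultdict
from typing import NamedTuple


class Result(NamedTuple):
    # Hypothetical record shape; the real data format may differ.
    vendor: str
    procedure_id: str
    technique: str
    detection_type: str
    note: str


def compare(results, procedure_id):
    """Group every vendor's results for one procedure, keeping the notes."""
    by_vendor = defaultdict(list)
    for result in results:
        if result.procedure_id == procedure_id:
            by_vendor[result.vendor].append(result)
    return dict(by_vendor)


# Placeholder data for illustration only; these are not real results.
SAMPLE = [
    Result("Vendor A", "step-1", "PowerShell", "Telemetry", "placeholder note"),
    Result("Vendor B", "step-1", "PowerShell", "Technique", "placeholder note"),
    Result("Vendor A", "step-2", "PowerShell", "None", "placeholder note"),
]

if __name__ == "__main__":
    for vendor, rows in sorted(compare(SAMPLE, "step-1").items()):
        for row in rows:
            print(f"{vendor}: {row.detection_type} ({row.note})")
```

Keeping the full record for each vendor, rather than only the detection type, reflects the point above that the notes and screenshots carry critical detail.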

The End of Round 1 is Near

In our Round 2 announcement, we announced the end of rolling admissions for Round 1. Over the next several months, we will announce results, and update our website with them, from vendors who were already at various points in our process before the close of Round 1.

Round 2’s Call for Participation closes on July 31, 2019, and we will be launching into those evaluations shortly after that. As we previously noted, Round 2 evaluations will all be released at once rather than in the “rolling admissions” format used in Round 1.

We look forward to continuing to enhance our evaluations and website with more data and tools to help organizations make critical decisions on what tools to buy, understand how to use these capabilities more effectively, identify methodologies to assess tools and environments, and push the industry to improve. As always, if you have feedback or ideas on how we can do this better, please contact us at

©2019 The MITRE Corporation. ALL RIGHTS RESERVED Approved for public release. Distribution unlimited 18–03621–14.

This is the official blog for MITRE ATT&CK®, the MITRE-developed, globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. The full website is located at

Frank Duff (@FrankDuff) is the Director of ATT&CK Evaluations for MITRE Engenuity, providing open and transparent evaluation methodologies and results.