Your Complete Introductory Guide to Understanding the MITRE Engenuity ATT&CK Evaluation Results

CyCraft Technology Corp
May 12 · 13 min read

In this post, we explain what the evaluations are, who’s running them, why the evaluations are important, and what’s new in this year’s evaluation results.


What are the MITRE Engenuity ATT&CK Evaluations and Why Do They Matter?

In 2018, the MITRE Corporation launched the MITRE ATT&CK Evaluations, where MITRE evaluates the efficacy of cybersecurity products using an open methodology based on their own publicly available ATT&CK (Adversarial Tactics, Techniques & Common Knowledge) Framework — a living, growing framework of common tactics, techniques, and procedures (TTP) used by advanced persistent threats (APTs) and other cybercriminals. Everything a hacker can do on a victim’s system can be uniquely represented in the ATT&CK Framework.

The ATT&CK Evaluations are extremely useful to end users of cybersecurity solutions, as they provide transparency and publicly available data on the true efficacy of some of the leading cybersecurity products in the world.

ATT&CK Evaluation results also provide screenshots of cybersecurity solutions at work, granularly detail what is happening in each screenshot, and provide insight into a cybersecurity solution’s approach to security.

Each year (or “round”) of the ATT&CK Evaluations has cybersecurity vendors pitting their solutions against MITRE team-created emulations of known APTs (whose names somehow get progressively cooler with each round).

Round 1 (2018) Emulation — APT3
Round 2 (2019) Emulation — APT29
Round 3 (2020) Emulation — FIN7 & Carbanak
Round 4 (2021) Emulation — Wizard Spider & Sandworm

As of November 2019, MITRE Engenuity began running the ATT&CK Evaluations, hence its renaming to the MITRE Engenuity ATT&CK Evaluations. The ATT&CK Evaluations use terminology from the MITRE ATT&CK framework.

Wait. Who are MITRE and MITRE Engenuity?

If you’re looking into the results of an evaluation, it’s important to know exactly who is behind those evaluations.

The MITRE Corporation is a not-for-profit organization that operates federally funded research and development centers (FFRDCs) to assist the United States government with scientific research, development, and systems engineering.

MITRE was formed in 1958, at the height of the Cold War between the U.S. and the U.S.S.R., to provide guidance over the construction of the U.S. Air Force Semi-Automatic Ground Environment (SAGE) air defense system, which would later direct the North American Air Defense Command (NORAD) response to an air attack from the Soviet Union.

For further insight into the MITRE Corporation, read our blog article: Who is MITRE?

In November 2019, MITRE launched MITRE Engenuity, a tech foundation dedicated to collaborating with the private sector on specific challenges, including critical infrastructure, cybersecurity, and the ATT&CK Evaluations — henceforth known as the MITRE Engenuity ATT&CK Evaluations.

“MITRE has a history of transforming cybersecurity standards, improving aviation safety, and advancing healthcare analytics through our operation of federal research and development centers. Through MITRE Engenuity, we’re now applying our interdisciplinary expertise and resources to work with industry on complex, public interest challenges so that we can have greater impact on shaping the future of [America’s] critical infrastructure.”
Jason Providakes, MITRE President and CEO


So What is the MITRE ATT&CK Framework?

In cybersecurity, there have been several approaches used to track and analyze the various characteristics of cyberattacks. The MITRE ATT&CK (Adversarial Tactics, Techniques & Common Knowledge) framework is the latest model and has been rapidly adopted by the cybersecurity community since its initial release in 2015.

Observed real-world attacker behavior is now documented in granular detail within the ATT&CK framework, which collects threat intelligence from defenders and analysts across the globe. It is meant to be a living, growing catalog of the common tactics, techniques, and procedures (TTPs) used by advanced persistent threats (APTs) and other cybercriminals.

MITRE describes its framework as “a globally accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community.”

The ATT&CK framework consists of three different matrices: Enterprise, Mobile, and Industrial Control Systems (ICS). The Enterprise framework is the most commonly referenced matrix.

The ATT&CK Matrix organizes tactics, techniques, and procedures (TTPs). Its 12 columns, the tactics, read from left to right as another take on the steps an attacker would typically follow when attacking your organization.

Each column is made of multiple techniques. Indeed, multiple techniques can be employed to accomplish the same tactic, and depending on the attacker’s main objective, not all tactics need to be employed. The aggregate of techniques used during an attack is known as the behavior profile — the procedure the attacker followed to accomplish their ultimate objective.
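
The relationship between tactics, techniques, and a behavior profile can be sketched as a small data structure. The following is a minimal illustration in Python, not MITRE's own data format: the `MATRIX` excerpt lists only a handful of real tactic and technique IDs, while the `behavior_profile` helper and the sample attack are hypothetical.

```python
# Tiny excerpt of the Enterprise matrix: tactic -> techniques that can achieve it.
# IDs are real ATT&CK identifiers; the selection is illustrative, not complete.
MATRIX = {
    "Initial Access (TA0001)": ["T1566 Phishing"],
    "Execution (TA0002)": ["T1059 Command and Scripting Interpreter"],
    "Credential Access (TA0006)": ["T1003 OS Credential Dumping"],
    "Lateral Movement (TA0008)": ["T1570 Lateral Tool Transfer"],
    "Command and Control (TA0011)": ["T1071 Application Layer Protocol"],
}


def behavior_profile(observed):
    """Group observed techniques by tactic, in matrix (left-to-right) order."""
    profile = {}
    for tactic, techniques in MATRIX.items():
        used = [t for t in techniques if t in observed]
        if used:  # tactics the attacker skipped are simply absent
            profile[tactic] = used
    return profile


# Hypothetical attack that skips Credential Access and Lateral Movement entirely.
observed = {
    "T1566 Phishing",
    "T1059 Command and Scripting Interpreter",
    "T1071 Application Layer Protocol",
}
profile = behavior_profile(observed)
for tactic, techniques in profile.items():
    print(tactic, "->", ", ".join(techniques))
```

Skipped tactics simply do not appear in the profile, which mirrors the point above that not every tactic needs to be employed.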

For further in-depth reading on the MITRE ATT&CK framework, how it differs from other intrusion analysis models, and the ATT&CK framework’s other applications, consult our previous blog article: MITRE ATT&CK vs. Cyber Kill Chain vs. Diamond Model.

Okay, But How Does the MITRE ATT&CK Framework Help Defenders?

As 21st-century cyberattacks continue to become more and more sophisticated, the impact and usefulness of the MITRE ATT&CK Framework cannot be overstated.

The ATT&CK Framework creates a publicly available reference library for cyber intrusions made through real-world observations, standardizes cybersecurity terminology across vendors and end users, granularly maps known attacker behavior, allows defenders to efficiently map and tune defenses, and also provides data-driven intelligence to allow for more accurate APT attack emulations.

As an example, 65 techniques across 11 tactics were used in the third round of ATT&CK Evaluations for the emulations of Carbanak and FIN7. Below is a heat map of both emulations’ TTPs represented on the ATT&CK Framework.

The ATT&CK Framework not only documents known adversarial techniques but also what adversaries have used those techniques, how they’ve used them, and how to mitigate them.

Defenders get answers to some of the most important questions they ask themselves:

  • What adversaries have targeted your specific industry?*
  • What techniques do those adversaries commonly use?
  • What tools have they used to carry out those techniques?
  • Are we capable of detecting those techniques?
  • If not, what preventative controls mitigate this technique?

*While MITRE ATT&CK does not categorize threat groups through victim classification, intelligence on known targets of threat groups is publicly available on each threat group’s page.
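
Those questions amount to a simple gap analysis. The sketch below assumes you have already looked up which techniques a group of interest uses and which techniques your tooling detects. The example group, both technique sets, and the `coverage_gap` helper are made up for illustration and do not come from ATT&CK data.

```python
def coverage_gap(group_techniques, detected_techniques):
    """Return (covered, missing) technique IDs for one threat group."""
    covered = sorted(group_techniques & detected_techniques)
    missing = sorted(group_techniques - detected_techniques)
    return covered, missing


# Hypothetical inputs: techniques attributed to an example group, and the
# techniques a defender's current tooling is known to detect.
example_group = {"T1566", "T1059", "T1003", "T1570"}
our_detections = {"T1059", "T1003", "T1071"}

covered, missing = coverage_gap(example_group, our_detections)
print("covered:", covered)
print("missing:", missing)
```

The techniques in `missing` are the ones to look up next for mitigations and preventative controls.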

For more information on how to start effectively implementing the ATT&CK Framework into your security, read our ATT&CK Framework Guide.

So How Do the MITRE Engenuity ATT&CK Evaluations Evaluate Cybersecurity Products?

Okay. By now you should have a clearer understanding of who MITRE and MITRE Engenuity are, what the ATT&CK Framework is, how it’s relevant, and what the ATT&CK Evaluations are. Now, let’s take a closer look into how MITRE Engenuity evaluates security products.

The Carbanak & FIN7 evaluations were carried out in Microsoft Azure Cloud. MITRE Engenuity provided each vendor with two identical eight-host environments. Vendors then installed their solution on each host and were given complete administrative access. One environment was used for the detection-only portion of the evaluation; the other was used for the optional protections test, which was new this year.

MITRE Engenuity also gave vendors the option to install server software onto a virtual machine (VM) already in the environment, importing a VM if necessary. This option was also new this year, and this round was the first time Linux-based attacks were used in the ATT&CK Evaluations. The Azure VMs were, by default, Standard B4ms, each with four vCPUs and 16 GB of memory.

Connectivity to the environment was enabled via VPN; passwords were shared through out-of-band methods. Each environment had one VPN server; vendors used Remote Desktop Protocol (RDP) or Secure Shell (SSH) to reach hosts elsewhere within the environment. Hosts were only reachable within the VPN and did not have public IP addresses assigned to them via Azure; however, they did have Internet access.

The above “financial” domain was used for the Carbanak emulation portion of the evaluation; the below “hospitality” domain was used for the FIN7 portion.

A series of known adversarial techniques are chained together using a logical attack flow. After each technique is executed, MITRE Engenuity records if, and to what extent, the vendor’s cybersecurity product detects the emulated adversary’s attack technique.

MITRE Engenuity classified a “detection” as “any information, raw or processed, that can be used to identify adversary behavior.”

Detections were tagged with the data source(s) that signify the type of data used to generate the detection. These were used to differentiate and provide more precise descriptions of similar detections (e.g., telemetry from file monitoring versus process command-line arguments). The list of possible data source tags was calibrated by MITRE after the execution of the evaluations.

Detection Categories

Each detection is then classified into one of the following categories:

  1. Technique Detection — an enriched detection attributing a specific adversarial technique, e.g., Lateral Tool Transfer (T1570).
  2. Tactic Detection — an enriched detection attributing a specific adversarial tactic, e.g., Command and Control (TA0011).
  3. General Detection — an enriched detection indicating that something was deemed suspicious but did not attribute a specific adversarial tactic or technique.
  4. Telemetry Detection — any raw or minimally processed detection, e.g., process start, file create.
  5. None — None does not necessarily mean that no detection occurred; it means that the detection did not meet the required detection criteria as defined by MITRE Engenuity.
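
One way to picture the five categories is as an ordered scale, from no qualifying detection up to a detection that names a specific technique. The enum below is a sketch of that reading. The numeric ranks and the `best_detection` helper are assumptions made for illustration, not part of MITRE Engenuity's methodology.

```python
from enum import IntEnum


class DetectionCategory(IntEnum):
    """Main detection categories, ranked by how much context they carry."""
    NONE = 0        # nothing met the detection criteria
    TELEMETRY = 1   # raw or minimally processed data, e.g. process start
    GENERAL = 2     # flagged as suspicious, no tactic or technique named
    TACTIC = 3      # attributes a tactic, e.g. Command and Control (TA0011)
    TECHNIQUE = 4   # attributes a technique, e.g. Lateral Tool Transfer (T1570)


def best_detection(detections):
    """Return the richest category recorded for a sub-step."""
    return max(detections, default=DetectionCategory.NONE)


# Hypothetical sub-step with two detections recorded.
substep = [DetectionCategory.TELEMETRY, DetectionCategory.TACTIC]
print(best_detection(substep).name)
print(best_detection([]).name)
```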

Modifier Detection Types

Detections could further be classified into modifier detection types, which include Configuration Changes and Delayed Detections.

Configuration Changes

Detections could have also resulted from configuration changes. After the initial execution of the attack scenario is completed, vendors have the opportunity to alter their solution. This could be done to show that additional data could be collected and/or processed by the solution. The Configuration Change modifier would then be applied with additional modifiers describing the nature of the change.

There were three types of configuration changes:

  • Data Sources — Changes made to collect new information by the sensor.
  • Detection Logic — Changes made to the data processing logic.
  • UX — Changes related to the display of data that was already collected but not visible to the user.

This means some vendors could have manually altered the investigation flow of their solution (detection logic), altered the display so that data already collected but not originally visible to the user became visible (UX), or altered the sensor to collect new information (data sources).

Delayed Detections

This modifier would be applied if the detection were not immediately made available to the analyst due to various factors.
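
Modifiers sit on top of the main detection category rather than replacing it. A hedged sketch of how a single result might be recorded follows. The `Detection` class, its field names, and the sample values are hypothetical, chosen only to mirror the terms defined above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ConfigChange(Enum):
    DATA_SOURCES = "Data Sources"        # sensor changed to collect new information
    DETECTION_LOGIC = "Detection Logic"  # data processing logic changed
    UX = "UX"                            # already-collected data made visible


@dataclass
class Detection:
    category: str                                 # e.g. "Technique", "Telemetry"
    config_change: Optional[ConfigChange] = None  # set if found after a change
    delayed: bool = False                         # not immediately available

    def label(self):
        """Render the category followed by any modifiers."""
        parts = [self.category]
        if self.config_change is not None:
            parts.append(f"Configuration Change ({self.config_change.value})")
        if self.delayed:
            parts.append("Delayed")
        return ", ".join(parts)


# Hypothetical result: a technique detection that appeared only after the
# vendor adjusted detection logic, and that reached the analyst late.
d = Detection("Technique", ConfigChange.DETECTION_LOGIC, delayed=True)
print(d.label())
```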

For more information regarding Detection Categories and Modifier Detection Types, read the official MITRE Engenuity ATT&CK Evaluations page.

What’s New in the Latest MITRE Engenuity ATT&CK Evaluations?

There have been several changes in the ATT&CK Evaluations. One of the biggest changes for Round 3 was the identity of the adversary, or should we say, adversaries. Unlike the previous two rounds, Round 3 uses TTPs from two different adversaries rather than one.

Carbanak & FIN7

Although some literature refers to these two financially motivated threat groups as the same group, MITRE identifies them as two separate entities, as do we. Attribution is quite difficult. Carbanak and FIN7 could be the same group, one could be an offshoot of the other, they could share personnel, or the groups could simply share tools and methods.

What is known is that FIN7 and Carbanak are two of the most damaging financially motivated threat groups an organization in the finance or hospitality industry could face. Carbanak has been cited as being responsible for losses potentially as high as one billion USD, having targeted hundreds of banks in roughly 30 countries. FIN7, by contrast, is known for targeting the hospitality industry (restaurants, hotels) and retail, often using point-of-sale malware to exfiltrate customer credit card records. FIN7’s overall damage is estimated at more than 3 billion USD.

For a thorough breakdown of each emulated attack’s operational workflow, see the MITRE Engenuity-provided resources below:
Emulated Carbanak Scenario
Emulated FIN7 Scenario

FIN7 and Carbanak are very different from the espionage-focused adversaries from the previous rounds, APT3 (Round 1) and APT29 (Round 2). APT3 is a China-based threat group that researchers have attributed to China’s Ministry of State Security. APT29 is a threat group that has been attributed to the Russian government.

Continuing with the double adversary method, MITRE Engenuity has already announced Wizard Spider (financially motivated group) and Sandworm Team (attributed to Russian GRU Unit 74455) as the adversaries for Round 4.


In order for MITRE Engenuity to evaluate vendor solutions during an intrusion, protections on said solutions needed to be disabled or in alert mode only. For this round, vendors could opt to participate in an additional protection-oriented evaluation. This was the first round this optional evaluation extension was available to vendors.

MITRE Engenuity engineered 10 test cases — five for Carbanak and five for FIN7. In each test case, participants were not allowed to block certain malicious activities, such as lateral movement via pass-the-hash. Evaluators would then begin executing adversarial techniques step-by-step and determine when and if the test case attack would eventually be blocked by the solution.
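
The step-by-step logic of a protection test case can be sketched as a loop that runs steps in order and stops at the first block. This is an illustration only: the step names and the `is_blocked` callback are hypothetical stand-ins for the real emulation and the vendor product.

```python
def run_test_case(steps, is_blocked):
    """Execute steps in order; return the index of the first blocked step,
    or None if the whole test case ran without being blocked."""
    for index, step in enumerate(steps):
        if is_blocked(step):
            return index
    return None


# Hypothetical test case and a stand-in product that blocks credential dumping.
steps = ["initial execution", "discovery", "credential dumping", "lateral movement"]
blocked_at = run_test_case(steps, lambda step: step == "credential dumping")

if blocked_at is None:
    print("never blocked")
else:
    print(f"blocked at step {blocked_at + 1} of {len(steps)}: {steps[blocked_at]}")
```

Recording where in the chain the block happened, and not only whether it happened, is what lets a reader see how far the emulated attack progressed.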

Linux Enters ATT&CK Evaluations

This is the first round of evaluations that has included non-Windows systems. MITRE Engenuity incorporated Linux into a few attack sub-steps to begin highlighting vendor capabilities.

Although many cybersecurity vendors have long had Linux detection capabilities (as most of the world’s cloud infrastructure runs on Linux), this portion of the evaluation was completely optional. Of the 29 participating vendors, 22 opted to have their Linux detection capabilities evaluated.

Vendors who opted out will have their total sub-step count drop from 174 to 162. Sub-steps including Linux-based attacks will be listed as N/A. Therefore, N/A should not be read as a None but instead should be read as out of scope for that solution’s evaluation.
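
The arithmetic matters when comparing vendors. The short sketch below uses the two sub-step totals from this round together with an invented visibility figure. MITRE Engenuity does not publish a rate like this; `visibility_rate` is a hypothetical helper, included only to show why N/A must not be counted as None.

```python
TOTAL_SUBSTEPS = 174         # full evaluation, Linux included
NON_LINUX_SUBSTEPS = 162     # total for vendors that opted out of Linux

linux_substeps = TOTAL_SUBSTEPS - NON_LINUX_SUBSTEPS
print(linux_substeps)  # 12 sub-steps are marked N/A for opted-out vendors


def visibility_rate(visible, opted_into_linux):
    """Share of in-scope sub-steps with any detection; N/A is out of scope."""
    in_scope = TOTAL_SUBSTEPS if opted_into_linux else NON_LINUX_SUBSTEPS
    return visible / in_scope


# Hypothetical vendor with 150 visible sub-steps and no Linux participation.
rate_correct = visibility_rate(150, opted_into_linux=False)
rate_wrong = 150 / TOTAL_SUBSTEPS  # treats N/A as None, understating the result
print(f"{rate_correct:.1%} vs {rate_wrong:.1%}")
```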

Round and Breach Summaries

MITRE Engenuity now offers round summary tables for each vendor. MITRE Engenuity advises users not to view any single metric in isolation or as a static score, as no single metric can encapsulate the entire performance of a solution.

While the evaluation is detection-based and includes detailed attack summaries, MITRE Engenuity also publicly provides insight into each vendor’s alerting and correlation strategy, which gives end users valuable insight into how these solutions would operate in real-world environments. Screenshots of the vendors’ products were also included, allowing end users to gain even further insight into each solution’s UX.

While this is the third round of evaluations, it is the first that MITRE Engenuity has run. MITRE Engenuity openly says it expects the data captured and exposed to evolve in future rounds, and it welcomes feedback.

New Terminology

In this evaluation, “detection” and “visibility” are specific terms defined by MITRE Engenuity. It’s important to remember this when looking at the results.

Here are MITRE Engenuity’s terms for Round 3 and their definitions:

  • Detection — any information, raw or processed, that can be used to identify adversary behavior
  • Detection Count — the sum of all raw (telemetry) and processed (analytic) detections that met the detection criteria. A sub-step can have more than one detection.
  • Telemetry — any raw or minimally processed detection (e.g., process start, file create)
  • Telemetry Coverage — the number of sub-steps where telemetry was available
  • Analytic — any processed detection, such as a rule or logic applied to telemetry (e.g., alert descriptions or ATT&CK technique mappings)
  • Analytic Coverage — the number of sub-steps where 1 or more analytics were available
  • Visibility — the number of sub-steps where an analytic or telemetry was available
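
These definitions translate directly into counts over sub-steps. The sketch below applies them to four invented sub-steps. The input format (a list of detection kinds per sub-step) and the `summarize` function are assumptions made for the example, not MITRE Engenuity's data format.

```python
def summarize(substeps):
    """Compute the round-summary counts from per-sub-step detection kinds.

    Each sub-step is a list whose items are "telemetry" or "analytic";
    an empty list means no detection met the criteria.
    """
    return {
        "detection_count": sum(len(d) for d in substeps),
        "telemetry_coverage": sum(1 for d in substeps if "telemetry" in d),
        "analytic_coverage": sum(1 for d in substeps if "analytic" in d),
        "visibility": sum(1 for d in substeps if d),
    }


# Four hypothetical sub-steps.
results = [
    ["telemetry", "analytic"],   # both raw and processed detections
    ["telemetry"],               # raw data only
    ["analytic", "analytic"],    # two analytics, no telemetry
    [],                          # nothing met the criteria
]
summary = summarize(results)
print(summary)
```

Note that the detection count (5 here) can exceed the number of sub-steps, while the three coverage-style figures never can, which is one reason the numbers should not be compared with one another as if they were on the same scale.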


ATT&CK Evaluations are only in their third round and some in our industry may not be familiar with them. It is our hope that this blog post will enrich your understanding of the MITRE Engenuity ATT&CK Evaluations, their relevance, and their importance to the cybersecurity industry.

Your CyCraft MITRE ATT&CK Reading List

  1. Introduction | What is MITRE ATT&CK?
  2. Behind the Curtain | Who is MITRE?
  3. ATT&CK Evals Round 2 | CyCraft Enters Round 2
  4. ATT&CK Evals Round 2 | Complete Guide to Understanding The Results
  5. ATT&CK Evals Round 2 | CyCraft Results
  6. ATT&CK Evals Round 3 | CyCraft Enters Round 3

Everything Starts From Security

CyCraft Customers can prevent cyber intrusions from escalating into business-altering incidents. From endpoint to network, from investigation to blocking, from in-house to cloud, CyCraft AIR covers all aspects required to provide small, medium, and large organizations with the proactive, intelligent, and adaptable security solutions needed to defend from all manner of modern security threats with real-time protection and visibility across the organization.

CyCraft secures government agencies, police and defense organizations, Fortune Global 500 firms, top banks and financial institutions, critical infrastructure, airlines, telecommunications, hi-tech firms, SMEs, and more by being Fast / Accurate / Simple / Thorough.

CyCraft powers SOCs using innovative AI-driven technology to automate information security protection with built-in advanced managed detection and response (MDR), global cyber threat intelligence (CTI), smart threat intelligence gateway (TIG) and network detection and response (NDR), security operations center (SOC) operations software, auto-generated incident response (IR) reports, enterprise-wide Health Check (Compromise Assessment, CA), and Secure From Home services. Everything Starts From Security.
