What a Cluster! How Industry Groups and Names Threat Activity Clusters

Aug 19, 2024


The concept of clustering threat or intrusion activity and giving the result a name, so it can be cataloged, described, and correlated with future activity, is not new. Over the past two years, however, the convention has become far more visible: security researchers working for vendors now cite the internal names they use to track these intrusion clusters in conference talks, blogs, and social media posts, among other mediums.

This has inadvertently led to some confusion. Analysts who are newer to the cyber threat intelligence discipline, and even some peers in other cyber security disciplines, are left scratching their heads trying to figure out what this new nomenclature represents and how it differs from threat actor group names like APT29 or MIDNIGHTBLIZZARD. The purpose of this blog is to capture the nomenclature vendors use in their intrusion tracking ontology. The vendor nomenclature crosswalk I have created at the end of this blog is likely incomplete, but it should serve as a starting point, with the hope that other researchers can help fill in the gaps. (Read: if you are a researcher, reach out to me and provide context. I will update this blog, give you proper credit, and we can be collectively smarter.)

What is an Intrusion Cluster?

To quote Anne Rice’s protagonist in Interview with the Vampire, “shall we begin like David Copperfield, ‘I am born, I grew up,’ or shall we begin when I was born to darkness, as we call it. That’s really where we should start, don’t you think?” So let’s start by defining threat/intrusion activity and threat/intrusion clusters/sets.

Intrusion activity represents any artifact of an intrusion attempt, successful or unsuccessful, including the raw indicator(s) of compromise and higher-order abstractions such as tactics, techniques, and procedures (TTPs). During an intrusion, especially a successful one (also known as a compromise or breach), an adversary often takes multiple actions on a victim’s systems in a specific order of operations (think behaviors, or what we call adversary tradecraft). Taken together, these artifacts are treated as a single composite object that provides the baseline definition from which we create an initial intrusion cluster.

We then use that definition as a point of comparison for correlating similar activity, which lets us build out our understanding of the threat activity through a process called merging. Some intrusion activity contains more distinctive fingerprints, so we give it a higher weighted value. Such an artifact is called a key indicator, anchor, or toolmark, and it matters most when we are evaluating whether one intrusion cluster shares enough overlaps with another to merge them together.
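
To make the weighting idea concrete, here is a minimal sketch of one way to score the overlap between two clusters. It is not any vendor's actual method: the Cluster class, the overlap_score function, the artifact labels, and the weights are all hypothetical, and the weighted Jaccard-style ratio is simply one reasonable choice among many.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical container: artifact label -> analyst-assigned weight."""
    name: str
    artifacts: dict[str, float] = field(default_factory=dict)


def overlap_score(a: Cluster, b: Cluster) -> float:
    """Weighted Jaccard-style overlap in [0, 1].

    Key indicators (anchors, toolmarks) carry larger weights, so sharing
    one of them moves the score more than sharing a common technique.
    """
    shared = a.artifacts.keys() & b.artifacts.keys()
    union = a.artifacts.keys() | b.artifacts.keys()

    def weight(artifact: str) -> float:
        # If the two clusters weight an artifact differently, take the higher value.
        return max(a.artifacts.get(artifact, 0.0), b.artifacts.get(artifact, 0.0))

    total = sum(weight(k) for k in union)
    if total == 0:
        return 0.0
    return sum(weight(k) for k in shared) / total


# Made-up data for illustration only.
cluster_a = Cluster("CLUSTER-A", {
    "golden-saml-persistence": 5.0,   # treated here as a toolmark
    "psexec-lateral-movement": 1.0,
    "winrar-password-staging": 1.0,
})
cluster_b = Cluster("CLUSTER-B", {
    "golden-saml-persistence": 5.0,
    "psexec-lateral-movement": 1.0,
    "rclone-exfiltration": 1.0,
})

print(overlap_score(cluster_a, cluster_b))  # 0.75
```

Without the higher weight on the toolmark, the same two clusters would score 0.5 (two shared artifacts out of four), which shows how a single anchor can tip a merge discussion.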

Note #1: an intrusion cluster is sometimes referred to as an intrusion set or set of intrusion activity to represent the composite nature of its multiple elements. Threat cluster or set of threat activity is often used synonymously.
Note #2: About four months ago in April, @thesilence from the Vertex Project wrote an excellent blog post to help educate on intrusion clusters and about two weeks ago, he wrote a follow up blog on identifying intrusion cluster merge candidates for Synapse, their centralized intelligence system.
Note #3: Sophos’ Morgan Demboski (@MorganDemboski), Paul Jaramillo, and Mark Parsons (@securitydumpstr) provided an excellent visual overview of cluster overlaps they observed when investigating suspected Chinese intrusion activities on a network. They reported their findings in the Operation Crimson Palace report and beefed up the visual characteristics and tracking methodology in their Surfacing a Hydra: Unveiling a Multi-Headed Chinese State-Sponsored Campaign Against a Foreign Government Black Hat 2024 presentation.

Names are Definitions

Why provide names to intrusion clusters in the first place? There’s a handful of reasons, but the simplest is that it is easier to communicate with a name or some shorthand for the activity than to describe the full threat activity each time. For instance, UNC#### might stand for the intrusion activity denoted by the use of PowerShell to call WMI for internal discovery, followed by SMB to perform network discovery, PsExec for lateral movement, and WinRAR using the -hps parameters in combination with a unique password for data archiving and staging. A shorthand name lets us capture that whole composite using a standardized nomenclature, which is usually incremented as new intrusion clusters are discovered.

Most organizations use a shorthand name followed by up to four digits, incremented sequentially starting at 0001. I believe only one or two that I have come across use a three-digit numeric convention. See the table below for examples.
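
As a small illustration of the prefix-plus-sequential-digits convention described above, the sketch below hands out the next free designator. The function name next_cluster_name and the prefix "CLU" are hypothetical and do not correspond to any vendor's real tooling or naming.

```python
import re


def next_cluster_name(prefix: str, existing: list[str], width: int = 4) -> str:
    """Return the next sequential designator, e.g. CLU0001, CLU0002, ...

    `width` is the zero-padded digit count; 4 mirrors the common
    convention, 3 would mirror the rarer three-digit one.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    # Collect the numeric suffix of every existing name that uses this prefix.
    numbers = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{prefix}{max(numbers, default=0) + 1:0{width}d}"


print(next_cluster_name("CLU", []))                      # CLU0001
print(next_cluster_name("CLU", ["CLU0001", "CLU0002"]))  # CLU0003
print(next_cluster_name("CLU", ["CLU007"], width=3))     # CLU008
```

Names with a different prefix are ignored, so several shorthand families can share one registry without colliding.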

There are other reasons behind this, too, which @cyb3rops lays out in his 2018 blog on threat actor naming conventions, so rather than repeat those here, I’ll refer you to his Newcomer’s Guide to Cyber Threat Actor Naming post.

From Intrusion Cluster to Threat Actor Group

The thing about intrusion clusters is that they evolve over time as the baseline of activity changes. This could include the integration of a new preferred tool, hosting provider, malware configuration decisions, etc. When threat activity deviates significantly, this may require creating a related cluster to denote the splintering into two distinct activities. Similarly, we might determine that two sets of activities hold enough overlaps that we feel confident in making them into one group.

Most organizations will leverage a qualitative approach, though a quantitative scoring system can be used. In reality, the deliberations often turn into identifying the unique aspects to anchor upon, which gives those aspects a higher weighted value. So even when nothing is formally documented, analysts are in effect using a quantitative approach.
As a matter of principle, I will foot stomp the necessity to always capture timestamps related to intrusion activity to assist in identifying activity convergence and divergence points.
The decision threshold for merging clusters varies from organization to organization, but it should rest on a standardized methodology that is agreed upon, accepted internally, and followed in all cases.
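
To show why timestamps matter, here is a minimal sketch of one simple reading of convergence and divergence: for each artifact two clusters share, find the period during which both were seen using it. The record shape and the function names sighting_windows and shared_windows are hypothetical, the sample observations are made up, and real tracking systems will model this with far more nuance.

```python
from datetime import date

# Hypothetical record shape: (date the artifact was observed, artifact label).
Observation = tuple[date, str]


def sighting_windows(observations: list[Observation]) -> dict[str, tuple[date, date]]:
    """Map each artifact to its (first seen, last seen) dates."""
    windows: dict[str, tuple[date, date]] = {}
    for seen, artifact in observations:
        first, last = windows.get(artifact, (seen, seen))
        windows[artifact] = (min(first, seen), max(last, seen))
    return windows


def shared_windows(a: list[Observation], b: list[Observation]) -> list[tuple[str, date, date]]:
    """For each shared artifact, return (artifact, start, end).

    start = date by which both clusters had used it (candidate convergence point)
    end   = last date before one of them stopped using it (candidate divergence point)
    If start > end, the two clusters never used the artifact in the same period.
    """
    wa, wb = sighting_windows(a), sighting_windows(b)
    result = []
    for artifact in sorted(wa.keys() & wb.keys()):
        start = max(wa[artifact][0], wb[artifact][0])
        end = min(wa[artifact][1], wb[artifact][1])
        result.append((artifact, start, end))
    return result


# Made-up observations for illustration only.
cluster_a = [(date(2024, 1, 5), "tool-x"), (date(2024, 3, 1), "tool-x"), (date(2024, 2, 1), "host-y")]
cluster_b = [(date(2024, 2, 10), "tool-x"), (date(2024, 4, 1), "tool-x"), (date(2024, 5, 1), "tool-z")]

for artifact, start, end in shared_windows(cluster_a, cluster_b):
    print(artifact, start, end)  # tool-x 2024-02-10 2024-03-01
```

None of this is possible if the timestamps were never captured, which is the practical reason for insisting on them.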

Most cyber security vendors, especially those in the threat intelligence field, follow roughly the same process for eventually graduating a threat cluster to a threat actor group, though the decision criteria are largely veiled from the general public. In some instances, vendors will provide insights into the basis for graduating a group in public announcements, blogs, and even conference talks. More often, these insights can be found in reflections from researchers or peers who worked on, or were aware of, the graduation’s inner workings.

One example of a toolmark overlapping across intrusion clusters comes from the SolarWinds software supply chain operation, where the group used a Golden SAML technique to persist in Azure cloud environments, among other notable characteristics. This toolmark, along with some of those other unique intrusion characteristics, was enough for Mandiant to link UNC2452 with clusters UNC2652 and UNC3004. The full blog on intrusion cluster overlaps can be found here. Eventually the clusters were merged into APT29.
Another good example comes from Emeil (@EHaeghebaert) from Microsoft’s Threat Intelligence team (MSTIC) where he shows cluster overlaps and divergence in what is known publicly as APT42:

The following table represents how vendors communicate whether they are describing an intrusion cluster or a threat actor group. Where possible, I’ve provided additional commentary and links to examples or to the vendor’s own public description. Otherwise, the entries come from industry peers with firsthand access to the information.

Note: During my research, I discovered that Trend Micro uses Void[X] for threat groups whose motivations are not confirmed or mixed, but it was unclear if this designator is for intrusion clusters or threat actor groups. Ergo, it was not included in the table below.

Intrusion Clusters Vendor Naming Convention

The first number designator following the STAC designation denotes suspected motivation with: 1 representing…

docs.google.com

Parting Thoughts and Shout Outs

To make these insights more accessible to the public, @cyb3rops kindly integrated the vendor conventions that were not previously captured into his APT Groups and Operations Google Sheet in the “Taxonomies” tab.

One of the best presentations I have come across that illustrates a tooth-to-tail methodology for clustering, especially when dealing with complexities such as malware-as-a-service, is Sophos’ Morgan Demboski’s SANS CTI Summit 2024 presentation, “Clustering Attacker Behavior: Connecting the Dots in the RaaS Ecosystem”. I cannot recommend it enough for those looking to upskill in understanding how to enumerate characteristics and create a correlation framework. I was stoked to meet Morgan at SLEUTHCON and let her know how much that presentation slapped.

I would like to conclude by giving kudos to @jamieantisocial, @_devonkerr_, @greglesnewich, @ValidHorizon, and @lasq88, who responded to my Twitter post where I attempted to validate and crowdsource intrusion cluster names used by vendors. Additional thanks go to @markpars0ns, @cyb3rops, @k3yp0d, @AugustVansickl2, and Kat Metrick.

Written by Shinigami