Sharpen your Simulation Game Part 1 - Introduction

Mauricio Velazco
Open Threat Research
6 min read · Aug 5, 2020


In part 1 of this series, I would like to describe the challenges PurpleSharp aims to address.

The other two parts can be found in the following links:

Defending enterprise networks against attackers continues to present a difficult challenge for blue teams. Tackling the problem by deploying prevention controls is still effective and relevant, however, blue teams are complementing this approach by investing resources in improving detection capabilities.

Enter Detection Engineering

Detection engineering is the continuous process of building, deploying, tuning and operating detection analytics with the goal of finding threats while applying engineering concepts. It is, in my opinion, a great approach to catching adversaries that may have bypassed the first prevention layers as they execute code, deploy persistence, or move laterally across the environment. I highly recommend reading about Red Canary’s approach to detection engineering.

Effectively collecting, transforming, normalizing, indexing and finally querying the relevant endpoint & network telemetry with a robust analytics engine is a challenging task that requires a complex architecture and the right people-technology-process combination. A healthy detection engineering program requires several moving parts to work in synergy:

  • the data source generating the proper events
  • the event pipeline promptly collecting/delivering these events
  • the right schema being enforced
  • the analytics engine executing the detection logic
  • the incident triage platform generating cases for analyst review

If one of these fails to operate, there is no detection, no triage and no response. Yet enterprise environments are subject to constant change that endangers this synergy. How do you know if the complex detection analytic you built and deployed for technique TXXXX is still working after server patching, ACL changes, GPO updates, software deployments, or any other change?
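The dependency chain above can be sketched as a toy model: if any stage drops the event, no alert ever fires. The stage functions and the sample events below are hypothetical illustrations, not PurpleSharp's or any real pipeline's implementation.

```python
# Toy model of a detection pipeline: every stage must succeed for an alert to fire.
# All stage functions and sample events are made up for illustration.

def collect(event):            # event pipeline: deliver raw telemetry
    return event if event else None

def normalize(event):          # schema enforcement: require the fields the analytic needs
    required = {"EventID", "CommandLine"}
    return event if required <= event.keys() else None

def detect(event):             # analytics engine: run the detection logic
    if event["EventID"] == 1 and "lsass" in event["CommandLine"].lower():
        return {"alert": "possible credential access", "event": event}
    return None

def pipeline(event):
    for stage in (collect, normalize, detect):
        event = stage(event)
        if event is None:      # any broken link means no detection, no triage
            return None
    return event

good = {"EventID": 1, "CommandLine": "procdump.exe -ma lsass.exe"}
broken_schema = {"EventID": 1}  # e.g. a config change stripped command-line logging

print(pipeline(good) is not None)       # True: every stage worked
print(pipeline(broken_schema) is None)  # True: one failed stage kills the alert
```

The second call is the scenario the paragraph above warns about: the detection logic itself is fine, but an upstream change silently starved it of the field it needs.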

Detection Testing

Blue teams need to test the deployed detections in production environments and confirm they are still operating as expected in the context of a changing environment. There are two main ways of achieving this:

Unit Testing

Validate the detection logic by injecting the telemetry produced by the execution of attack techniques directly into the event pipeline.

A project that maintains pre-recorded events in JSON format is Mordor, created by my good friends Roberto and Jose Luis Rodriguez. Samir also has a really interesting project that provides events in EVTX and PCAP formats: EVTX-ATTACK-SAMPLES and PCAP-ATTACK.
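A minimal sketch of the unit-testing idea, assuming Mordor-style newline-delimited JSON events: the real ingest point (an HTTP collector, Kafka topic, etc.) is replaced here with an in-memory sink so the replay loop itself is visible. The sample events are fabricated.

```python
import json

# Mordor-style dataset: newline-delimited JSON, one Windows event per line
# (these sample events are made up for illustration).
RECORDED = """\
{"EventID": 4688, "Hostname": "WKS01", "NewProcessName": "C:\\\\Windows\\\\System32\\\\whoami.exe"}
{"EventID": 4688, "Hostname": "WKS01", "NewProcessName": "C:\\\\Windows\\\\System32\\\\net.exe"}
"""

def replay(ndjson: str, sink):
    """Parse each pre-recorded event and hand it to the pipeline's ingest point."""
    count = 0
    for line in ndjson.splitlines():
        if line.strip():
            sink(json.loads(line))   # in production: POST to your log collector
            count += 1
    return count

ingested = []
sent = replay(RECORDED, ingested.append)
print(sent)                          # 2
print(ingested[0]["EventID"])        # 4688
```

Once the events are in the index, the detection analytic can be run against them exactly as it would run against live telemetry, validating the logic without touching an endpoint.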

Functional Testing

Validate the detection by executing attack behavior against a subset of endpoints in the monitored environment, i.e., adversary simulation or emulation.

I like Tim MalcomVetter’s approach to differentiating these terms: emulation aims to mimic the behavior, modus operandi, and sometimes specific signatures of a particular threat actor, while simulation tends to be more technique-generic; e.g., emulate APT 29’s TTPs vs. simulate common initial access techniques. Throughout this series, I will use the term adversary simulation to refer to functional testing.

To implement functional testing, we can leverage tools and frameworks built for running red team operations like crackmapexec, Empire, Metasploit, impacket, etc. In the past 3–5 years we have seen a surge in the release of a new type of tool, one that aims to simulate techniques for detection/security posture testing and is typically not weaponized: APTSimulator, Metta, Red Team Automation, Purple Team Attack Automation, Invoke-AtomicRedTeam, DumpsterFire, among others. Jorge Orchilles compiled a comprehensive list here.

While trying to approach detection validation at $DayJob, we decided to focus on functional testing in the form of adversary simulation, as it provides end-to-end validation and allows blue teams to also identify issues with the event pipeline. For example, a misconfigured GPO that breaks a WEF subscription on endpoints can be missed if we only focus on testing the detection logic with unit testing.

Our initial operations were manual, using open source offensive frameworks like Metasploit, PoshC2 or Empire and their post-exploitation modules. As the number of detections grew and our testing expanded to include weekly detection validation, the manual process became impractical. We needed tooling to automate it.

We found some of the previously mentioned projects in the open source space ready to be used in simulations. After testing some of them, I noted both their capabilities and their limitations. I got really excited about the general concept and decided to embark on a coding project that would meet our specific requirements and use cases.

The first iteration of this idea, PurpleSpray, was released at BsidesCharm 2019 and focused on only one use case: password spraying. I then moved to C# and released a beta version with support for more techniques, PurpleSharp, at Derbycon 9.0. Today, as part of my BlackHat 2020 Arsenal presentation, I’m happy to release a new version of PurpleSharp with a lot of new features and the corresponding documentation.

PurpleSharp is an open source adversary simulation tool written in C# that executes adversary techniques within Windows Active Directory environments. The resulting telemetry can be leveraged to measure and improve the efficacy of a detection engineering program. PurpleSharp leverages the MITRE ATT&CK Framework and executes different techniques across the attack life cycle: execution, persistence, privilege escalation, credential access, lateral movement, etc. It currently supports 37 unique ATT&CK techniques.

Detection engineering programs can use PurpleSharp to:

  • Build new detection analytics
  • Test existing detection analytics
  • Validate detection resiliency
  • Identify gaps in visibility
  • Identify issues with the event logging pipeline
  • Verify prevention controls

Why PurpleSharp?

As mentioned above, there are several options for adversary simulation in the open source space. In this section I'd like to highlight the specific use cases PurpleSharp aims to address.

Simple Deployment

One single .NET assembly, no VMs, no C2 channels, no implants.

Flexible Remote Simulations

PurpleSharp is able to deploy simulations on remote hosts by leveraging administrative credentials and native Windows services/features such as SMB, WMI and RPC. With this feature, operators can run simulations in remote locations and verify detection across the environment, rather than always testing from the same fixed infrastructure.

Opsec Considerations

PurpleSharp’s goal is to allow blue teams to verify that detection controls are working as expected. However, orchestrating the execution of adversary behavior on remote hosts requires engaging them in ways that may themselves trigger a detection. PurpleSharp leverages a couple of techniques to avoid this, including Parent PID Spoofing, which effectively breaks the parent-child process relationship between the simulation deployment and the actual simulation.

Credible Simulations

A side effect of the Parent PID Spoofing technique is that it allows PurpleSharp to execute simulations in the context of the logged-on user on the remote simulation target. Simulations run as child processes of explorer.exe and from within the user’s profile. This produces a credible simulation that mimics a real compromised user on a production endpoint who may have clicked on the wrong link.
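From the detection side, "blending in as user activity" can be expressed as a toy check over Sysmon Event ID 1 style process-creation events: does the process descend from explorer.exe and live under the user's profile? The field names follow Sysmon conventions, but the sample events are fabricated.

```python
# Toy check that a simulated process "looks like" interactive user activity,
# using Sysmon Event ID 1 style fields (the sample events are illustrative).

def looks_like_user_context(event: dict) -> bool:
    parent_ok = event.get("ParentImage", "").lower().endswith("\\explorer.exe")
    image_ok = event.get("Image", "").lower().startswith("c:\\users\\")
    return parent_ok and image_ok

blended = {
    "ParentImage": "C:\\Windows\\explorer.exe",
    "Image": "C:\\Users\\jdoe\\AppData\\Local\\Temp\\sim.exe",
}
suspicious = {
    "ParentImage": "C:\\Windows\\System32\\wmiprvse.exe",  # classic remote-exec parent
    "Image": "C:\\Windows\\Temp\\payload.exe",
}

print(looks_like_user_context(blended))     # True
print(looks_like_user_context(suspicious))  # False
```

A simulation whose events look like the first dictionary exercises the same analytics a real compromised user would, instead of tripping the obvious remote-execution parent-process heuristics.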

Random targets

PurpleSharp leverages LDAP queries to identify and randomly pick suitable simulation targets. Running simulations against random targets helps to identify issues with the event pipeline and verify detection coverage across the environment.
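The target-selection idea can be sketched as follows. The actual directory search is stubbed with a static result list so only the shape of the query and the random pick are shown; the exact LDAP filter and attributes PurpleSharp uses are assumptions for illustration.

```python
import random

# Example LDAP filter for enabled Windows 10 workstations (an assumption --
# the exact filter PurpleSharp uses may differ). The matching-rule OID
# 1.2.840.113556.1.4.803 is a bitwise AND; bit 2 of userAccountControl
# means the account is disabled.
LDAP_FILTER = (
    "(&(objectCategory=computer)(operatingSystem=Windows 10*)"
    "(!(userAccountControl:1.2.840.113556.1.4.803:=2)))"
)

# Stubbed search results; in practice these would come from an LDAP query
# against the domain using the filter above.
search_results = [
    {"name": "WKS01", "dNSHostName": "wks01.corp.local"},
    {"name": "WKS02", "dNSHostName": "wks02.corp.local"},
    {"name": "WKS03", "dNSHostName": "wks03.corp.local"},
    {"name": "WKS04", "dNSHostName": "wks04.corp.local"},
]

def pick_targets(computers, n):
    """Randomly choose n simulation targets from the candidate pool."""
    return random.sample(computers, min(n, len(computers)))

targets = pick_targets(search_results, 2)
print(len(targets))  # 2
```

Because each run lands on different endpoints, a broken forwarder or misconfigured subscription anywhere in the fleet eventually gets caught instead of being masked by a fixed, known-good test host.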

Diverse Attack Scenarios

Where possible, I have tried to implement different variations of the same technique. The goal is to confirm that detections are resilient and to identify gaps in detection coverage. E.g., a password spray attack using Kerberos looks completely different in the logs compared to a password spray using NTLM. Can you detect both?

For more information on these points or other features/capabilities, visit the documentation.

In Part 2 of this series, I will describe PurpleSharp’s architecture and how it deploys simulations.


@mvelazco #AdversarySimulation #ThreatDetection #PurpleTeam