Alerting and Detection Strategy Framework

Palantir
Dec 19, 2017

Alerting and Detection Strategies (ADS) are one of the critical requirements for an effective Incident Response (IR) and Detection team. Well-functioning IR teams invest time and energy into developing new alerting hypotheses. When done well, a new ADS can provide a strong signal of anomalous and potentially malicious activity. When done poorly, an IR team can flood itself with low-signal alerts, creating fatigue and morale issues and ultimately generating a net-negative impact on its mission of detecting and eradicating evil.

This blog post details how Palantir’s IR team develops ADS, the process framework we use, and the pitfalls we experienced in early alert development. Additionally, we are releasing example ADS on our GitHub repository. The GitHub project provides the building blocks organizations need to adopt this framework and improve the efficacy of their detection strategies, and we hope that sharing a subset of our internal ADS can help inspire discussion and sharing of alerts more broadly.

The Case for a Framework

An ADS framework is a set of documentation templates, processes, and conventions concerning the design, implementation, and roll-out of ADS. Prior to agreeing on such a framework, we faced major challenges with the implementation of alerting strategies: the lack of rigor, documentation, peer review, and an overall quality bar allowed low-quality alerts to be deployed to production systems. This frequently led to alerting apathy, or to additional engineering time and resources being spent fixing and updating alerts.

Some potential issues experienced by ad-hoc or immature alert development include:

The alert does not have sufficient documentation.

  • The on-call engineer may not be familiar with the software or target OS.

The alert is not validated for durability.

  • Alerts may be based on narrow criteria, which creates false negatives.

The alert is not reviewed prior to production.

  • Change control may not be required.

ADS Framework

Our ADS framework was designed to mitigate the above-mentioned issues. It helps us frame hypothesis generation, testing, and management of new ADS. The framework has the following sections:

  • Goal
  • Categorization
  • Strategy Abstract
  • Technical Context
  • Blind Spots and Assumptions
  • False Positives
  • Validation
  • Priority
  • Response
  • Additional Resources

Each section must be completed and reviewed prior to deploying a new ADS. This guarantees that any given alert has sufficient documentation, is validated for durability, and is reviewed prior to production deployment. All of our production or draft alerts are documented according to this template, and the documentation is stored in a durable, version-controlled, and centralized location (e.g., Wiki, GitHub entry, etc.).

Subsequent changes to the ADS, such as modifying a whitelist, altering the strategy, or noting additional blind spots, should be incorporated into the ADS documentation at the time of modification. Internally, we use a model similar to GitHub where any feature request or bug is opened as an issue for the respective ADS. One of the on-call engineers will triage the respective bug or feature request, make the change, and then update the ADS documentation before closing the issue out. This process ensures that ADS documentation stays current while leaving an audit trail of changes.

Let us now look at the sections in more detail.

Goal gives a short, plaintext description of the type of behavior the ADS is supposed to detect.

Categorization provides a mapping of the ADS to the relevant entry in the MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) Framework. ATT&CK provides a common vocabulary for the post-exploitation techniques and strategies that adversaries might use.

Mapping to the ATT&CK framework allows for further investigation into the technique, provides a reference to the areas of the kill chain where the ADS applies, and can drive insight and metrics into alerting gaps. In our environment, we maintain a knowledge base that maps all of our ADS to individual components of the MITRE ATT&CK framework. When generating a hypothesis for a new alert, an engineer can simply review where we are strongest — or weakest — according to individual ATT&CK techniques.
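
To make this concrete, the sketch below shows one minimal way such a coverage map could be structured. The rule IDs (RIDs) are invented for illustration, and only T1063 (Security Software Discovery) corresponds to the case study later in this post; this is a sketch, not our internal knowledge base.

# Hypothetical sketch: map ATT&CK technique IDs to the ADS rule IDs (RIDs)
# that cover them. All RIDs below are invented for illustration.
coverage = {
    "T1063": ["RID-0042"],              # Security Software Discovery
    "T1064": ["RID-0013", "RID-0021"],  # Scripting
    "T1053": [],                        # Scheduled Task: a detection gap
}

# Techniques with no associated ADS are candidates for new alert hypotheses.
gaps = sorted(technique for technique, rids in coverage.items() if not rids)
print("ATT&CK techniques without coverage: " + ", ".join(gaps))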

Strategy Abstract is a high-level walkthrough of how the ADS functions. This describes what the alert is looking for, what technical data sources are used, any enrichment that occurs, and any false positive minimization steps.

Technical Context provides detailed information and background needed for a responder to understand all components of the alert. This should appropriately link to any platform or tooling knowledge and should include information about the direct aspects of the alert. The goal of the Technical Context section is to provide a self-contained reference for a responder to make a judgement call on any potential alert, even if they do not have direct subject matter expertise on the ADS itself.

Blind Spots and Assumptions are the recognized issues, assumptions, and areas where an ADS may not fire. No ADS is perfect and identifying assumptions and blind spots can help other engineers understand how an ADS may fail to fire or be defeated by an adversary.

False Positives are the known instances of an ADS misfiring due to a misconfiguration, idiosyncrasy in the environment, or other non-malicious scenario. These false positive alerts should be suppressed within the Security Information and Event Management tool (SIEM) to prevent alert generation when a known false positive event occurs.
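
Suppression is normally expressed in the SIEM’s own query or whitelist configuration, which varies by product. As a language-neutral illustration, the logic reduces to something like the following sketch; the field name and whitelist entries are hypothetical.

# Hypothetical sketch of false positive suppression logic. In practice this
# lives in the SIEM's whitelist/suppression configuration, not in code.
KNOWN_FALSE_POSITIVES = {
    "/Library/Little Snitch/Software Update.app",  # invented example path
    "/usr/local/bin/mgmt-agent",                   # invented example path
}

def should_alert(event):
    # Suppress events generated by known-benign parent binaries.
    return event.get("parent_path") not in KNOWN_FALSE_POSITIVES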

Validation lists the steps required to generate a representative true positive event which triggers this alert. This is similar to a unit test and describes how an engineer can cause the ADS to fire. This can be a walkthrough of steps used to generate an alert, a script to trigger the ADS (such as Red Canary’s Atomic Red Team Tests), or a scenario used in an alert testing and orchestration platform.
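
For a simple ADS, the validation step can be as small as a script that reproduces the true positive artifact. As a sketch, using the command from the case study later in this post, a trigger script might look like the following; verifying that the alert actually fires in the SIEM remains a manual step.

# Sketch of a validation trigger: execute a representative true positive
# command so the ADS should fire. Confirming the resulting alert in the
# SIEM is a separate, manual verification step.
import subprocess

def trigger_ads():
    # grep exits non-zero when nothing matches; don't treat that as failure.
    subprocess.run(
        ["/bin/sh", "-c", "ps -ef | grep Little\\ Snitch | grep -v grep"],
        check=False,
    )

if __name__ == "__main__":
    trigger_ads()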

Priority describes the alerting levels with which an ADS may be tagged. While the alert itself should reflect its priority when fired, via configuration in your SIEM (e.g., High, Medium, Low), this section details the criteria for each priority level.

Response lists the general response steps in the event that the alert fires. These steps instruct the next responder on the process of triaging and investigating an alert.

Additional Resources are any other internal, external, or technical references that may be useful for understanding the ADS.

Case Study: Alerting for Python EmPyre

Now that we have walked through the ADS framework, we will describe how an IR engineer builds and deploys a production alert. In this example, we’ll look at the development of an ADS focused on identifying Python EmPyre, a MacOS implant, on compromised workstations. In this scenario, the following steps occur:

  1. A catalyst (e.g., Tweet, article, observed actor activity) inspires an alert idea for an IR engineer. In this instance, the engineer reads an article about Python EmPyre and, since we have a large fleet of Macs, decides this is a detection strategy worth pursuing.

  2. The IR engineer looks at process execution events in the analysis environment and identifies the specific command executed by Python EmPyre that looks for Little Snitch:

/bin/sh -c ps -ef | grep Little\ Snitch | grep -v grep

Digging further into the source code for Python EmPyre reveals the explicit code for this check:

try:
    if safeChecks.lower() == 'true':
        launcherBase += "import re, subprocess;"
        launcherBase += "cmd = \"ps -ef | grep Little\ Snitch | grep -v grep\"\n"
        launcherBase += "ps = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)\n"
        launcherBase += "out = ps.stdout.read()\n"
        launcherBase += "ps.stdout.close()\n"
        launcherBase += "if re.search(\"Little Snitch\", out):\n"
        launcherBase += " sys.exit()\n"
except Exception as e:
    p = "[!] Error setting LittleSnitch in stager: " + str(e)
    print helpers.color(p, color='red')

  3. The IR engineer decides to broaden the alert slightly, focusing on any query for Little Snitch invoked via the grep binary:

grep Little\ Snitch

  4. The IR engineer annotates this behavior and performs a historical search through their SIEM for exact matches of this artifact. Looking through the last 90 days of process execution activity surfaces only hits from their own testing activity, plus some hits from systems management or updater software. This indicates that the alert is high-fidelity: aside from known management tooling, there do not appear to be any legitimate invocations of the same command in the environment.

  5. The IR engineer annotates the whitelist entries and drafts an alert for their SIEM (e.g., a scheduled query, a transform, etc.) to look specifically for the command invocation; a sketch of this logic is shown below. The ADS template is filled out with a first draft of each section.
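
SIEM query languages differ widely, so as a neutral sketch, the logic of such a scheduled query might reduce to the following. The event field names, whitelist entries, and rule ID are hypothetical; a real deployment would express this in the SIEM’s own query language.

# Hypothetical sketch of the drafted alert's logic. Field names, whitelist
# entries, and the RID are invented for illustration.
PATTERN = "grep Little\\ Snitch"               # the artifact identified above
WHITELIST = {"mgmt-agent", "softwareupdated"}  # invented known-good processes

def evaluate(events):
    # Yield an alert for each non-whitelisted match in recent process events.
    for event in events:
        if (PATTERN in event.get("cmdline", "")
                and event.get("process_name") not in WHITELIST):
            yield {"rid": "RID-XXXX", "priority": "medium", "event": event}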

  6. At this juncture, the IR engineer is at an ADS review checkpoint. The reviewer performs an inspection of the alert and validates that the ADS is functional from source to SIEM, including true positive testing. The ADS is given a rule ID (RID) used for internal tracking, and the IR engineer starts the move to production deployment. While this generates more work than simply writing an alert, it provides several benefits for the quality of the ADS:

  • The ADS must be fully completed, which guarantees a baseline of documentation.

The completed ADS for the Little Snitch case study is as follows:

Goal: Detect attempts by potentially malicious software to discover the presence of Little Snitch on a host by looking for process and command line artifacts.

Categorization: These attempts are categorized as Discovery / Security Software Discovery.

Strategy Abstract: The strategy will function as follows:

  • Record process and process command line information for MacOS hosts using endpoint detection tooling.
  • Look for any explicit process or command line interrogation of Little Snitch (e.g., grep Little\ Snitch).
  • Suppress known-good invocations, such as whitelisted management and updater tooling.
  • Alert on any other occurrence.

Technical Context: Little Snitch is an application firewall for MacOS that allows users to generate rulesets around how applications can communicate on the network.

In the most paranoid mode, Little Snitch will launch a pop-up notifying the user that an application has deviated from a ruleset. For instance, the following events could trip an interactive alert:

  • A new process is observed attempting to communicate on the network.

Due to the intrusive nature of Little Snitch popups, several MacOS implants will perform explicit checks for processes, kexts, and other components. This usually manifests through explicit calls to the process (ps) or directory (dir) commands with sub-filtering for Little Snitch.

For instance, an implant could look for the following components:

  • Running Little Snitch processes
  • Loaded Little Snitch kernel extensions (kexts)
  • Little Snitch files or directories on disk

Blind Spots and Assumptions: This strategy relies on the following assumptions:

  • Endpoint detection tooling is running and functioning correctly on the system.
  • Process execution events are being recorded and forwarded to the SIEM.

A blind spot will occur if any of the assumptions are violated. For instance, the following would not trip the alert:

  • Endpoint detection tooling is tampered with or disabled.
  • Process execution events are not recorded, or do not reach the SIEM.

False Positives: There are several instances where false positives for this ADS could occur:

  • Users explicitly performing interrogation of the Little Snitch installation, e.g., grepping for a process, searching for files.

Known false positives include:

  • Little Snitch Software Updater

Most false positives can be attributed to scripts or user behavior looking at the current state of Little Snitch. These are either trusted binaries (e.g., our management tools) or are definitively benign user behavior (e.g., the processes performing interrogation are child processes of a user shell process).

Priority: The priority is set to medium under all conditions.

Validation: Validation can occur for this ADS by performing the following execution on a MacOS host:

/bin/sh -c "ps -ef | grep Little\ Snitch | grep -v grep"

Response: In the event that this alert fires, the following response procedures are recommended:

  • Look at management tooling to identify if Little Snitch is installed on the host.
    - If Little Snitch is not installed on the Host, this may be more suspicious.

This ADS record may also be found on our public GitHub repository.

More ADS Examples

The ADS Framework can be found on our public GitHub repository, and we have also included several internal ADS that we have developed as reference examples.

We encourage other organizations to develop their own unique ADS and find a mechanism to share them with the broader InfoSec community. While there are operational security considerations around publicly acknowledging and documenting internal alerts, we hope these examples spur greater sharing and collaboration, inspire detection enhancements for other defenders, and ultimately increase the operational cost for attackers.

Further Reading and Acknowledgements

We would like to extend thanks to everyone who has contributed to the InfoSec community or assisted in the development of the ADS Framework.

Authors and Contributors
Art W., Chris L., Dane S., Joshua B., Richard S.
