Look Ahead Planner

Nov 3, 2022

Authors: Dr. Steven Gianvecchio, Michael Kouremetis & Dr. Andy Applebaum

How Caldera Planners Work

In an operation, an adversary profile determines what abilities are available and a planner decides which abilities to use as well as their order. Caldera’s default planner, the Atomic planner, sends a single usable ability at a time to each agent according to the adversary profile’s atomic ordering. An ability is usable if the agent has an executor for that ability and its facts are satisfied.
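To make the usability rule concrete, here is a minimal Python sketch of it, together with an Atomic-style selection loop. It is an illustration only, not Caldera's source code: the `Ability` and `Agent` classes, their field names, and `next_atomic_ability` are hypothetical stand-ins, and facts are simplified to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Ability:
    # Hypothetical, simplified model of an ability.
    ability_id: str
    executors: set                                    # executor names the ability supports
    required_facts: set = field(default_factory=set)  # facts the ability needs as input


@dataclass
class Agent:
    # Hypothetical, simplified model of an agent.
    name: str
    executors: set                                    # executor names available on the agent
    known_facts: set = field(default_factory=set)     # facts collected so far


def is_usable(ability, agent):
    """An ability is usable if the agent has an executor for it and its facts are satisfied."""
    has_executor = bool(ability.executors & agent.executors)
    facts_satisfied = ability.required_facts <= agent.known_facts
    return has_executor and facts_satisfied


def next_atomic_ability(ordering, agent, already_run):
    """Atomic-style choice: the first usable, not-yet-run ability in the profile's ordering."""
    for ability in ordering:
        if ability.ability_id not in already_run and is_usable(ability, agent):
            return ability
    return None
```

In this sketch an ability that needs a fact the agent has not yet collected is skipped until that fact appears, which is why the order of execution can differ from the order in the profile.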

Caldera also comes preloaded with two additional planners: Batch and Buckets. The Batch planner sends all usable abilities to each agent. The Buckets planner, a variation of the Batch planner, transitions between different states (or “buckets”) in a state machine and sends all usable abilities from the current bucket to each agent.
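The difference between the two can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the planners' actual code: `batch_plan`, `buckets_plan`, and the `is_usable` callback are hypothetical names, and the Buckets state machine is reduced to a fixed walk through the buckets in a given order.

```python
def batch_plan(abilities, agents, is_usable):
    """Batch-style: send every usable ability to every agent."""
    return {
        agent: [ability for ability in abilities if is_usable(ability, agent)]
        for agent in agents
    }


def buckets_plan(buckets, bucket_order, agents, is_usable):
    """Buckets-style: visit one bucket at a time and batch only that bucket's abilities.

    The generator is lazy, so usability is re-checked when each bucket is reached.
    """
    for bucket in bucket_order:
        yield bucket, batch_plan(buckets[bucket], agents, is_usable)
```

Either way, every usable ability is eventually sent, so a very large profile still produces a very large operation.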

These planners work well for most use cases but are less effective with large adversary profiles that contain hundreds of abilities. For example, the new “Everything Bagel” adversary uses every ability that is loaded into Caldera, which by default is over 1,200 abilities.

After running the Everything Bagel adversary with the Atomic planner, we see some limitations in the screenshot below. The planner uses many unrelated abilities and takes a very long time to complete. We would get similar results with the Batch or Buckets planners. There are too many abilities to effectively use them all.

Figure 1 — Running Everything Bagel adversary with Atomic planner

Look Ahead Planner

Motivated by these limitations, the Look Ahead planner decides what abilities to use based on anticipated future rewards. It takes as input a table of ability rewards, a depth parameter, and a discount factor. The depth parameter effectively controls the “look-ahead” horizon for the planner as it’s scoring each action, and the discount factor controls how the planner weighs future rewards.

Before describing the algorithm, some notation is outlined below:

Let g be the discount factor. This is defined globally and defaults to 0.9.
Let d be the look-ahead depth. This is defined globally and defaults to 3.
Let the set of abilities be A.
Define E : A → P(A) to be the function that maps each ability a in A to the set of abilities that follow a, where P(A) denotes the set of all subsets of A.
We say that ability b follows ability a if there is a parser for a that produces a fact that is an input for b. In this notation, if b is the only ability that follows a, then E(a) = {b}.
Let R : A → N be our reward table, mapping each ability to its immediate reward.

The following pseudo-code describes the algorithm:
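As a rough illustration of how g, d, E, and R fit together, the Python sketch below scores an ability as its discounted immediate reward plus the best score among the abilities that follow it, cutting off the recursion past depth d. It is a sketch under stated assumptions rather than the planner's actual source: the function names, the dictionary representations of E (`follows`) and R (`rewards`), the choice of the maximum over followers, and the exact depth cut-off are all assumptions made for this example.

```python
def future_reward(ability, follows, rewards, depth_limit=3, discount=0.9,
                  default_reward=1, depth=0):
    """Discounted look-ahead score of `ability` (illustrative, assumed recurrence).

    follows: dict mapping an ability to the abilities that follow it (E).
    rewards: dict mapping an ability to its immediate reward (R).
    """
    if depth > depth_limit:
        return 0.0  # beyond the look-ahead horizon, nothing more is counted
    immediate = rewards.get(ability, default_reward) * discount ** depth
    # Take the most promising follower; an ability with no followers adds nothing.
    best_follow = max(
        (future_reward(b, follows, rewards, depth_limit, discount,
                       default_reward, depth + 1)
         for b in follows.get(ability, ())),
        default=0.0,
    )
    return immediate + best_follow


def choose_ability(candidates, follows, rewards, **kwargs):
    """Pick the candidate with the highest look-ahead score."""
    return max(candidates, key=lambda a: future_reward(a, follows, rewards, **kwargs))
```

Because the recursion stops at the depth limit, the sketch terminates even if the follow relation contains cycles.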

After running the Everything Bagel adversary with the Look Ahead planner, we see some interesting results in the screenshot below. The planner uses the first ability in several sequences of related abilities (e.g., find Git repositories, create staging directory, collect ARP details).

Figure 2 — Running Everything Bagel adversary with Look Ahead planner

The planner has a default reward of 1 for each ability, so the value of an ability is mainly determined by the length of the sequence of abilities that follow.
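A short worked calculation shows why longer sequences score higher. It assumes, for illustration only, that the value of an ability is the discounted sum of rewards along the sequence that follows it, with the first ability undiscounted and the sequence no longer than the look-ahead horizon. With a reward of 1 for every ability and g = 0.9, a sequence of n abilities is then worth:

```latex
V_n = \sum_{k=0}^{n-1} g^{k} \cdot 1 = \frac{1 - g^{n}}{1 - g},
\qquad
V_1 = 1, \quad
V_2 = 1 + 0.9 = 1.9, \quad
V_3 = 1 + 0.9 + 0.81 = 2.71, \quad
V_4 = 1 + 0.9 + 0.81 + 0.729 = 3.439.
```

Under that assumption, the first ability of a four-step sequence scores more than three times as high as an ability that nothing follows.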

Customizing the Look Ahead Planner

We can customize the Look Ahead planner by editing the planner’s ability rewards in the YAML config file, which is located at plugins/stockpile/data/planners/254c7035-de7d-4d76-a888-2c09ba594eca.yml.

If we want the planner to prioritize a particular ability or sequence of abilities, we need to customize the ability rewards. The ability_rewards field contains sub-fields that map from ability IDs to reward values. The example config shown below adds an ability reward of 10 for “Exfil staged directory.”
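The effect of such a mapping can be sketched in Python. This is a hypothetical illustration, not the planner's code or its real config: the dictionary stands in for the parsed ability_rewards field, the ability IDs are placeholders rather than real Caldera IDs, and `path_value` assumes that a sequence is scored as a discounted sum of rewards with a discount factor of 0.9 and a default reward of 1.

```python
# Stand-in for the parsed ability_rewards field: ability ID -> reward value.
# "EXFIL-STAGED-DIRECTORY-ID" is a placeholder, not a real ability ID.
ability_rewards = {
    "EXFIL-STAGED-DIRECTORY-ID": 10,
}

DEFAULT_REWARD = 1  # reward used for any ability without an explicit entry
DISCOUNT = 0.9      # discount factor g


def reward_for(ability_id):
    """Look up an ability's reward, falling back to the default."""
    return ability_rewards.get(ability_id, DEFAULT_REWARD)


def path_value(path):
    """Discounted sum of rewards along a sequence of ability IDs (assumed scoring)."""
    return sum(reward_for(a) * DISCOUNT ** step for step, a in enumerate(path))


# Two hypothetical three-step sequences: one ends in the rewarded ability, one does not.
to_exfil = ["stage-files", "compress-files", "EXFIL-STAGED-DIRECTORY-ID"]
unrelated = ["ability-x", "ability-y", "ability-z"]
assert path_value(to_exfil) > path_value(unrelated)
```

Even after discounting, the reward of 10 at the end of the first sequence dominates, so the first ability in that sequence becomes the preferred choice.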

Figure 3 — Example configuration that customizes ability_rewards field within planner

The ability IDs can be found by searching for and selecting an ability from the Abilities tab as shown below.

Figure 4 — Screenshot showing how to select ability

After running the Everything Bagel adversary with customized ability rewards, we see a different result. The planner immediately uses abilities that lead to a high-reward ability, “Exfil staged directory.”

Figure 5 — Running Everything Bagel adversary with customized ability rewards

Interestingly, the resulting sequence of abilities reaches the “Exfil staged directory” ability in fewer steps than existing adversary profiles (such as the Thief adversary) and was generated automatically by the planner without this ordering being explicitly specified.

Summary

The Look Ahead planner decides what abilities to use based on anticipated future rewards. We can customize ability rewards so that the planner prioritizes sequences that lead to different abilities, such as exfiltrating files. By prioritizing abilities, the Look Ahead planner enables us to more efficiently run operations that leverage large adversary profiles or even all of Caldera’s abilities.

Resources

Caldera Homepage

Caldera GitHub

Caldera Documentation

Caldera Users Slack

Approved for public release. Distribution unlimited 22-02104-8.

Written by MITRE Caldera