Behavior tree. Turning a mess into a MES

Evgenii Safronov
Evocargo
Published in
10 min readJul 14, 2022

For the autonomous transportation service at Evocargo

Self-driving vehicles run on software made up of multiple modules — from computer vision to navigation and control. The modules are responsible for separate tasks within the system, but they are all interconnected and have a substantial influence on one another.

Figure 1: A simplified view of a multi-modular system for autonomous vehicles

The dynamics behind this define how a vehicle makes decisions and thereby how it behaves on the road. Whenever a team member adds new conditions and behaviors, we need to be sure that they fit into the overall logic and won’t cause anything unexpected to happen on the road. The more modules, actions, and states we have, the more we need a readable and scalable control structure that clearly displays how any given behavior is executed. Such a structure is often called a Mission Execution System (MES).

I work for the Planning and Control department at Evocargo, and one of my tasks has been to select and build the MES for our multi-modular autonomous driving product. I’ll share my first-hand experience with that in this article and explain why Behavior Trees can be a good choice and when, how they work, and how I managed to get faster reactions from the system.

Let’s go!

BT vs. FSM

There are two popular options for organizing an MES — a Behavior Tree (BT) or a Finite State Machine (FSM).

A Behavior Tree (BT) is a hierarchy of condition and action nodes interconnected from the top down. BTs come from the gaming industry and are perfectly suited to AI and robotics systems, including autonomous vehicles.

A Finite State Machine (FSM) is a behavior model that represents a finite number of states and transitions from one state to another. FSMs are commonly used in device automation, such as that for vending machines and elevators.

Now, let’s see how a given task can be represented both by a BT and by an FMS. Suppose a car gets a command to move from its starting point (SP) to a goal point (GP). Put very simply, the flow might be as follows:

  1. Unpark the car.
    If the car fails to unpark, it takes no further action, and the entire flow is considered to have failed.
  2. Check if the car is at its SP on the road and get it to the SP if it’s not there yet.
  3. Move to the GP.
    If something goes wrong on the way to GP, the entire flow is considered to have failed and stops.
  4. Check if the car is at the GP and get it to the GP if it’s not there yet.

A BT for this example would look like this:

Figure 2: A BT for a car moving from its starting point to its goal point

An FSM for the same example would look like this:

Figure 3: An FSM for a car moving from its starting point to its goal point

Now that we have a general idea of what BTs and FSMs look like, let’s review the differences between them.

Advantages of BTs

There is actually a good number of articles and videos available on the internet comparing BTs and FSMs, and some of these comparisons are very thorough (check out the useful materials at the end of the article). So, I won’t dive too deep into this topic, but rather simply outline a few of the main things that made us choose a BT over an FSM.

1 Visual readability

A BT gives a clear picture of the behavior data in our system — the behavior is composed in an easy-to-follow way. An FSM appears more complicated when we place it next to a BT.

As they are visually simple, BTs are popular with game designers and storytellers who want to change behaviors without involving the developers.

Figure 4: The example of BT visualization at Evocargo

2 Scalability

With a BT, we can seamlessly add new behaviors or remove obsolete ones. Such flexibility allows us to design the behavior incrementally and extend our tree gradually.

In an FSM, the nodes are connected tightly to a specific context or input-output states, so to change one action, we may also have to update many of the states and transitions.

3 Reusability

Reusability stems from modularity — we can easily divide our BT into building blocks and recombine them in new ways. The individual behaviors or subtrees are self-contained and can be moved to other contexts. Such a reuse of nodes is not that easy in an FSM, in which behaviors are tied to states.

4 Safety

In the logic of a BT, it’s easy to set the highest priority to safety checks. We locate them to the left in a subtree and thereby make sure that they will always be executed first.

Now, let’s take a closer look at how BTs are organized starting with the main components of the tree and how they relate to one another.

The logic behind executing nodes in a BT

Let’s return to Figure 2 and see how a BT is organized.

This is just a copy of Figure 2, so you don’t have to scroll up and down

Here we have the subtree move_between(SP,GP)— the one at the top, where the execution sequence starts. It has a root node (top of the picture). Child nodes are connected with arrows. Any node in the behavior tree can be executed and return one of three states (SUCCESS, FAILURE, RUNNING) as the result.

Leaf nodes are at the bottom and have no children. They can be Actions telling the car what to do like unpark or MoveRoad, or Conditions like at(SP) or at(GP). An Action node can return the SUCCESS, FAILURE, and RUNNING states, while a Condition node is usually set to return SUCCESS or FAILURE.

The logic of ticking the tree from the Root down to the Leaves and from one Leaf to another is defined by the control nodes, such as a Sequence node, indicated as [ → ] and a Fallback node [ ? ].

A Sequence node executes its child nodes one by one from left to right, expecting every child to return SUCCESS. For example, the vehicle will start moving to the GP only after it unparks successfully and makes sure that it is at its SP. If FAILURE or RUNNING is returned, the sequence is stopped, and the entire sequence returns FAILURE or RUNNING in turn.

Fallback is similar to Sequence, but it continues to the next child when the previous one returns FAILURE. This means MoveOpenSpace(SP) doesn’t need to be executed if at(SP) returns SUCCESS.

In some cases, we don’t want to tick all the children every time we execute a tree. Then, we can use the Sequence with Memory [ →* ] and Fallback with Memory [ ?* ] control nodes. For example, once we make unpark, we memorize the resulting state of this action and refrain from attempting to re-execute it while performing, e.g., MoveRoad. The memorized state will be cleared when the parent node returns SUCCESS or FAILURE, so the next time the whole tree is activated, all the children will be ticked.

How we made BTs even more flexible with Running Conditions

As I wrote above, Conditions are traditionally set to return either SUCCESS or FAILURE. However, sometimes this is not enough, so, for Evocargo’s self-driving vehicles, we let the Condition return RUNNING state, too. This approach differs from many of the implementations you can find in the literature or open-source frameworks, so let me explain this point with an example.

Let’s consider a typical use of the brake control when a vehicle needs to stop under normal conditions (not an emergency) and switch to a parked state. The parking brake might be called when the vehicle is still moving, so we need to slow it down until the speed is low enough to deploy it.

Figure 5: A Behavior Tree for the brake control system

We set the speed limit to 0 and then check if we have reached this limit, i.e., stopped the vehicle. If the speed is still higher than the limit, RUNNING is returned, and we continue checking this Condition until we get SUCCESS and can deploy the parking brake.

If we chose not to use the RUNNING state in a Condition here, we would merge set(speed limit, 0) and speed < 0.1? into the single Action to slow down to 0. This would also work, but it would be less representative of what is really happening in the system. Therefore, having the opportunity to use a RUNNING state for the Condition nodes gives a clearer logic definition for the system.

When periodically ticking BT is not enough

When designing our BT, we also had to decide how the system would process requests. The traditional approach suggests that a tree is executed periodically and reacts to incoming requests at each periodic ticking.

Now assume we need to implement the feature “discard mission.” It has to be a synchronous call from another module to our BT (e.g., ROS service). A periodically ticked BT is not a suitable solution because it has to respond right at the moment a request arrives.

This case inspired us to develop another approach, in which the tree can be executed periodically, but we can also make an additional BT tick to process each synchronous call (e.g., “discard mission”). The Behavior Tree forms a response for this call in a single tick and can start extra actions (e.g., safely stopping the vehicle). As such, the BT replies to the request without any delay. Receiving an instant reaction from the modules is especially important in emergency situations when a car needs to slow down or stop immediately. That’s why we developed the second, asynchronous, approach. It also provides more freedom in how interactions with the other modules can be designed.

Optimizing ticking time with a Blackboard

BTs are stateless by design. That means that if a chosen behavior depends on a certain state or condition (for example, speed or route), we have to tick the nodes for the info. In a complex multi-modular system, ticking other modules may take time.

To reduce the nodes’ ticking time and thereby speed up the system’s reaction, I’ve suggested using a Blackboard. This is basically a class containing variables and caching the state values from different modules. With it, the system always has up-to-date info on critical parameters, such as speed, position, orientation, and the state of the brake.

Visually editing the BT

Your approach to visualization will depend on who needs to work with the behavior trees.

In gaming companies, designers interact with behavior trees, regularly changing and testing the actions of game characters. As such, there should be a way for them to edit the trees without knowing how to write the code. The common practice is to generate a separate markup file that designers can edit either directly or through a user-friendly visual interface. They mostly only need to compose predefined actions and conditions in a BT scenario. Game designers are not expected to modify the code of actions or conditions as this is done by the developers.

Figure 6: Visualization based on an editable markup file

In robotics, it’s more likely for a developer to update the Behavior Tree structure and add new or edit the existing Actions and Conditions. In this case, supporting a visual editor for a BT might be a waste of time. In contrast, for our autonomous vehicle project at Evocargo, we’ve chosen a more code-oriented approach. That means that our developers create or edit the BT in C++ and then generate a markup file only to feed it to a visualizer and get a nice representation of the behavioral logic.

Figure 7: Code-oriented approach to visualization

Ready to dive deeper?

If you reached this point in the article, then Behavior Trees must have truly sparked your interest. Here are some useful materials for you to find out more about them.

Be sure to take a look at the textbook Behavior Trees in Robotics and AI by Michele Colledanchise and Petter Ögren. It offers broad research on BTs and their design principles.

On Petter Ögren’s YouTube channel, you can find many videos about BTs, including one comparing BTs and FSMs.

Check out Mina Pêcheux’s channel to learn about the use of BTs and FSMs in game dev and many other interesting things.

Conclusion

This was an overview of Behavior Trees based on my experience of organizing a Mission Control System for autonomous vehicles at Evocargo. We looked at the advantages of BTs over FSMs and the basics of BTs execution.

We also looked into different methods of optimizing and speeding up your entire system’s reaction time — the execution of nodes with memory, adding the RUNNING state to Conditions, the periodic and asynchronous ticking of a BT, and the use of a Blackboard.

In the end, I explained my views on the benefits of visualization as per different BT use cases.

I hope the lifehacks I shared in this post will help you create a readable and flexible control logic structure for your product.

--

--