# Pure & impure functions

Dec 4, 2019 · 6 min read

The concept of pure and impure nodes is, at its core, simple, and it becomes intuitive after a few sessions of playing with the engine. However, as a project grows, it gets easier and easier to fall into traps that cause either bad performance or unexpected results if you don’t know how the underlying system works. In this article I’ll try to shed some light on the evaluation of pure and impure functions.

# The basics

Most of you already know the difference between the two types of nodes, but for the sake of clarity, let me explain it briefly.

Basically, impure nodes are the ones that have execution pins and execute actions, while pure nodes are used to build the parameters of those actions. Of course, we’ll see exceptions, but that is the core concept.

In the example above, the impure node is something you “execute”, while the pure nodes connected to it are only used to construct the inputs of this call.

We can easily switch between these two types by setting a flag in the property panel of a function.

# Evaluation

The important thing about the evaluation is that for each impure call, every connected pure node is calculated exactly once, from the leaves of the graph to the pins of the impure node. Let’s break this down.

In the graph above, although the same pure node is used in the calculation 3 times in total, it’s only evaluated once. This is really important to understand, as it can affect the result of your calculation if your pure nodes are non-deterministic.

Wait, what’s a deterministic function?

It simply means that for a given input it always gives the same result. In math, for example, addition and subtraction are deterministic: 2+2 is always 4 (in the decimal system at least), whatever happens in the world. The same is not true of, say, a function that queries the current time in nanoseconds for a given time zone from the internet: it does not always return the same value for the same time zone, because time keeps passing.

Now, with this out of the way, let’s see an example where non-determinism can affect the logic of your pure nodes if you don’t pay attention.

The logic of the graphs above looks the same: all we did in the bottom one was use 3 different nodes instead of connecting a single one to 3 different pins. In reality though, they are going to produce different results: in the first case only one random number is generated and then used for all the calculations, while in the second case 3 different numbers are generated, which (very likely) results in a different output.

Another part of our statement is also extremely important: pure nodes are evaluated for each impure call.
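Since Blueprint graphs are visual, here is a rough text-only sketch of the same idea in Python. It is a toy model of the rule as stated in this article, not how the engine is actually implemented; `PureNode` and `impure_call` are made-up names used only for illustration.

```python
import random

class PureNode:
    """Toy stand-in for a pure Blueprint node (hypothetical model, not engine code)."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self.eval_count = 0

    def evaluate(self, cache):
        # Within one impure call, each node instance is computed at most once.
        if self not in cache:
            args = [node.evaluate(cache) for node in self.inputs]  # leaves first
            self.eval_count += 1
            cache[self] = self.fn(*args)
        return cache[self]

def impure_call(*pins):
    """Toy impure node: starts with a fresh cache on every call."""
    cache = {}
    return [pin.evaluate(cache) for pin in pins]

rng = random.Random(42)

# Top graph: one random node wired to three pins.
shared = PureNode(rng.random)
top = impure_call(shared, shared, shared)

# Bottom graph: three separate random nodes, one per pin.
separate = [PureNode(rng.random) for _ in range(3)]
bottom = impure_call(*separate)
```

In this model `top` holds the same number three times, while `bottom` holds three independently drawn numbers. Calling `impure_call(shared)` again draws a new number, because the cache only lives for a single impure call.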

Let’s assume that path finding is really slow (as it probably would be in a real game). The example above would be far from optimal, as we would run the same path finding twice, once for each of the two impure nodes that use its result, since pure nodes are always evaluated for each impure call. This can easily be mitigated by using an impure node instead:

And that’s all. This works because impure nodes cache their results, so you can use them a thousand (or more) times without recalculating anything. Naturally, it has a drawback: it goes against the principle of “impure nodes for executing actions, pure ones for setting up the parameters”. In this case, we are only using an impure node for its caching side effect. It’s not as nice, but it is more efficient.

Going further, let’s see another horrible example, one that shows up far too often:
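To make the cost difference concrete, here is a small Python sketch of the two wirings. All names (`find_path`, `move_along`, `draw_debug`) are hypothetical stand-ins rather than engine functions, and the counter only exists to show how often the expensive call runs.

```python
calls = {"find_path": 0}

def find_path(start, goal):
    """Hypothetical stand-in for an expensive path-finding function."""
    calls["find_path"] += 1
    return [start, "waypoint", goal]

def move_along(path):
    """Toy impure action that consumes a path."""
    return len(path)

def draw_debug(path):
    """Second toy impure action that consumes the same path."""
    return " -> ".join(path)

def pure_wiring():
    # Pure node: every impure call evaluates its own copy of the input expression.
    move_along(find_path("A", "B"))
    draw_debug(find_path("A", "B"))

def impure_wiring():
    # Impure node: executed once, its output is stored and both actions read it.
    path = find_path("A", "B")
    move_along(path)
    draw_debug(path)

def count_calls(wiring):
    """Run one wiring variant and report how often find_path ran."""
    calls["find_path"] = 0
    wiring()
    return calls["find_path"]
```

Here `count_calls(pure_wiring)` reports two path-finding runs, while `count_calls(impure_wiring)` reports one, no matter how many more actions read the stored path.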

Looks quite average, doesn’t it? The problem is that the loop node is just a macro, and inside it looks like this:

A bunch of impure nodes! Due to the way macros work, they don’t cache any input or output (even though they can look like impure nodes with execution pins!), as they are simply copy-pasted into your graph at compile time. This means that whatever input a macro has, it is going to be evaluated every time an impure node inside the macro uses it. In the case of this loop macro, the inputs are going to be evaluated 2n+1 times (follow the execution in the macro if you don’t believe me)! So if you have, for example, 8 matching components in the example above, you are going to query, filter and copy the array containing the components 17 times! That’s just horrible for performance, especially with expensive functions. Always use variables or the results of impure functions when you are using a Blueprint loop (if you care about performance at all).
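If you’d rather count than follow wires, the 2n+1 figure can be reproduced with a small Python sketch. It assumes the loop macro reads its array input once for every bounds check (n+1 times) and once for every element fetch (n times); the function names are made up for this illustration.

```python
def count_reads_through_macro(n):
    """Count how often a pure array input is evaluated by a macro-style loop."""
    reads = 0

    def array_input():
        # Pure expression wired into the macro's array pin; re-run on every read.
        nonlocal reads
        reads += 1
        return list(range(n))

    index = 0
    while index < len(array_input()):   # bounds check: n + 1 reads
        _ = array_input()[index]        # element fetch: n reads
        index += 1
    return reads

def count_reads_with_variable(n):
    """Same loop, but the array is stored in a variable before looping."""
    reads = 0

    def array_input():
        nonlocal reads
        reads += 1
        return list(range(n))

    cached = array_input()              # evaluated once, then reused
    index = 0
    while index < len(cached):
        _ = cached[index]
        index += 1
    return reads
```

With 8 elements, `count_reads_through_macro(8)` gives 17 evaluations, whereas `count_reads_with_variable(8)` gives 1.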

Alright, let’s get back to our original graph:

The third part of our statement was that nodes are evaluated from the leaves to the root. In this example, that means the pure nodes furthest from the impure node are evaluated first, and the ones connected directly to its pins last. In general, this is common sense and it doesn’t really matter; it’s still worth mentioning in some cases though, especially when some pure calls are non-constant.

Non-constant?

Constant means that the function does not change the entity it belongs to: it does not modify its variables, and it does not call other functions that might modify the object’s state. Getters and math functions, for example, are constant: they only get or calculate data and do not modify anything. On the other hand, functions that set variables or otherwise change the entity they belong to are non-constant. You can mark any function as constant in the details panel to enforce this (which is a great idea for getters and math, for example).

As you’d guess, most pure functions are constant: you use them to build the parameters of the impure nodes, so they shouldn’t really change anything. That’s the case 99% of the time. There are some exceptions though, for example singletons. I don’t want to go deep into programming paradigms, so instead, let’s just check this fairly simple graph:

All it does is return an entity if it exists; otherwise it creates the entity and then returns it (this is called a singleton). The point is that the entity doesn’t exist until you need it; once you start working with it, it is created only once, and that same entity is returned every time after that. This way, the following graph will be completely okay:

Evaluation starts from the leaves (nodes without inputs), so even though the entity may not exist yet at the beginning of the graph, you can be sure that the getter is executed first, so you’ll have a valid object before getting or setting anything on it.
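For readers who think in code, the create-on-first-use getter might look roughly like this in Python. `Owner`, `Manager`, `get_manager` and `value` are hypothetical names picked for this sketch, not anything from the engine or from the graph.

```python
class Manager:
    """Hypothetical entity that is created on first use."""

    def __init__(self):
        self.value = 0

class Owner:
    def __init__(self):
        self._manager = None
        self.created = 0  # only here to show how many times creation happened

    def get_manager(self):
        # Non-constant getter: it changes the owner's state on the first call.
        if self._manager is None:
            self._manager = Manager()
            self.created += 1
        return self._manager

owner = Owner()
# The getter runs before the assignment, so there is always a valid object to set on.
owner.get_manager().value = 5
result = owner.get_manager().value
```

After this runs, `result` is 5 and `owner.created` is 1: the entity was created on the first access and simply returned on the second.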

So, in short, that’s how pure and impure nodes relate to each other. If you feel I didn’t explain something well enough, or you find mistakes or think something is missing from the article, let me know!

Cheers!
