In programming, a pure function is a function that has the following properties:
- The function always returns the same value for the same inputs.
- Evaluation of the function has no side effects. Side effects refer to changing other attributes of the program not contained within the function, such as changing global variable values or using I/O streams.
Effectively, a pure function’s return value depends only on its inputs, and the function has no other dependencies on, or effects on, the overall program.
Pure functions are conceptually similar to mathematical functions. For any given input, a pure function must return exactly one possible value.
Like a mathematical function, however, a pure function may return that same value for other inputs. Its output is likewise determined solely by its inputs, not by values stored in some other global state.
1. A function is pure if it has no side effects and always returns the same output for the same input.
2. A function is not pure if it does not always return the same value for the same input, for example when the output depends on the input `x` as well as on an internally computed random value.
3. A function is also not pure if, even though it always returns the same value for the same input, it has side effects, such as modifying the value of a global variable.
There are several benefits to using pure functions, both in terms of performance and usability.
1. Readability
Pure functions are much easier to read and reason about. All relevant inputs and dependencies are provided as parameters, so no effects are observed that alter variables outside of the set of inputs.
This means that we can quickly understand a function and its dependencies just by reading the function’s declaration. So, if a function is declared as `f(a, b, c)`, then we know that only `a`, `b`, and `c` are dependencies of `f`.
2. Portability
As all dependencies are provided as input parameters and are not accessed through a global context, these dependencies can be swapped out depending on the context in which the function is called.
This means that the same function can act on different implementations of the same resource, for example.
This makes the code much more portable and reusable as the same function can be used in various contexts, rather than having to write a different function just to use a different implementation of the same class.
For example, instead of having to write two different impure functions to use two different loggers that are stored globally, we can write a single function that takes the desired logger as an input.
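A minimal sketch of the logger example, assuming two hypothetical logger classes that share a `log(message)` method. The point shown is only that the dependency arrives as a parameter; writing to a logger is still an effect on that logger object.

```python
class ListLogger:
    """Hypothetical logger that keeps messages in memory."""

    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message)


class UppercaseLogger:
    """Hypothetical second implementation with the same interface."""

    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message.upper())


def add_and_log(a, b, logger):
    # The logger is an explicit parameter rather than a global, so the
    # caller decides which implementation is used.
    result = a + b
    logger.log(f"add({a}, {b}) = {result}")
    return result


plain = ListLogger()
loud = UppercaseLogger()
add_and_log(1, 2, plain)  # same function, first logger
add_and_log(1, 2, loud)   # same function, second logger
```

The single function `add_and_log` works with either logger, so no second copy of the function is needed to switch implementations.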
3. Testability
The lack of side effects makes pure functions very easy to test, as we only need to test that the inputs produce the desired outputs. We do not need to check the validity of any global program state in our tests of specific functions.
In addition, as all dependencies are provided as inputs, we can easily mock dependencies. In an impure setting, we would have to keep track of the state of some global dependency throughout all of the tests.
However, in a pure setting, we would simply provide all dependencies as input. We no longer have to worry about maintaining global state throughout our tests, and we can now potentially provide different versions of dependencies to different tests.
This allows us to test functions while explicitly having control over the provided dependencies in each test.
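As a rough illustration of supplying a different dependency to each test, the sketch below passes a stub function in place of a real one. The function `total_with_tax` and its `tax_rate_lookup` parameter are hypothetical names invented for this example.

```python
def total_with_tax(prices, tax_rate_lookup, region):
    """Sum the prices and apply the rate returned by the supplied lookup."""
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate_lookup(region))


def test_total_with_fixed_rate():
    # Stub dependency: always a 10% rate, whatever the region.
    fixed_rate = lambda region: 0.10
    result = total_with_tax([10, 20], fixed_rate, "anywhere")
    assert abs(result - 33.0) < 1e-9


def test_total_with_zero_rate():
    # A different stub for a different test; no global state to reset.
    zero_rate = lambda region: 0.0
    assert total_with_tax([10, 20], zero_rate, "anywhere") == 30


test_total_with_fixed_rate()
test_total_with_zero_rate()
```

Each test builds exactly the dependency it needs and checks only the returned value, so the tests cannot interfere with one another through shared state.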
4. Referential transparency
Referential transparency refers to being able to replace a function’s call with its corresponding output value without changing the behavior of a program.
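To achieve referential transparency, a function must be pure. This has benefits in terms of readability and speed: a call can be read as if it were its value, and compilers are often able to optimize code that exhibits referential transparency, for example by evaluating a repeated call only once.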
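The substitution can be shown with a small sketch. The function `area` is a hypothetical example, not taken from the text above.

```python
def area(width, height):
    # Pure: depends only on its arguments and changes nothing else.
    return width * height


# Original expression, with two identical calls.
total = area(2, 3) + area(2, 3)

# Because area is pure, each call can be replaced by its value, 6,
# without changing what the program computes.
total_substituted = 6 + 6

assert total == total_substituted
```

The same substitution would not be safe for a function that reads or modifies global state, since removing a call would also remove its effect.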
5. Caching
As pure functions always return the same output for the same input, we can cache the results of pure function calls.
Caching refers to using a technique, such as memoization, to store the results of functions so that we only need to calculate them once.
Typically, for a function `f: Input -> Output`, this is accomplished through a map (such as a hash map) from `Input` to `Output`.
When executing the function, we first check whether the map contains the input as a key. If it does, we return the stored output value; otherwise, we calculate `f(input)` and store the output in the map before returning it.
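The lookup-then-compute procedure described above might be sketched as follows, using a Python dictionary as the map. The names `memoize`, `square`, and `cached_square` are illustrative, and the sketch assumes a one-argument function whose input is hashable.

```python
def memoize(f):
    """Wrap a pure one-argument function f with a dictionary cache."""
    cache = {}  # maps input -> output

    def wrapper(x):
        # First check whether the input is already a key in the map.
        if x in cache:
            return cache[x]
        # Otherwise compute f(x), store it, and then return it.
        result = f(x)
        cache[x] = result
        return result

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


def square(x):
    return x * x


cached_square = memoize(square)
first = cached_square(12)   # computed and stored in the map
second = cached_square(12)  # returned from the map without recomputing
```

This is only safe because `square` is pure: a cached result stays valid forever. In practice, Python's standard library provides `functools.lru_cache` for the same purpose.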