Tail recursion theory made simple

Couple years ago, when I first heard about tail recursion (and functional programming in general), it felt quite unintuitive to me to be honest. Just until recently when I started learning Elixir I finally did understand the concept of it. So what is tail recursion, how does it work, what advantages does it have, and what are the performance differences between it and “normal” i.e. linear recursion?

Examples and comparisons in this article will be written in Elixir, which is a fully functional programming language that supports tail recursion; and Python, which, to some extend, does support functional programming paradigm, but doesn’t support tail recursion, this is to show you that it’s not about how the code is written, but how is it executed.

So let’s consider a very basic example: factorial. There are many ways to define the factorial, one of which is:
n! = n * (n-1)!
This easily corresponds to a very simple recursive procedure:

defmodule Factorial do
def of(1), do: 1
def of(n) when n > 1 do
n * of(n-1)
def factorial_of(n):
if n > 1:
return n * factorial_of(n-1)
return 1

If you do not know Elixir, let me just explain to you, that it uses pattern matching in function definitions, which means that if first (and in this case “only”) argument is 0 ( zero) the first definition will be used; the order of definitions is important, as n might be anything including 0 (zero).

You can surely see that process of computing factorial_of(4) would look like this:

4 * factorial_of(3)=
4 * (3 * factorial_of(2))=
4 * (3 * (2 * factorial_of(1))=
4 * (3 * (2 * (1))=
4 * (3 * 2)=
4 * 6=

Now let’s take a different approach at computing factorial, that is actually an iterative approach:
We will be starting with n = 1 and multiplying it by n + 1, until n reaches the desired number. So this will give as:
product = product * counter
counter = counter + 1
Back to code then, but to simplify the code a bit let’s implement it the other way around, so that the counter is decremented:

defmodule ItrFactorial do
def of(1, product), do: product
def of(counter, product \\ 1) when counter > 1 do
of(counter-1, product*counter)
def itr_factorial_of(counter, product=1):
if counter > 1:
return itr_factorial_of(counter-1, product*counter)
return product

This Elixir code should be fairly easy to understand if you have understood the first example. On the first glance code might look somehow similar, but it is computed completely different, let’s see how that computation could look like:

itr_factorial_of(4, 1)=  # product defaults to 1
itr_factorial_of(3, 1*2)=
itr_factorial_of(2, 2*2)=
itr_factorial_of(1, 3*4)=

Now this looks way more simple, cleaner, has fewer steps, doesn’t recurs back, so we might expect it to be faster, and less memory consuming. Now just a reminder from computer architecture lectures: when a function is called the processor has to jump to it’s code, and the current return address, register values and variables has to be pushed to the stack, that whole pushed data for a single function call is called a stack frame. What does it mean to us? That if computing is not somehow optimized all previews function calls will still be held on the stack. Tail recursion is this optimization!

Can you see the main difference between the recursive and iterative implementations? The first one that probably comes to ones mind is using two arguments instead of just one, but that’s just the side effect of the real thing that causes the iterative implementation to be actually iterative. The key thing is that in the iterative solution the function returns only call to it self or the final product, there’s no more calculation “waiting to be done” as it was in the recursive example when the returned value had to be still multiplied by n. This means that it’s not necessary to save the return address, and processor can use the same stack frame for every call. But this only happens if the language supports tail recursion, as does Elixir.

In Python there’s no support for tail recursion, so the interpreter doesn’t optimize this computation at all. Easiest way to check it is trying to compute itr_factorial_of(n), where n is over 999. This should raise RecursionError: maximum recursion depth exceeded in comparison (in Python 3.5.x, earlier it was ValueError IIRC) which is a clear sign that the stack is growing with the number of iterations and finally exceeding interpreter’s default maximum recursion depth. However in Python there are loops, which aren’t implemented in pure functional languages.

I hope this will bring some light on the topic for people that are just learning functional programming, or are thinking about it.