From Recursion to Very Large Numbers: The Ackermann function

Matthew A. Pagan
The Modern Scientist
12 min read · Feb 10, 2023

Serbian paramilitary insurgents assassinated the heir to the throne of Austria-Hungary when Wilhelm Ackermann was only halfway through his second semester at the Georg August University of Göttingen. When the German Empire declared war on Serbia’s ally Russia in August of 1914, Ackermann had completed only one full semester of physics, philosophy, and math courses.

Wilhelm Ackermann, (1896–1962). Public Domain.

The Second Reich’s subsequent mobilization of troops led to the conscription of all German men between the ages of 17 and 45. Wilhelm Ackermann, then 18, dutifully returned to his hometown of Herscheid, nearly 100 miles away, to register for this draft. Imperial German conscription rules required Ackermann to present himself between Jan 15 and Feb 1 of 1915 for the draft lottery. The Ober-Ersatzkommission notified young Ackermann that his military service would begin on October 1st of that year.

His selection into the army forced him to quit his studies at Göttingen. His three-year military service, which he completed either as part of the cavalry or the mounted artillery, concluded just one month before the abdication of the Kaiser led to the formation of the Weimar Republic and the signing of the Compiègne Armistice that ended the war.

A discharged Ackermann re-enrolled at the University of Göttingen in 1919, deciding to focus on mathematics.

Wilhelm Ackermann completed his PhD in 1925 as an advisee of David Hilbert. Ackermann’s doctoral thesis, “Begründung des ‘tertium non datur’ mittels der Hilbertschen Theorie der Widerspruchsfreiheit” (Foundation of the ‘tertium non datur’ using Hilbert’s theory of consistency), built on the so-called Hilbert program, the task of axiomatically defining the foundations for all of mathematics. After Kurt Gödel published his Incompleteness Theorems in 1931, further research pursuant to the Hilbert program could only be done with substantial revisions to Hilbert’s original proposal.

Origins of the Ackermann function

As a postdoctoral researcher, Ackermann continued to work under David Hilbert, who procured for him a position rewriting Hilbert’s lecture notes into book form. As the funding for this work depleted, Wilhelm Ackermann entered the teaching profession outside of academia as a math instructor at the Gymnasium in Burgsteinfurt, a position equivalent to a preparatory high school teacher.

Nevertheless, Ackermann continued his mathematical research, especially in the verdant field of computation. In 1928, Ackermann published “Zum Hilbertschen Aufbau der reellen Zahlen”, which would not become available in English for another thirty-nine years. Jean van Heijenoort published an anthology of mathematical translations in 1967, including this paper by Ackermann, which he translated as On Hilbert’s construction of the real numbers. Ackermann introduced in this paper a mathematical function now known as the “three-variable Ackermann function”.

The necessity of recursion in the definitions of functions remained an open question in the early twentieth century. When Richard Dedekind defined recursive functions as a unique computational class in his 1888 work, Was sind und was sollen die Zahlen? (What are numbers and what should they be?), he did so under the assumption that any function which could be defined in terms of itself (i.e., recursively) could also be defined in iterative terms (i.e., without recursion). Ackermann’s paper demonstrated an example function which could only be defined recursively.

It was Rózsa Péter who coined the term “primitive recursive” in a 1932 presentation to the International Congress of Mathematicians in Zürich, differentiating between functions such as the Fibonacci, which could be defined recursively or iteratively, and functions such as Ackermann’s, which had no iterative definition.

The Ackermann function was not the first counterexample to this assumption (it was the second), but it did become the best-known. A fellow graduate student of Hilbert’s, Gabriel Sudan, published his definition of the Sudan function in 1927, a year before Ackermann published his own similar findings. Together, the Sudan function and the Ackermann function settled the theoretical ambiguity in the negative: the set of functions eventually known as primitive recursive was not identical with the set of all recursive functions.

Properties of the Ackermann function

Software developer Adam Kozicki blogged a reasonable summary of where the Ackermann function fits among the definitions of computability theory:

[T]he Ackermann function is a total computable function (which means it is recursive) but is not primitive recursive (which means it has no iterative alternative).

Put another way, the recursive functions are a superset of the primitive recursive functions. Given that the Ackermann function is recursive but not primitive recursive, we can say:

  • The Ackermann function has a formal definition that references the Ackermann function.
  • There is no way to define the Ackermann function by primitive recursion, that is, without this kind of self-reference.
  • There is no way to compute the Ackermann function using only bounded for-loops; an iterative computation requires unbounded while-loops and explicit bookkeeping.
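For intuition about the last point: primitive recursive functions are exactly those computable with nested for-loops whose iteration counts are fixed before each loop begins. A sketch (my own illustration, not from the article) of such a function, multiplication built from bounded loops only:

```python
def mul(a: int, b: int) -> int:
    """Multiplication as a primitive-recursive-style computation:
    only bounded for-loops, with iteration counts known up front."""
    total = 0
    for _ in range(a):      # bound fixed before the loop runs
        for _ in range(b):  # likewise
            total += 1
    return total

print(mul(6, 7))  # 42
```

The Ackermann function eventually outgrows every function computable this way, no matter how many nested bounded loops are allowed.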

The Ackermann function demonstrated that a well-defined, total, computable function need not be primitive recursive. These concepts of “total function” and “computable function” have definitions worth examining if we want a fuller understanding of the significance of Ackermann’s discovery.

Any computable function can be solved using the rules of the theoretical Turing machine. Furthermore, work by Alonzo Church and Alan Turing in 1936, the basis of the Church-Turing thesis, showed that any algorithm computable by a Turing machine (and which will eventually halt) can alternatively be expressed in the lambda calculus, which uses a recursive syntax. Therefore, a computable function is one that can be described using recursion.

A total function is any function defined for every input in a given numerical set. The Ackermann function is defined over the whole numbers (the non-negative integers). While a modern programming language would normally rely on implementation details to restrict its inputs as needed, mathematically speaking, a function defined for every element of a well-defined set such as ℕ₀ (the set of whole numbers) is considered total. By contrast, a partial function leaves some inputs from its set without a corresponding output.
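As a quick illustrative sketch (my own example, not from the article): the successor function is total on the whole numbers, while a halving function defined only on even numbers is partial:

```python
def successor(n: int) -> int:
    """Total on the whole numbers: every input has an output."""
    return n + 1

def half(n: int) -> int:
    """Partial on the whole numbers: odd inputs have no output."""
    if n % 2 != 0:
        raise ValueError("half is undefined on odd inputs")
    return n // 2

print(successor(4))  # 5
print(half(4))       # 2
```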

The Ackermann function is a total, computable function that is not primitive recursive. Another way to say this: the Ackermann function is defined over the set {0, 1, 2, …}; it can be computed using recursion; but it cannot be computed by primitive recursion, i.e., with bounded loops alone.

Definition of the Ackermann function

Today, motley modifications of the function Ackermann defined in his 1928 paper can be found bouncing around math websites, internet wikis, and college textbooks, all of them styled “Ackermann’s function”.

Dr. Robert Munafo has enumerated eleven separate Ackermann variations he’s found.

Ackermann’s paper originally defined the function as a computation over three variables. The Ackermann function commonly referred to in textbooks today is one that has been reduced to two variables. Rózsa Péter first showed this simplification to two variables in her 1935 paper Konstruktion nichtrekursiver Funktionen. Her innovation has suggested the name Ackermann-Péter function, which one occasionally sees used as well.

Rózsa Péter, (1905–1977). Public Domain.

A two-variable version of the Ackermann function is what the National Institute of Standards and Technology (NIST) adopted into its “Dictionary of Algorithms and Data Structures” under the entry Ackermann’s function in 1998.

This NIST definition was first described by neither Rózsa Péter nor by Wilhelm Ackermann, but by Raphael Robinson, an American who built on their work in his 1948 paper “Recursion and Double Recursion”.

The Robinson definition that NIST accepted runs:

  • Ackermann(0, j) = j + 1 for j ≥ 0
  • Ackermann(i, 0) = Ackermann(i - 1, 1) for i > 0
  • Ackermann(i, j) = Ackermann(i - 1, Ackermann(i, j - 1)) for i, j > 0

In plain English:

If the first input is zero, then increment the second input and output that 
result.

If the first input is non-zero but the second input is zero, then output
the result of another Ackermann operation using the decrement of the first
input as the first input and 1 as the second input.

If both inputs are non-zero, then decrement the second input. Feed that
result into an Ackermann operation as its second input. The first input
should remain the same as this whole operation's first input. Now take the
result of that Ackermann operation and feed it as input into another
Ackermann operation, as the second input. The first input should be the
whole operation's first input, decremented. Output the result of that
second Ackermann operation.

Implementations

I first became aware of the Ackermann function from Harold Abelson and Gerald Sussman’s 1985 book The Structure and Interpretation of Computer Programs. They define the Ackermann function using the Scheme dialect of the Lisp programming language:
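The Scheme procedure appears in SICP’s Exercise 1.10; transcribed into Python (from memory, so treat the details accordingly), it looks like:

```python
def A(x: int, y: int) -> int:
    """Python transcription of the Ackermann variant from SICP
    (Exercise 1.10); note the different base cases from the
    NIST definition."""
    if y == 0:
        return 0
    if x == 0:
        return 2 * y
    if y == 1:
        return 2
    return A(x - 1, A(x, y - 1))

print(A(1, 10))  # 1024
print(A(2, 4))   # 65536
```

This variant computes A(1, n) = 2ⁿ and A(2, n) = 2↑↑n for n ≥ 1.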

Here it’s obvious that the Robinson/NIST definition is not followed; instead, the book uses a variant, exactly the same one described by Kenneth Rosen in his 1988 textbook Discrete Mathematics and Its Applications.

Incidentally, this Abelson-Sussman/Rosen variant is not included among the eleven enumerated on Robert Munafo’s “Versions of Ackermann’s Function” page. (The NIST variant is there, under a subhead that states “Raphael M. Robinson simplified Peter’s version in 1948.”)

Haskell

Like Lisp, Haskell is a computer programming language that’s designed to be well-suited for recursive definitions.

A Haskell implementation of the (NIST) Ackermann function, using guard notation, could look as follows:

ackermann :: Integer -> Integer -> Integer
ackermann m n
  | m == 0          = n + 1
  | m > 0 && n == 0 = ackermann (m - 1) 1
  | m > 0 && n > 0  = ackermann (m - 1) (ackermann m (n - 1))

Python

In Python, a language prized for readability, one might implement NIST-Ackermann thus:

def ackermann(m: int, n: int) -> int:
    if m == 0:
        return n + 1
    elif m > 0 and n == 0:
        return ackermann(m - 1, 1)
    elif m > 0 and n > 0:
        return ackermann(m - 1, ackermann(m, n - 1))
    else:  # Correct inputs will never take us here
        return -1

Experimenting with this function yourself will likely bring to mind another quality of Python: it is a language particularly ill-suited for recursion.

Executing this function for any value of m greater than 3 (except for the trivial case where n is 0) will probably result in the interpreter throwing you a RecursionError.

The Python documentation describes the RecursionError exception this way:

This exception is derived from RuntimeError. It is raised when the interpreter detects that the maximum recursion depth (see sys.getrecursionlimit()) is exceeded.

So there’s an interface in the sys module to check the depth limit of the stack and prevent a crash. On my laptop, calling the getrecursionlimit function returns an integer value of 1000.

$ python3 -c 'import sys; print(f"Recursion limit: {sys.getrecursionlimit()}")'
Recursion limit: 1000

But what does this value mean? Documentation to the rescue again:

sys.getrecursionlimit()

Return the current value of the recursion limit, the maximum depth of the Python interpreter stack. This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python. It can be set by setrecursionlimit().

Can we make more progress on an Ackermann computation by setting this value as high as possible? Let’s try:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.7.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import sys

In [2]: a_big_int = 0xffffff

In [3]: sys.setrecursionlimit(a_big_int)

In [4]: def ackermann(m: int, n: int) -> int:
   ...:     if m == 0:
   ...:         return n + 1
   ...:     elif m > 0 and n == 0:
   ...:         return ackermann(m - 1, 1)
   ...:     elif m > 0 and n > 0:
   ...:         return ackermann(m - 1, ackermann(m, n - 1))
   ...:     else:
   ...:         return -1
...:

In [5]: ackermann(3, 1)
Out[5]: 13

In [6]: ackermann(4, 1)
Segmentation fault (core dumped)

By removing Python’s built-in guard rails, we have crashed the Python interpreter directly by overflowing the C stack.

I will save us some time and assert that dynamic programming (storing computed values in a table for later reuse) will not shrink the computation time enough to effectively help us either.
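It is, however, possible to sidestep the RecursionError itself (though not the astronomical running time) by keeping the pending calls on an explicit stack instead of the interpreter’s call stack. A sketch (names are mine):

```python
def ackermann_iter(m: int, n: int) -> int:
    """NIST Ackermann computed with an explicit stack of pending
    first arguments, so Python's recursion limit never applies."""
    stack = [m]
    while stack:
        m = stack.pop()
        if m == 0:
            n = n + 1                 # Ackermann(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)       # Ackermann(m, 0) = Ackermann(m-1, 1)
            n = 1
        else:
            stack.append(m - 1)       # outer call, applied later
            stack.append(m)           # inner call, applied first
            n = n - 1
    return n

print(ackermann_iter(3, 3))  # 61
```

This version computes ackermann_iter(3, n) for large n without touching sys.setrecursionlimit; ackermann(4, 2) remains out of reach for a different reason, sheer running time.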

Dr. David Brailsford of Computerphile demonstrated a C implementation of NIST-Ackermann that had not finished computing ackermann(4,2) after running for four weeks straight.

Answer table

The results of the NIST-Ackermann function on small non-negative integers follow this pattern:
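The small-value grid can be regenerated directly with a quick script (rows are the first argument m, columns the second argument n):

```python
def ackermann(m: int, n: int) -> int:
    """NIST/Robinson definition of the Ackermann function."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

for m in range(4):
    print(m, [ackermann(m, n) for n in range(5)])
# 0 [1, 2, 3, 4, 5]
# 1 [2, 3, 4, 5, 6]
# 2 [3, 5, 7, 9, 11]
# 3 [5, 13, 29, 61, 125]
```

The row for m = 4 starts at 13, leaps to 65533, and then past 2^65536 - 3.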

Overall, the Ackermann function has a time complexity of:

O(m * Ackermann(m, n))

It is possible to write closed-form equations for the function’s output when its first argument is fixed:

\textrm{Ackermann} (0, n) \longrightarrow f(n) = n + 1
\textrm{Ackermann} (1, n) \longrightarrow f(n) = n + 2
\textrm{Ackermann} (2, n) \longrightarrow f(n) = 2n + 3
\textrm{Ackermann} (3, n) \longrightarrow f(n) = 2^{n + 3} - 3

To write the equation for a first argument of 4, we need some other notation, since the growth is faster than exponential.

Hyperoperations

The growth rate of Ackermann(m, n) advances to the next hyperoperation in the sequence each time m increases.

In the row of the answer grid where the first argument is 4, the result gets exponentiated-by-two one additional time for each increment of the second argument.

For example, Ackermann(4, 0) is “negative 3 plus 2 to the power of 2 to the power of 2”; that’s three 2’s. Whereas Ackermann(4, 1) is “negative 3 plus 2 to the power of 2 to the power of 2 to the power of 2”; that’s four 2’s. The pattern continues this way.

We can represent row 4 (and its equation) more succinctly using Knuth up-arrow notation. Donald Knuth introduced this method for representing large numbers in his 1976 paper Mathematics and Computer Science: Coping with Finiteness.

In this notation, a single up-arrow (↑) represents exponentiation; a double up-arrow (↑↑, abbreviated ↑²) represents tetration, that is, exponentiation done a certain number of times; a triple up-arrow (↑↑↑, abbreviated ↑³) represents pentation, which is tetration done a certain number of times. Four up-arrows represents hexation; five up-arrows represents heptation, and so on.
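Knuth’s arrows have a compact recursive definition of their own; a Python sketch (function name is mine):

```python
def up(a: int, k: int, b: int) -> int:
    """a ↑^k b in Knuth arrow notation.
    k == 1 is exponentiation; k > 1 iterates the (k-1)-arrow."""
    if k == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, k - 1, up(a, k, b - 1))

print(up(2, 1, 10))  # 2 ↑ 10   = 1024
print(up(2, 2, 4))   # 2 ↑↑ 4   = 65536
print(up(2, 3, 3))   # 2 ↑↑↑ 3  = 65536
```

With k = 3 (pentation), up(2, 3, 3) = 2 ↑↑ 4 = 65536 is the last comfortably computable value; up(2, 3, 4) = 2 ↑↑ 65536 is beyond reach.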

Armed with this descriptive tool, we can reformulate the Ackermann function for a first argument of 3 as follows:

\textrm{Ackermann} (3, n) \longrightarrow f(n) = 2\uparrow(n + 3) - 3

Fixing the first argument as 4 then produces the equation:

\textrm{Ackermann} (4, n) \longrightarrow f(n) = 2\uparrow^{2}(n + 3) - 3

With this notation, we can also extend our answer table further: the row for a first argument of 4 runs 2↑↑3 - 3, 2↑↑4 - 3, 2↑↑5 - 3, and so on.

If a pattern appears emergent here, then another way to rewrite NIST’s Ackermann function using hyperoperational notation might be coming into focus.

When the first argument exceeds 2, it is possible to say:

\textrm{Ackermann} (m, n) = \left( 2\uparrow^{m - 2}(n + 3) \right) - 3

This definition isn’t sufficient by itself. The same base cases for low values of the first argument are still necessary. We must also observe that this definition does not liberate us from the tangles of recursion: the definition of the up-arrow is itself recursive.
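The identity Ackermann(m, n) = 2 ↑^(m-2) (n + 3) - 3 for m > 2 can be spot-checked numerically for values small enough to compute; a sketch:

```python
def ackermann(m: int, n: int) -> int:
    """NIST/Robinson Ackermann function."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

def up(a: int, k: int, b: int) -> int:
    """a ↑^k b in Knuth arrow notation (k = 1 is exponentiation)."""
    if k == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, k - 1, up(a, k, b - 1))

# Spot-check Ackermann(m, n) == 2 ↑^(m-2) (n + 3) - 3
for n in range(4):
    assert ackermann(3, n) == up(2, 1, n + 3) - 3
assert ackermann(4, 0) == up(2, 2, 3) - 3
print("identity holds for the checked values")
```

Only shallow cases are checked; ackermann(4, 1) already exceeds Python’s default recursion limit.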

New description of the Ackermann function

In summary, we could fully describe the NIST Ackermann function (the Ackermann-Péter-Robinson function) as follows:

\textrm{Ackermann} (0, n) = n + 1
\textrm{Ackermann} (1, n) = n + 2
\textrm{Ackermann} (2, n) = 2n + 3
\textrm{Ackermann} (m, n) = \left( 2\uparrow^{m - 2}(n + 3) \right) - 3 \quad \textrm{for } m > 2

Conclusion

A student of David Hilbert, a soldier, and a lifelong secondary school instructor, Wilhelm Ackermann described a function that has provoked thoughtful study in mathematical computation and computer science.

The Ackermann function is not one single function, but, at this point, a family of functions, all of which exhibit double-recursion and an extreme growth rate.

The eponymous NIST standard algorithm owes its form to the collective efforts of Wilhelm Ackermann, Rózsa Péter, and Raphael Robinson.

For many, the Ackermann function may serve as an entry point into further study of very large numbers. For others, its implementation provides a fascinating exercise in recursive computer programming.
