Notes on recursion
This blog aims to simplify and visualize recursion concepts and its working
Recursion is a programming concept that solves a complex problem by breaking it down into smaller sub problems.
A function that calls itself when it runs is a recursive function. For example, in Linux systems we use the command “ls -R” which tells the system to display all contents of the directory recursively or go into each folder and in turn subfloders and print the contents.
Stack overflow problem
The main drawback of recursive functions is the stack overflow problem.
When a Python program runs, the variables (address in the stack the variable points to or the reference) are loaded into the stack memory alongwith the return addresses of the function calls. The stack memory is however static, which means we can make a limited number of function calls. Also, stack is temporary since once the Python interpreter has executed the function call and returns to the next line the address is popped out of the stack. This blog explains memory management in Python in more details.
Since recursive functions call themselves, the return address of the next line of code gets stored in the stack the first time the function calls itself. Subsequently as the function keeps calling itself, the return addresses get stored in the stack and this can result in a stack overflow when the stack has no more space to store another address due to exhaustion of the stack memory.
Iteration, Recursion, Dynamic Programming
Coding a recursive solution involves the following :
- Identify base case : A base case refers to a stopping condition for the code so that once the base case is satisfied, the Python interpreter pops out the return addresses
2. Identify recursive case : The cases when the function needs to call itself or a recurrence relation that will call the function to solve a smaller similar sub problem than the original complex problem (divide and conquer approach)
3. Get closer and closer to the base case : As we go through recursive cases, at some point we should reach base case
def fibonacci(n):
if n <= 1: return n # base case
return fibonacci(n-1) + fibonacci(n-2) # recursive case
The function fibonacci is a recursive function where the first line is the base case; once we calculate fibonacci(1) the Python interpreter returns back n or 1 for this case. For any n > 1, the recursive case is satisfied and fibonacci gets called.
Recursion vs iteration :
While recursion keeps the code simple following the DRY (Don’t Reapeat Yourself) principle it also has a large memory footprint (a large callstack); so iteration might be useful for cases when we know how many iterations we want to go through.
Recursion is useful when we dont know the number of iterations and deal with data structures like trees, and linked lists. The process of traversing all the nodes of a binary tree is very simple using depth first search.
Let’s analyse the stack and the time taken for recursion for coding the Fibonacci series.
import inspect
import timenum_calls_list = []def fibonacci(n):
num_calls = 0
if n <= 1: return n
num_calls_list.append(len(inspect.stack())) # Return a list of the frames pushed onto stack for executing the current line of code
return fibonacci(n-1) + fibonacci(n-2)
We use inspect.stack()
to view the Python call stack. We get back a named tuple as such
FrameInfo(frame, filename, lineno, function, code_context, index)
num_calls_list stores the len(inspect.stack())
at each recursive call. For n = 2 we get len(num_calls_list), num_calls_list
:
1 [29]
- 1 call to the function recursion, and 29 represents the number of stack calls from the function recursion till the final stack calls to the libraries used (since this is run on Anaconda we get the Anaconda frames added onto our stack too).
Printing the first element of the list returned by inspect.stack()
FrameInfo(frame=<frame at 0x0000027E131DECF0, file '<ipython-input-12-89cd151ff95a>', line 6, code fibonacci>, filename='<ipython-input-12-89cd151ff95a>', lineno=5, function='fibonacci', code_context=[' for ind, i in enumerate(inspect.stack()):\n'], index=0)
The first field frame
has the frame information such as the address, filename that contains it,calling function <frame at 0x0000027E131DECF0, file ‘<ipython-input-12–89cd151ff95a>’, line 6, code fibonacci>
; the filename
has the filename from where the function was called ‘<ipython-input-12–89cd151ff95a>'
; the line called from in lineno
; the function that is called ‘fibonacci’
in function
; and the code context
gives us the list of code lines at which this frame was inspected ; and index
represents the index of the code context in the list.
For n = 3:
2 [29, 30]
- For n = 3 we have :
1# fib(3) = fib(2) + fib(1) => fib(2) + 1
2# fib(2) = fib(1) + fib(0) => 1 + 0
- 2 function calls where the second function call has 30 addresses stacked in the stack space (1 extra than the first owing to the recursive function call for calculating the fibonacci value of 2)
Similarly, for n = 5:
7 [29, 30, 31, 32, 31, 30, 31]
For n = 8:
33 [29, 30, 31, 32, 33, 34, 35, 34, 33, 34, 32, 33, 34, 33, 31, 32, 33, 34, 33, 32, 33, 30, 31, 32, 33, 34, 33, 32, 33, 31, 32, 33, 32]
For n = 15:
986 [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 41, 40, 41, 39, 40, 41, 40, 38, 39, 40, 41, 40, 39, 40, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 34, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 33, 34, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 32, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 33, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 34, 35, 36, 37, 38, 37, 36, 37, 35, 36, 37, 36, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 40, 39, 40, 38, 39, 40, 39, 37, 38, 39, 40, 39, 38, 39, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 32, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 33, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 34, 35, 36, 37, 38, 37, 36, 37, 35, 36, 37, 36, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 39, 38, 39, 37, 38, 39, 38, 36, 37, 38, 39, 38, 37, 38, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 33, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 34, 35, 36, 37, 38, 37, 36, 37, 35, 36, 37, 36, 32, 33, 34, 35, 36, 37, 38, 39, 38, 37, 38, 36, 37, 38, 37, 35, 36, 37, 38, 37, 36, 37, 34, 35, 36, 37, 38, 37, 36, 37, 35, 36, 37, 36, 33, 34, 35, 36, 37, 38, 37, 36, 37, 35, 36, 37, 36, 34, 35, 36, 37, 36, 35, 36]
Summarizing..
We see an exponential rise in the time taken, total function calls, and the total stack space used as N increases. Hence, recursion deems to not be a very efficient solution if we know the input N could vary to a very large number.
An iterative solution on the other hand, would use no extra stack space and find the solution in linear time.
Dynamic programming would help save time by storing the already calculated fibonacci series of a number so that we don’t have to perform redundant calculations in each iteration.
The following approach is a more efficient solution for this problem instead of recursion.
def fibonacci(n):
fib, fib_list = 0 , [] # array to store the fibonacci values
for i in range(0,n):
if i <= 1 :
fib_list.append(i)
else:
fib_list.append(sum(fib_list))
fib_list = fib_list[1:] # Use a window to only store the nessecary fibonacci series values to calculate the next
return fib_list[-1]+fib_list[-2]
The time taken is nearly a constant as N increases. We can see for N= 150 also, we get almost the same time as N= 0.
Depth first search on binary trees
Coding a recursive folution for printing the preorder traversal of a binary tree (root -> left -> right) :
def get_node(node):
if node == None: # base case
return
num_calls_list.append(len(inspect.stack())) node_list.append(node.data) # Append to node_list
#print('left', node.data)
a = get_node(node.left) # Traverse left node ## Recursive case
# print('right', node.data)
b = get_node(node.right) # Traverse right node ## Recursive case
return
This solution is very simple to understand :
- Base case : Return to the calling function if current node is None
- Recursive case : If node is not None find the left node until None is returned by the base case; after that find the right node till None is returned.
We get the following output:
'1-> 2-> 5-> 4-> 6'
The original tree :
We consider two more trees, tree2 and tree3 of depth 4 and 8 :
Printing the time taken and the stack space we get:
Though the code is very simple to understand, we see a rise in the time taken and the stack space used as the depth of the tree increases.
On the other hand, if we use an iterative solution :
def print_node(linked_list):
node = linked_list.head
stack, preorder_traversal = [], [] # stack to store the previous node to pop ; preorder_traversal stores the order of nodes while True:
if node != None: # keep finding the left node and store the root node in stack
preorder_traversal.append(node.data)
stack.append(node)
# print('Appending ', node.data, 'to stack ..', stack)
node = node.left
elif stack: # Once left node returns null, we pop the stack and find the right nodes
node = stack.pop()
node = node.right
else:
break
return preorder_traversal #print("-> ".join(preorder_traversal))
We get a constant time solution though the code is not as easy to understand as the recursive solution.
Thanks for reading!
Also, if there’s any suggestion to improve this article, please let me know!
Here is the link to the code for reference.