Finding the square root of square numbers recursively using the Binary Search algorithm in the programming language Ruby
First of all, let’s make it clear what the square root of a square number is.
In mathematics a square root of a number x is a number y such that y² = x; in other words, a number y whose square (the result of multiplying the number by itself, or y × y) is x. — Wikipedia
Take 9, for example: its square root could be either 3 or -3, because -3 × -3 = 9 and 3 × 3 = 9. We’re only going to look at positive square numbers and their positive roots.
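As a side note, recent Ruby versions (2.5 and newer) already ship Integer.sqrt, which returns the integer part of the square root, so there is always a built-in to compare our results against:
Integer.sqrt(9)   # => 3
Integer.sqrt(16)  # => 4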
Now let us try to understand what exactly the binary search algorithm is and how we could use it to find the square root of a given square number.
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array. — Wikipedia
In our case, the sorted array would contain the numbers from 1 up to whatever the initially given number is, and the target will always be in it if the given input is a square number. So let’s first go through the process of finding the square root of the number 9 to get a better understanding of how the algorithm works.
This is the initial sorted array. With arrays, we have to take into consideration that indexes start from 0: we have the numbers from 1 to 9, but indexes from 0 to 8. So at index 0 in the array is the number 1, at index 1 the number 2, and so on.
First, we look at the endpoints 1 and 9, marked with red. We add those array members’ indexes together and find the midpoint by dividing the sum by 2: 0 + 8 = 8 => 8 / 2 = 4. The array member at index 4 is the number 5, marked with green. We then square that number and compare it with the initial number to see if it is the number we’re looking for: 5 × 5 = 25 => 25 != 9. As it turns out, it is not the right number, so we move on.
As we saw, the square of 5 is 25, which is greater than 9. We also know that to the right of 5 in the array there are only numbers greater than 5, whose squares are even greater, so none of those can be the number we’re looking for. Our newly targeted area will therefore be to the left of 5. The starting point in the array stays at the number 1 at index 0, and the endpoint moves to index 4 – 1 = 3. Then we find the midpoint again: 3 / 2 = 1.5 => 1. Whenever we get a decimal value, we floor it. That means our new candidate is in the array at index 1, which is the number 2, marked with green. We square that number to see if it’s the one we’re looking for: 2 × 2 = 4 => 4 != 9. Again it’s not the right number, so once more we move on.
This time the difference is that the square of 2, which is 4, is smaller than 9, so our new targeted area will be to the right of the number 2. For the starting point in the array, we set the index to be one greater than the index of the number 2, which is 1 => 1 + 1 = 2, and for the endpoint we keep the same index as before.
Then we add the starting and ending point indexes together again and divide by 2 to find the midpoint: 2 + 3 = 5 => 5 / 2 = 2.5 => 2. In the array at index 2 is the number 3, so we check whether that is the number we’re looking for: 3 × 3 = 9 => 9 == 9. This is our number, and now we can return it as the answer.
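The same walk-through can be written down as a small iterative sketch (only an illustration of the steps above; the actual implementation we’ll build later is recursive):
# Iterative illustration of the search for the square root of 9
number = 9
low = 1
high = number

loop do
  mid = (low + high) / 2      # midpoint of the current targeted area (floored)
  square = mid * mid

  if square == number
    puts "Found it: #{mid}"   # prints "Found it: 3"
    break
  elsif square > number
    high = mid - 1            # the answer can only be to the left of mid
  else
    low = mid + 1             # the answer can only be to the right of mid
  end
end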
Ok, next let us try to understand what recursion is because it’s vital to understanding our algorithm.
In computer science, recursion is a method of solving a problem where the solution depends on solutions to smaller instances of the same problem. Such problems can generally be solved by iteration, but this needs to identify and index the smaller instances at programming time. Recursion, by contrast, solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science. — Wikipedia
Well… that didn’t make it much clearer, so let us try again with a simple example:
def recursion(call_stack_height, call_stack_depth = 0)
  return_value = nil

  if call_stack_depth == call_stack_height
    return_value = "Solution to our problem!"
  else
    return_value = recursion(call_stack_height, call_stack_depth + 1)
  end

  puts "Call stack depth: #{call_stack_depth} - return value: #{return_value}"

  return_value
end
That function doesn’t do anything particularly important, but it illustrates the basic idea behind recursion. Whatever number we pass in decides how deep the call stack will be. For instance, for the number 4, the output of the function would be:
# Function call:
recursion(4)

# Output:
# Call stack depth: 4 - return value: Solution to our problem!
# Call stack depth: 3 - return value: Solution to our problem!
# Call stack depth: 2 - return value: Solution to our problem!
# Call stack depth: 1 - return value: Solution to our problem!
# Call stack depth: 0 - return value: Solution to our problem!
So first we pass in the number 4 as the call stack height, the call stack depth gets the default value 0 in the initial function call, and the function then calls itself over and over, each time adding 1 to the call stack depth as we go deeper and deeper down the rabbit hole. The function calls itself as many times as needed to find the solution to the problem, and then the calls start to resolve themselves from the top of the call stack back down to the bottom.
Right now you may be wondering what that call stack is that is being used over and over again in the previous example.
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or machine stack, and is often shortened to just ‘the stack’. Although maintenance of the call stack is important for the proper functioning of most software, the details are normally hidden and automatic in high-level programming languages. Many computer instruction sets provide special instructions for manipulating stacks.
A call stack is used for several related purposes, but the main reason for having one is to keep track of the point to which each active subroutine should return control when it finishes executing. An active subroutine is one that has been called but is yet to complete execution, after which control should be handed back to the point of call. Such activations of subroutines may be nested to any level (recursive as a special case), hence the stack structure. For example, if a subroutine DrawSquare calls a subroutine DrawLine from four different places, DrawLine must know where to return when its execution completes. To accomplish this, the address following the instruction that jumps to DrawLine, the return address, is pushed onto the top of the call stack with each call. — Wikipedia
In our case, those subroutines are the recursion function calls. At the moment the value is found, there are five separate function calls sitting on top of each other in the call stack before they start to resolve themselves:
recursion(4, 4) # Fifth call
recursion(4, 3) # Fourth call
recursion(4, 2) # Third call
recursion(4, 1) # Second call
recursion(4, 0) # First call
Ok, finally, the algorithm itself:
def sqrt(number)
  inputs = [1, number, nil, [
    lambda { inputs[2] },
    lambda {
      inputs[-1] = inputs[2] - 1
      sqrt_recursive(inputs)
    },
    lambda {
      inputs[0] = inputs[2] + 1
      sqrt_recursive(inputs)
    }
  ], number]

  sqrt_recursive(inputs)
end
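Before walking through it, here is roughly what inputs holds right after sqrt(9) has built it (the inline annotations are mine, added only for illustration):
inputs = [
  1,      # starting point of the targeted area
  9,      # the initial square number, never changes
  nil,    # midpoint of the targeted area, not calculated yet
  [       # the three lambdas: "found it", "search left", "search right"
    lambda { inputs[2] },
    lambda { inputs[-1] = inputs[2] - 1; sqrt_recursive(inputs) },
    lambda { inputs[0] = inputs[2] + 1; sqrt_recursive(inputs) }
  ],
  9       # endpoint of the targeted area
]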
First, the function sqrt(number) is called with the number we intend to find the square root of. It populates the inputs array with all the data the recursive sqrt_recursive(inputs) calls will need. In Ruby, arrays are passed by reference, so to keep memory usage as low as possible I’m using a single array for all the data I need to pass around.
Let us walk through what exactly we have there. Since the array of numbers we search is always 1 up to the initially given number, we don’t actually need that array of numbers itself, only the starting and ending points; we calculate the midpoint from those and adjust them accordingly. So the first member of the inputs array is always the starting point of the targeted area, which in the beginning is the number 1. The second member is the initial square number itself, which stays unchanged throughout the execution of the algorithm. The third member is the midpoint of the currently targeted area; in the beginning there isn’t one yet, so we initialize it with nil. The fourth member is the slightly tricky one: it is itself an array filled with three lambda objects, which in essence are function pointers. Before explaining those, note that the last member of the inputs array is the endpoint of the currently targeted area. So now, the lambdas:
lambda { inputs[2] }
The first one is a function we call when the number we’re looking for has been found; it simply returns the current midpoint as that number.
lambda {
  inputs[-1] = inputs[2] - 1
  sqrt_recursive(inputs)
}
The second one is a function where we set the endpoint to be one less than the current midpoint, because the midpoint’s square is greater than the number we’re looking for, and then we call the sqrt_recursive(inputs) function again with the modified inputs array. In Ruby, negative array indexes count from the end, so the inputs array at index -1 gives us access to its last member.
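Just to make the negative indexing concrete, a tiny standalone example:
arr = [10, 20, 30]
arr[-1]  # => 30, the last element
arr[-2]  # => 20, the second to last element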
lambda {
  inputs[0] = inputs[2] + 1
  sqrt_recursive(inputs)
}
The third and last one is a function where we set the new starting point to be one greater than the current midpoint, because the current midpoint’s square was smaller than the number we’re looking for, and then again we call the sqrt_recursive(inputs) function with the modified inputs array. At the end of the sqrt(number) function, we make our first call to the sqrt_recursive(inputs) function. Let’s explain that one as well:
def sqrt_recursive(inputs)
  inputs[2] = (inputs.first + inputs.last) >> 1
  return inputs[3][inputs[2]**2 <=> inputs[1]].call
end
In the first line, we find the midpoint of the currently targeted area. We take the sum of the first and last elements of the inputs array, which are the starting and ending points of the targeted area respectively. Then we divide the sum by two with a bitwise operation, shifting all the bits to the right by one position. Why a bitwise operation? Wikipedia helps us here again:
In digital computer programming, a bitwise operation operates on one or more bit patterns or binary numerals at the level of their individual bits. It is a fast and simple action, directly supported by the processor, and is used to manipulate values for comparisons and calculations.
On simple low-cost processors, typically, bitwise operations are substantially faster than division, several times faster than multiplication, and sometimes significantly faster than addition. While modern processors usually perform addition and multiplication just as fast as bitwise operations due to their longer instruction pipelines and other architectural design choices, bitwise operations do commonly use less power because of the reduced use of resources. — Wikipedia
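In our case the right shift simply halves the sum and floors the result, which is exactly what we want for the midpoint. A couple of quick examples:
(1 + 9) >> 1  # => 5, same as (1 + 9) / 2
(2 + 3) >> 1  # => 2, the decimal part gets floored
10 >> 1       # => 5, shifting right by one bit divides by two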
In the second line, we access the lambdas array inside the inputs array, and to decide which lambda to call we use Ruby’s spaceship operator, which is a comparison operator that works like this:
a <=> b

if a < b then return -1
if a == b then return 0
if a > b then return 1
So first we find the square of the current midpoint and use the spaceship operator to compare it with the initial number: inputs[2]**2 <=> inputs[1]. If the square is smaller than the number we’re looking for, the result is -1 and the lambda at index -1, the third and last one, is chosen. If the square is equal to the number we’re looking for, the result is 0 and the first lambda at index 0 is chosen, which returns the answer. If the square is greater than the number we’re looking for, the result is 1 and the second lambda at index 1 is chosen. To actually invoke the chosen lambda, we have to call the call method on it.
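The index trick is easier to see with plain values standing in for the lambdas (labels is a hypothetical array, used here only for illustration):
labels = ["found it", "search left", "search right"]

labels[25 <=> 9]  # => "search left",  since 25 <=> 9 is 1, the second element
labels[4 <=> 9]   # => "search right", since 4 <=> 9 is -1, the last element
labels[9 <=> 9]   # => "found it",     since 9 <=> 9 is 0, the first element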
So now we’re only left with the last thing, the estimation of our algorithm’s efficiency, its time and space complexity. In programming, time and space complexity are often expressed with Big O notation, typically O(1), O(log n), O(n), O(n log n), O(n²), etc. Let’s see what Big O notation is:
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively called Bachmann-Landau notation or asymptotic notation. — Wikipedia
The time complexity:
In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. In analytic number theory, big O notation is often used to express a bound on the difference between an arithmetical function and a better-understood approximation. — Wikipedia
Our algorithm’s time complexity, like that of the binary search algorithm, is O(log n), where n is the initial number we provide as input. The space complexity is also O(log n), because at the moment we find the value we’re looking for, we have that same number of function calls sitting on the stack. An ordinary loop would give us O(1) space complexity here instead. Usually, when using a loop is a possibility, it’s almost always the more efficient way to go, but recursion is sometimes just a cool and elegant way to solve problems.
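Finally, a quick usage check (assuming the sqrt and sqrt_recursive definitions above have been loaded):
sqrt(9)       # => 3
sqrt(144)     # => 12
sqrt(10_000)  # => 100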