Search Algorithms

Natasha Ferguson
7 min readNov 3, 2022

--

A search algorithm is an algorithm that looks for data in a data set. Two common examples of search algorithms are linear and binary search. Here, we will learn how to implement a search on a list of integers using both search algorithms. We’ll start with a conceptual overview of algorithms steps and then will code out the solution in Python.

First off, what is linear search and what is binary search and how do they differ? When do you use one kind of search over the other? And how do these algorithms differ in terms of time and space complexity? Let’s answer these questions first.

Linear Search

Linear search is a search algorithm that starts from the beginning of a list and checks each element until the search key is found or the end of the list is reached. Linear search will compare all elements if the search key is not present.

Linear Search Time Complexity

If each comparison takes 1 µs (1 microsecond), a linear search algorithm’s runtime is up to 1 s to search a list with 1,000,000 elements, 10 s for 10,000,000 elements, and so on. Ex: Searching Amazon’s online store, which has more than 200 million items, could require more than 3 minutes. This algorithm typically uses a number of steps proportional to the size of the input.

For a list with 32 elements, linear search requires at most 32 comparisons: 1 comparison if the search key is found at index 0, 2 if found at index 1, and so on, up to 32 comparisons if the search key is not found. For a list with N elements, linear search thus requires at most N comparisons.

Linear Search vs. Binary Search

A linear search may require searching all list elements, which can lead to long runtimes. Let’s say we want to find a contact on a smartphone. Searching one by one from first to last can be time-consuming. Because a contact list is sorted, a faster search, known as binary search, checks the middle contact first. If the desired contact comes alphabetically before the middle contact, binary search will then search the first half and otherwise the last half. Each step reduces the contacts that need to be searched by half.

Binary search is a faster algorithm for searching a list under two conditions: (1) if the list’s elements are sorted, and (2) they are directly accessible (such as in an array).

Binary Search Time Complexity

Binary search is very efficient at finding an item within a sorted list. During each iteration of the algorithm, binary search reduces the search space (i.e., the remaining elements to search within) by half. The search terminates when the element is found, or the search space is empty (element not found).

For example, for a 32-element list, if the search key is not found, the search space is halved to have 16 elements, then 8, 4, 2, 1, and finally none, requiring only 6 steps.

Let’s revisit our Amazon example now. Searching Amazon’s online store, which has more than 200 million items, requires less than 28 µs; up to 7,000,000 times faster than linear search. 3 minutes in the case of using linear search versus 28 microseconds using binary search. Of course, I must note that comparing the actual time in minutes/microseconds is not the best approach when it comes to complexity analysis. We want to use an objective measurement of how much time it takes for both algorithms to run and we want it to be hardware-independent. With this example, all I’m trying to point out is a significant difference between the performance of linear and binary search algorithms.

The binary search algorithm also finds the location of a key value in a list but is much faster than linear search because the search is performed on a sorted list. Binary search compares the middle element of the list to the key. If the middle element matches the key, then the algorithm is complete and returns the index. If the key is smaller than the middle element, the loop proceeds by searching the first half of the list. If the key is larger than the middle element, the loop proceeds by searching the last half of the list. In this way the search field is always cut in half, leading to many fewer comparisons.

To calculate the index in the middle of the left and right pointers, the sum of the right and left is divided by 2. If the result is not an integer, the value is rounded down to a whole number using floor division. Python has a special floor division operator, //, that automatically drops the quotient’s decimal portion, so this operator is used when calculating the mid pointer index: middle = (left + right) // 2.

Binary Search Algorithm Steps

Let’s break down a binary search algorithm step by step:

List of sorted user IDs:

user_ids = [4, 6, 10, 13, 18, 22, 30, 35, 50, 67]
L M R

The key we are searching for: target = 50

left_pointer index = 0

right_pointer index = 9

mid_pointer = left pointer + right pointer // 2 = index 4.5 is rouned down to 4

We compare our target number to the mid_pointer value

Is mid_pointer = target? #No

Is mid_pointer < target? #No

Is mid_pointer > target? #Yes

We move the right_pointer to a location of mid_pointer — 1 and in doing so we are reducing the size of the search area

Next, we recalculate the position of the mid_pointer

mid_pointer left pointer(0) + right pointer(3) // 2 = index 1.5 is rounded down to 1

Our pointers look like this now:

user_ids = [4, 6, 10, 13, 18, 22, 30, 35, 50, 67]
L M R

Is mid_pointer = target? #No

mid_pointer < target, so every item before the mid_pointer is also smaller than target since we are dealing with a sorted array

We move the left_pointer one position to the right of the mid_pointer (mid_pointer + 1)

We erase mid_pointer and now we are left with:

user_ids = [4, 6, 10, 13, 18, 22, 30, 35, 50, 67]
L R

We recalculate the position of the mid_pointer again:

mid_pointer left pointer(2) + right pointer(3) // 2 = index 2.5 is rounded down to 2

Is mid_pointer = target? #No

We move the left_pointer one position to the right of the mid_pointer (mid_pointer + 1)

user_ids = [4, 6, 10, 13, 18, 22, 30, 35, 50, 67]
LR

We recalculate the position of the mid_pointer again:

mid_pointer left pointer(3) + right pointer(3) // 2 = index 3

Is mid_pointer = target? Yes! We are now done, and we return the index value we are looking for.

The time complexity of this algorithm is O(log n) because we are eliminating half of the elements at every point.

Recursive Implementation of The Binary Search

O(log(n)) time | O(log(n)) space (because we are adding frames to the call stack with recursion)

user_ids = [4, 6, 10, 13, 18, 22, 30, 35, 50, 67]
target = 50
def binary_search(array, target):
return binary_search_helper(array, target, 0, len(array) -1)
def binary_search_helper(array, target, left, right):
#base case
if left > right:
return -1
middle = (left + right) // 2
mid_val = array[middle]
if target == mid_val:
return middle
elif target < mid_val:
return binary_search_helper(array, target, left, middle - 1)
else:
return binary_search_helper(array, target, middle + 1, right)

binary_search(user_ids, target)

Learn more about recursion here.

Iterative Implementation of the Binary Search

O(log(n)) time | O(1) space

def binary_search(array, target, left, right):
while left <= right:
middle = (left + right) // 2
mid_val = array[middle]
if target == mid_val:
return middle
elif target < mid_val:
right = middle - 1
else:
left = middle + 1
return -1
binary_search(user_ids, target, 0, len(user_ids) - 1)

When to Use a Linear Search

A linear search’s time complexity is O(n). In the worst-case scenario, in a list of 20 items, your algorithm will take 20 steps. The best-case scenario is O(1) because the item you are looking for could be the first item. Consider using a linear search when your data is unsorted. In your data is sorted you can use a more efficient binary search. When you are programming in the real world, instead of writing your own linear search, you can use Python’s built-in keyword in. Example: print(18 in user_ids) #True

When to Use a Binary Search

A binary search takes O(log(n)) time. As we now know it’s more efficient than a linear search algorithm because you don’t have to search the entire list. A binary search’s efficiency makes a huge difference when you are dealing with a large amount of data e.g. a list of million numbers. If you performed a linear search, it might take you a million steps to complete your search. With a logarithmic binary search, it would take just 20 steps. To write binary search in Python you can use a built-in tool like bisect_left from the bisect module, which finds the index of an existing element in a sorted list using binary search.

--

--