Understanding the Linear Search & Binary Search Algorithm

Mohammad Fakhruddin
7 min read · May 6, 2023


This picture is not related to this article, but the word “Algorithm” comes from his name: Al-Khwarizmi.

In this article we’ll learn about two very popular search algorithms. We’ll look at what problems they can be applied to and use visuals to help understand the math behind them.

Linear Search Algorithm

First of all, the Linear Search Algorithm, also known as the Sequential Search Algorithm, is a loop that traverses each element of an array until it finds a matching value. Each item in an array is called an ‘element’. This algorithm is popular among software developers because it is simple and straightforward to implement.

Linear search traverses each element in the array

The code below is an implementation of the linear search algorithm in Java.

public static int linearSearchAlgorithm(int[] arr, int target) {
    for (int i = 0; i < arr.length; i++) {
        if (arr[i] == target) {
            return i; // Return the index of the found element
        }
    }
    return -1; // If the element is not found, return -1
}
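To see the method above in action, here is a minimal, self-contained sketch (the sample data and class name are illustrative):

```java
public class LinearSearchDemo {
    // Same linear search as above: scan left to right until a match is found.
    static int linearSearchAlgorithm(int[] arr, int target) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 4};
        System.out.println(linearSearchAlgorithm(data, 9));  // prints 2
        System.out.println(linearSearchAlgorithm(data, 10)); // prints -1
    }
}
```

Note that the array does not need to be sorted, which is exactly why linear search is the fallback for unsorted data.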

In short, this approach is considered a ‘brute-force’ method. As an analogy, imagine you want to open chapter 10 of a book, but instead of skipping chapters 1–9, you flip through every page from chapter 1 until you reach chapter 10. That wastes a lot of time, just like the brute-force approach.

The worst-case time complexity of the linear search algorithm is O(n), linear time, where ‘n’ is the size of the list. Why is it called “linear”? Because the algorithm’s running time grows linearly with the size of the input data set/array/collection. This algorithm is suitable if the list is unsorted.
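You can make the linear growth visible by counting comparisons in the worst case (the target is absent, so every element is checked). This is just an illustrative sketch; the class and method names are made up for the demo:

```java
public class LinearGrowthDemo {
    // Counts how many comparisons linear search performs before giving up.
    static int comparisonsForMiss(int[] arr, int target) {
        int comparisons = 0;
        for (int value : arr) {
            comparisons++;
            if (value == target) {
                break; // never taken here, since the target is absent
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10, 100, 1000}) {
            int[] arr = new int[n]; // all zeros, so target -1 is never found
            System.out.println("n = " + n + ", comparisons = " + comparisonsForMiss(arr, -1));
        }
        // n = 10, comparisons = 10
        // n = 100, comparisons = 100
        // n = 1000, comparisons = 1000
    }
}
```

Doubling the input doubles the work: that is what O(n) means in practice.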

Big-O Complexity Chart

The best-case time complexity of linear search is O(1), constant time, but only if the element is found at the first index. An algorithm should be judged by its worst case, because the best case depends on luck: you can’t expect the value to be at the first index every time, right? From a software development perspective, “worst-case time complexity” can be thought of as a kind of “risk assessment”.

Risk assessment : “the act of identifying possible risks, calculating how likely they are to happen and estimating what effects they might have, especially in the context of a company taking responsibility for the safety of its employees or members of the public” — Oxford Dictionary

In the real world of data collection, millions, even billions, of data points need to be processed. According to rivery.io, the Twitter platform generates 500 million tweets every day. That’s a huge number, and that’s just one social media organization; multiple giant companies out there handle billions of data points. So choosing the right algorithm and understanding its worst-case complexity should be a priority if we want to build long-term, high-performance tech products.

Binary Search Algorithm

In computer science, binary search is also known as half-interval search, logarithmic search, or binary chop. Computer scientist George T. Heineman states in his book ‘Algorithms in a Nutshell’:

Binary Search delivers better performance than Linear Search because it starts with a collection whose elements are already sorted. Binary Search divides the sorted collection in half until the sought-for item is found, or until it is determined that the item does not exist in the collection.

On each iteration, the algorithm cuts the remaining search range in half, since it divides the range by 2. The gif below is a demonstration of the binary search algorithm:

The code below is an implementation of the binary search algorithm in Java. As you can see, you first need to determine the left and right boundaries of the array to compute its middle.

public static int binarySearchAlgorithm(int[] arr, int target) {
    int left = 0;
    int right = arr.length - 1;

    while (left <= right) {
        int mid = left + (right - left) / 2; // avoids integer overflow on very large arrays

        if (arr[mid] == target) {
            return mid; // Found the target
        } else if (arr[mid] < target) {
            left = mid + 1; // Discard the left half
        } else {
            right = mid - 1; // Discard the right half
        }
    }

    return -1; // Element not found
}
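Here is a quick, self-contained check of this approach on a small sorted array (the sample values are illustrative):

```java
public class BinarySearchDemo {
    // Same halving strategy as above, on a sorted int array.
    static int binarySearch(int[] arr, int target) {
        int left = 0;
        int right = arr.length - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2; // overflow-safe midpoint
            if (arr[mid] == target) {
                return mid;
            } else if (arr[mid] < target) {
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};
        System.out.println(binarySearch(sorted, 23)); // prints 5
        System.out.println(binarySearch(sorted, 7));  // prints -1
    }
}
```

Remember the precondition: the array must already be sorted, otherwise the halving step can discard the half that actually contains the target.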

The table below compares the maximum number of attempts linear search and binary search need to find an element in an array:

Binary search is the winner in terms of performance on large lists.
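You can compute those maximum-attempt figures yourself: linear search may need n attempts, while binary search needs at most floor(log2(n)) + 1, since each attempt halves the remaining range. A small sketch (class and method names are made up for the demo):

```java
public class MaxAttempts {
    // Maximum attempts binary search needs for a list of size n:
    // count how many times n can be halved before reaching 0.
    static int binaryMaxAttempts(int n) {
        int attempts = 0;
        while (n > 0) {
            attempts++;
            n /= 2;
        }
        return attempts;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10, 1000, 1_000_000}) {
            System.out.println("n = " + n
                + " | linear max attempts: " + n
                + " | binary max attempts: " + binaryMaxAttempts(n));
        }
        // n = 10       | linear: 10        | binary: 4
        // n = 1000     | linear: 1000      | binary: 10
        // n = 1000000  | linear: 1000000   | binary: 20
    }
}
```

At a million elements, the gap is 1,000,000 attempts versus just 20.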

The worst-case time complexity of the binary search algorithm is O(log n), logarithmic time, where ’n’ is the size of the list. If you refer to the Big-O complexity chart above, you can see that O(log n) is better than O(n). So what does ’n’ represent in O(log n)? This confused me at first, because O(n) is very straightforward, but in a logarithmic function there are two numbers to keep track of: the base and the argument.

logarithmic equation example

I thought ’n’ was the base, but it’s not. In O(log n), ’n’ is the size of the list, the same as elsewhere. There are two perspectives you need to understand:

  1. In the logarithm function in mathematics, n is called the input/argument.
  2. In computer science, specifically in data structures and algorithms (Big-O topics), n is the size of the list/array.

So where is the base? There is one, but it’s hidden!

x is the base and n is the argument.

In the binary search algorithm’s case, the base is 2, since every iteration divides the range by 2. So x = 2.
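You can verify that base 2 matches the halving behaviour. Java has no Math.log2, so the sketch below uses the change-of-base formula log2(n) = log(n) / log(2); the class and method names are illustrative:

```java
public class LogBaseTwo {
    // How many times can n be halved before reaching 1?
    static int halvingSteps(int n) {
        int steps = 0;
        while (n > 1) {
            n /= 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 1024;
        // log2(n) via change of base; round to absorb floating-point error
        long log2 = Math.round(Math.log(n) / Math.log(2));
        System.out.println("log2(1024)     = " + log2);            // 10
        System.out.println("halving steps  = " + halvingSteps(n)); // 10
    }
}
```

The halving count and log2(n) agree, which is exactly why binary search’s cost is logarithmic with base 2.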

Why doesn’t Big-O notation mention the base in its logarithmic functions? There is a reason for this.

Theoretical computer scientist and CS PhD Daniel R. Page explains:

- it is very normal for computer scientists to just write O(log n) without regard to the base, this is because it is true here that the base does not matter (and adding base 2 would just be there to add clarity to the analysis). This is a true statement, regardless of base. Often the algorithms that happen to have time complexities like this usually have the base 2, but not always (as below).

- Whenever you see a logarithm in a CS course with no explicit base, you should make sure your professor/instructor actually means base 2 or base 10 (as they will usually continue to use it in this fashion). Generally I just write the base out whenever I can to avoid ambiguity. Often times in analysis it does matter pedagogically for students (at a fine-grained level), other times it really doesn’t. I’ll give some examples of both situations:

So, basically, the base will never be mentioned. In a computer science class, you should assume base 2 (or base 10 if you’re studying calculus). Weird, right? ;)
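The reason the base can be dropped is that logarithms in different bases differ only by a constant factor, and Big-O notation ignores constant factors. A small sketch showing the ratio between log base 2 and log base 10 is the same for every n (class and method names are illustrative):

```java
public class ChangeOfBase {
    // Ratio between log2(n) and log10(n); constant for every n > 1.
    static double ratio(double n) {
        double log2n  = Math.log(n) / Math.log(2); // change of base via natural log
        double log10n = Math.log10(n);
        return log2n / log10n;
    }

    public static void main(String[] args) {
        System.out.printf("ratio for n = 100:       %.4f%n", ratio(100));
        System.out.printf("ratio for n = 1,000,000: %.4f%n", ratio(1_000_000));
        // both print 3.3219 - the same constant factor regardless of n,
        // which is why O(log2 n) and O(log10 n) describe the same growth
    }
}
```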

Conclusion

In short, linear search is a simple algorithm that checks each element in a list until a match is found. Binary search is more efficient but relies on sorted input data, dividing the search interval in half at each step. While binary search is faster, linear search may be the only option for unsorted data, and it is perfectly fine for very small datasets. Understanding the differences between these two algorithms, and their respective strengths and weaknesses, is essential for developing efficient and effective search code.

