A few fundamental algorithms are considered important for developers and other technical roles.
Interviewers care much more about your ability to create a new algorithm than about your ability to memorize an existing one. Still, these come up frequently enough that it's worth your time to know them.
The two most commonly used efficient ways to sort an array are Quick Sort and Merge Sort. The others — Bubble Sort, Insertion Sort, Radix Sort, etc. — are either less efficient in general or efficient only under specific assumptions.
- Merge Sort operates by sorting the left and right halves of the array and then merging the two sorted halves. How does it sort each half? With Merge Sort itself (recursively): it takes the left half, divides it in half, sorts each part, and merges those; it then does the same with the right half. Merge Sort is O(n log(n)) in both the average and worst cases.
- Quick Sort sorts data by choosing a random "pivot" element and rearranging the elements in the array based on whether they're less than or greater than the pivot. Next, it tackles the elements on the left side of the pivot (all of which are less than or equal to the pivot) and the elements on the right side of the pivot (all of which are greater than the pivot). It applies the same strategy to each side: pick a pivot, rearrange, and then pick a new pivot on each side. Quick Sort is O(n log(n)) in the average case, but O(n²) in the worst case. The worst case occurs when a bad "pivot" (a very low or very high element) is repeatedly chosen.
Note that both algorithms have an approach of “divide into two parts and then re-apply algorithm.”
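To make the divide-and-reapply idea concrete, here is one way both algorithms might look in Python. This is a sketch, not the canonical implementation: the function names are ours, and these simple versions return new lists rather than sorting in place (real Quick Sort implementations usually partition in place).

```python
import random

def merge_sort(arr):
    """Sort by recursively sorting each half, then merging the halves."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Merge the two sorted halves into one sorted list.
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

def quick_sort(arr):
    """Sort by partitioning around a random pivot, then sorting each side."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)
    less    = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return quick_sort(less) + equal + quick_sort(greater)
```

Notice that both functions follow the same shape described above: split the problem in two, apply the algorithm to each part.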
The other sorting algorithms are the naive implementations you might do in real life when you’re trying to sort a stack of papers.
- Insertion Sort maintains a sorted sublist (initially empty) at the beginning of the array. It then looks at the first element of the unsorted portion. If this element is bigger than the last element in the sorted sublist, it leaves it in place and just grows the sorted portion (since the element is already in the correct order). If it's smaller, it moves it into place in the sorted sublist. Either way, the unsorted portion shrinks by one each time. The algorithm then repeats this step for each remaining element until the array is fully sorted. Insertion Sort takes O(N) time in the best case (if the array is already sorted), but O(N²) in the expected and worst cases.
- Bubble Sort is a pretty straightforward algorithm. It iterates through the list repeatedly, swapping adjacent pairs of elements that are out of order. Once a full pass happens without any swaps, the array is sorted. This takes O(N) time in the best case (if the array is already sorted) and O(N²) in the expected and worst cases.
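As a quick illustration of the "sweep until no swaps" idea, here is a sketch of Bubble Sort in Python (the function name is ours; this version works on a copy so the input is left untouched):

```python
def bubble_sort(arr):
    """Repeatedly sweep the list, swapping adjacent out-of-order pairs,
    until a full pass completes with no swaps."""
    arr = list(arr)  # work on a copy
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(arr) - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True
    return arr
```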
It’s unlikely you’ll be asked to implement one of these algorithms by name, but it’s still useful to understand how one might sort data.
Problem 1 — Given two sorted arrays, write a function to merge them in sorted order into a new array.
The most efficient way to tackle this is to use the fact that the arrays are already sorted. We can merge them by repeatedly taking the smaller of the two next elements until we reach the end of both arrays. We maintain a pointer into each array so we can easily move on to its next element.
Let’s take the example of two arrays:
A: 1 5 8 9, B: 2 4 9 10 12
We'll start with the p and q pointers at the beginning of the two arrays (shown in brackets):
A: [1] 5 8 9, B: [2] 4 9 10 12
A[p] is smaller than B[q], so we put A[p] into our result array. We then move p to the next value.
A: 1 [5] 8 9, B: [2] 4 9 10 12
We again compare A[p] and B[q], putting the smaller element into the result array. We also need to keep track of where we are in the result array. We repeat this process until we have exhausted both arrays.
In an interview, it’s useful to walk through the example in this detail to reduce the number of mistakes you make.
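The walkthrough above translates almost directly into code. Here's a sketch in Python (the function name merge_sorted is ours):

```python
def merge_sorted(a, b):
    """Merge two sorted arrays into a new sorted array."""
    result = []
    p = q = 0  # pointers into a and b
    while p < len(a) and q < len(b):
        if a[p] <= b[q]:
            result.append(a[p])
            p += 1
        else:
            result.append(b[q])
            q += 1
    # One array is exhausted; append whatever remains of the other.
    result.extend(a[p:])
    result.extend(b[q:])
    return result
```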
This approach takes O(M + N) time, where M is the length of the first array and N is the length of the second.
Problem 2 — Implement insertion sort.
Insertion sort operates by iterating through the array and inserting each element into its correct position among the (already sorted) elements to its left.
We can most cleanly implement this as two different functions. The first function performs the overall algorithm: pick up an element, insert it in order, pick up the next one, and so on.
Note that the outer loop starts at index 1 instead of 0. This is because the 0th element can never be out of order by itself.
Now we just need to implement a second function that takes an element A[k] and inserts it in order among the elements to its left (provided those are already sorted). To insert A[k] in order, we shift each larger element over by one until we find the right spot for the element.
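Here is a sketch of both functions in Python (the names insertion_sort and insert_in_order are ours; this version sorts the list in place):

```python
def insertion_sort(arr):
    """Grow a sorted prefix one element at a time."""
    for k in range(1, len(arr)):  # element 0 is trivially "sorted"
        insert_in_order(arr, k)

def insert_in_order(arr, k):
    """Insert arr[k] into the sorted elements arr[0..k-1],
    shifting larger elements one slot right to make room."""
    value = arr[k]
    i = k - 1
    while i >= 0 and arr[i] > value:
        arr[i + 1] = arr[i]
        i -= 1
    arr[i + 1] = value
```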
This algorithm takes O(N²) time.
Binary search is an algorithm for locating a value in a sorted list (typically an array). In binary search, we compare the value to the midpoint of our list. Because the list is sorted, this one comparison tells us whether the value must lie to the left or to the right of the midpoint. We then search that side, repeating the operation: compare to the midpoint of the sublist, and then search the left or right half of that sublist.
Because we’re repeatedly dividing the data set in half, the algorithm takes O(log n) time in the average and worst case.
We often perform binary searches in real life without realizing it. Imagine you had a stack of student exams sorted by first name. If you were looking for a name like "Peter," would you search starting from the top of the stack? Probably not. You would hop about halfway through and then compare. If you see "Mary," you know to keep going toward the back. You could then search that second half of exams by continuously dividing the stack in half.
Binary search is a popular algorithm and, therefore, an important concept to understand. Many algorithms are based on binary search.
Problem 3 — Implement binary search. Given a sorted array of integers and a value, find the location of that value.
Binary search works by repeatedly "halving" the array into subarrays. In the first iteration, we compare the value x to the midpoint and learn whether x lies in the left or the right half. Then, we repeat this step with the new subarray: is x in the left half of the subarray or the right?
We can implement this either recursively or iteratively (non-recursively). We’ll start with the recursive solution since it’s more intuitive for most people.
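A sketch of the recursive version in Python (the default arguments are a convenience of ours so callers can write binary_search(arr, x)):

```python
def binary_search(arr, x, low=0, high=None):
    """Return the index of x in the sorted array arr, or -1 if absent."""
    if high is None:
        high = len(arr) - 1
    if low > high:
        return -1  # empty range: x is not present
    mid = (low + high) // 2
    if arr[mid] == x:
        return mid
    if x < arr[mid]:
        return binary_search(arr, x, low, mid - 1)   # search left half
    return binary_search(arr, x, mid + 1, high)      # search right half
```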
For the iterative solution, we take a very similar approach.
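The same logic with a loop instead of recursion (again a sketch; the function name is ours):

```python
def binary_search_iterative(arr, x):
    """Shrink the window [low, high] until x is found or the window is empty."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == x:
            return mid
        if x < arr[mid]:
            high = mid - 1  # discard the right half
        else:
            low = mid + 1   # discard the left half
    return -1
```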
The time complexity of both the above solutions would be O(log N).
Problem 4 — You are given an integer array that was sorted and then rotated. It contains all distinct elements. Find the minimum value. For example, the array might be 6, 8, 9, 11, 15, 20, 3, 4, 5. The minimum value would be 3.
A brute force solution would be to just iterate through the array and look for the minimum value. We can guess that this isn’t what the interviewer is looking for, though, since it doesn’t use the sorting information.
To develop a more optimal solution, we probably want to use the information we're given — the array is "sorted" but rotated. Since the array is somewhat sorted, let's think about applying some of the concepts from binary search. Binary search works by looking at the midpoint repeatedly. In this problem, what does the midpoint tell us? In and of itself, the midpoint being 15 doesn't tell us anything. However, if we know that the leftmost element is 6 and the rightmost is 5, we can conclude something. Since left > right, we know the array as a whole is out of order. But since left < middle, we know the left half is in order, so the break must be in the right half.
6, _, _, _, 15, _, _, _, 5
We can determine from examining the above array that the inflection point (which is the minimum element) is in the right half. Our problem is now half the size. To find the minimum element, we simply recurse.
20, _, _, 5
We can implement this recursively. We stop when we find a subarray whose leftmost element is less than its rightmost. This indicates that this portion of the array is in order, and therefore that its leftmost element is the smallest.
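A sketch of the recursive solution in Python (the function name and default arguments are ours):

```python
def find_min(arr, left=0, right=None):
    """Minimum of a rotated sorted array of distinct values."""
    if right is None:
        right = len(arr) - 1
    if arr[left] <= arr[right]:
        return arr[left]  # this portion is in order: leftmost is smallest
    mid = (left + right) // 2
    if arr[mid] >= arr[left]:
        # Left half is in order, so the inflection point is on the right.
        return find_min(arr, mid + 1, right)
    # Otherwise the inflection point is at mid or to its left.
    return find_min(arr, left, mid)
```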
Alternatively, we can implement this algorithm iteratively with a while loop.
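The iterative version keeps narrowing the window until it is in order (again, a sketch with a name of our choosing):

```python
def find_min_iterative(arr):
    """Narrow the window [left, right] around the inflection point."""
    left, right = 0, len(arr) - 1
    while arr[left] > arr[right]:   # while this window is out of order
        mid = (left + right) // 2
        if arr[mid] >= arr[left]:
            left = mid + 1          # left half sorted: minimum is to the right
        else:
            right = mid             # minimum is at mid or to its left
    return arr[left]
```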
Be very careful in problems like this with your termination and recursion conditions. Think about why you make left = middle + 1 (why the +1?) but right = middle (no -1). Those are easy places to make mistakes.
The time complexity of both the above solutions would be O(log N).
There are two common algorithms for searching a graph: depth-first search and breadth-first search.
1. Depth-First Search
In a depth-first search, we completely search a node's first child before going on to the second child, third child, and so on. For example, imagine a node with two children, A and B. If we search for a value v, we search A (and the nodes connected to A) before checking out B. It's called "depth-first search" for this reason; we go deep before we go wide.
2. Breadth-First Search
In breadth-first search, we go wide before deep. If we start from an initial node R, we first check R and all the nodes immediately connected to R (let's call these "children"). Then, we expand our search outwards, searching all the nodes connected to R's children. We repeat this process until we find the value or have completely searched this entire [sub-]graph.
In both algorithms, we need to be careful not to wind up going in circles forever. If there are cycles in the graph — that is, if there is more than one path to get from one node to another — we need to mark nodes as "already visited" to ensure that we don't repeatedly search the same node. This is not an issue for trees, as there are no cycles in a tree.
Note that a graph can have two completely separate parts that are not connected. If this is the case, we need to perform our search algorithm on each component to ensure that we find the item we’re looking for.
Problem 5 — Using depth-first search, check if a tree contains a value.
Depth-first search works by checking whether a value v equals the current node's value. If it does not, we search each child of the node, one by one.
The difference between depth-first search (DFS) and breadth-first search (BFS) is that in DFS, the entire subtree rooted at one child is searched before you move on to any of the node's other children. That is, all of node.child's descendants will be searched before you even look at the node's next child.
We can implement this recursively.
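A recursive sketch in Python (the Node class and function name are our assumptions; the text doesn't fix a particular tree representation):

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def dfs_contains(node, v):
    """Return True if v appears anywhere in the tree rooted at node."""
    if node is None:
        return False
    if node.value == v:
        return True
    # Fully search each child's subtree before moving to the next child.
    return any(dfs_contains(child, v) for child in node.children)
```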
Because this is a tree, we don’t need to be concerned about infinite loops. That is, we don’t need to be concerned about traversing through node’s children and node’s “grandchildren” and accidentally winding up back at node — to be forever stuck in an infinite loop (yikes!). Trees specifically forbid cycles like this.
If this weren’t the case — if we were in a graph instead of a tree — we would have to use an isVisited flag to indicate that we’ve already visited this node.
Problem 6 — Write the pseudocode for breadth-first search on a binary tree.
To perform a breadth-first search, we want to search the tree level by level: we search all of a node's children before we search any of its grandchildren.
As you might recall, a queue is a data structure that allows us to add items on one side and remove them from the other side. This enables us to flag nodes as "to be processed later."
In BFS, we "visit" a node by comparing the value we're searching for (x) to the current value. If it matches, we're done and can immediately return true. Otherwise, we add the node's children to the back of the queue. We then move on, pulling the next node off the front of the queue and searching it.
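Python reads close to pseudocode, so here's a sketch of that queue-based search (the TreeNode class and function name are our assumptions):

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def bfs_contains(root, x):
    """Level-by-level search of a binary tree using a queue."""
    if root is None:
        return False
    queue = deque([root])
    while queue:
        node = queue.popleft()       # pull the next node off the front
        if node.value == x:
            return True
        if node.left:
            queue.append(node.left)  # flag children "to be processed later"
        if node.right:
            queue.append(node.right)
    return False
```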
Because this is a tree, we do not need to worry about winding up in a cycle. However, if this were not the case, we would need to use an isVisited flag to avoid revisiting the same node.
Breadth-first search takes O(N) time, where N is the number of nodes in the graph (or tree), because we might potentially need to search all of the nodes.