Greedy Algorithm Explained using LeetCode Problems

Li Yin

Published in

Algorithms and Coding Interviews

14 min readOct 27, 2018

This article includes five sections:

What is Greedy Algorithm?
Guideline of Greedy Algorithm
LeetCode Examples
Exercises
Answers

Check my github repo for a lot more!

1. What is Greedy Algorithm ?

It is hard to define what greedy algorithm is. In my opinion, it is a very natural solution for problems that it can solve, and any usage of dynamic programming will end up to be “overkill”.

With dynamic programming, at each step we make a choice which usually dependents on and by comparing between the multiple solutions of the recurrence relation. Greedy algorithm is way easier than that! We find a rule, sort the items by some type of ordering — time, distance, size, or some type of ration, and we construct our optimal solutions incrementally w/o considering preceding items or choices incrementally and we end up having our optimal solution. If we have multiple optimal solutions, usually greedy algorithm will only give us one! In dynamic programming, we solve subprolems before making the first choice and usually processing in a bottom-up fashion; a greedy algorithm makes its first choice before solving any subprolems, which is usually in top-down fashion, reducing each given problem instance to a smaller one.

Due to the special relationship between greedy algorithm and the dynamic programming: ‘’beneath every greedy algorithm, there is almost always a more cumbersome dynamic programming solution”, we can try the following six steps to solve a problem which can be potentially solved by making greedy choice:

Divide the problem into subproblems, including one small problem and the remaining subproblem.
Determine the optimal substructure of the problems (formulating a recurrence function).
Show that if we make the greedy choice, then only one subproblem remains.
Validate the rightness of the greedy choice.
Write either a recursive or an iterative implementation.

2. Guideline of Greedy Algorithms

Pros:

Simplicity: Greedy algorithms are often easier to describe and code up than other algorithms.
Efficiency: Greedy algorithms can often be implemented more efficiently than other algorithms.

Cons:

Hard to design: Once you have found the right greedy approach, designing greedy algorithms can be easy. However, finding the right approach can be hard.
Hard to verify: Showing a greedy algorithm is correct often requires a nuanced argument.

Types of Greedy Algorithm

Activity-Selection: given a set of activities with start and end time (s, e), our task is to schedule maximum non-overlapping activities or remove minimum number of intervals to get maximum non-overlapping intervals. In cases that each interval or activity has a cost, we might need to retreat to dynamic programming to solve it.
Frog Jumping: usually it is array. This normally corresponds with the coordinate type of dynamic programming problems.
Data Compression
File Merging
Graph Algorithms, such as Minimum Spanning Trees algorithms (prim and Kruskal), Minimum path (Dijkstra). For weighted graph that has negative values, we have to use dynamic programming, such as Bellman Ford algorithms.

Two Types of Proof Methods

Exchange Arguments
Greedy Stays Ahead

Check out materials from Stanford.

3. LeetCode Examples

To identify a greedy problem: pay attention to the question they ask just as in Dynamic Programming.

True/False
Maximum/Minimum number

3.1 Activity-Selection

435. Non-overlapping Intervals(medium)

Given a collection of intervals, find the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.

Note:

You may assume the interval’s end point is always bigger than its start point.
Intervals like [1,2] and [2,3] have borders “touching” but they don’t overlap each other.

Example 1:

Input: [ [1,2], [2,3], [3,4], [1,3] ]Output: 1Explanation: [1,3] can be removed and the rest of intervals are non-overlapping.

Example 2:

Input: [ [1,2], [1,2], [1,2] ]Output: 2Explanation: You need to remove two [1,2] to make the rest of intervals non-overlapping.

Example 3:

Input: [ [1,2], [2,3] ]Output: 0Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

Answer: Before we use the greedy algorithm, first let us see what is takes to do dynamic programming? If we first sort the intervals according to the start, then it is equivalent to find the Longest increasing subsequence, here the “increasing” means the start time of current interval needs to be larger or equal to the last interval’s end.

def eraseOverlapIntervals(self, intervals):
        """
        :type intervals: List[Interval]
        :rtype: int
        """
        if not intervals:
            return 0
        intervals.sort(key=lambda x: (x.start, x.end))
        max_count = 0
        LIS = [0]*(len(intervals)+1) # the LIS for array ends with index i
        for i in range(len(intervals)): # start with 10
            max_before = 0
            for j in range(i): 
                if intervals[i].start >= intervals[j].end:
                    max_before = max(max_before, LIS[j+1])
            LIS[i+1] = max_before+1
        return len(intervals)-max(LIS)

Now, use the greedy algorithm, which requires us to sort the intervals with the finish time. We start with empty set. Then we iterate through the list, if current interval overlaps with any previous interval.

class Solution:
    def eraseOverlapIntervals(self, intervals):
        """
        :type intervals: List[Interval]
        :rtype: int
        """
        if not intervals:
            return 0
        min_rmv = 0
        intervals.sort(key = lambda x: x.end)
        last_end = -sys.maxsize
        for i in intervals:
            if i.start < last_end: #overlap, delete this one, do not update the end
                min_rmv += 1
            else:
                last_end = i.end
        return min_rmv

630. Course Schedule III (hard)

There are n different online courses numbered from 1 to n. Each course has some duration(course length) t and closed on dth day. A course should be taken continuously for t days and must be finished before or on the dth day. You will start at the 1st day.

Given n online courses represented by pairs (t,d), your task is to find the maximal number of courses that can be taken.

Example:

Input: [[100, 200], [200, 1300], [1000, 1250], [2000, 3200]]
Output: 3
Explanation: 
There're totally 4 courses, but you can take 3 courses at most:
First, take the 1st course, it costs 100 days so you will finish it on the 100th day, and ready to take the next course on the 101st day.
Second, take the 3rd course, it costs 1000 days so you will finish it on the 1100th day, and ready to take the next course on the 1101st day. 
Third, take the 2nd course, it costs 200 days so you will finish it on the 1300th day. 
The 4th course cannot be taken now, since you will finish it on the 3300th day, which exceeds the closed date.

Note:

The integer 1 <= d, t, n <= 10,000.
You can’t take two courses simultaneously.

Answer: This problem is more complex than the normal activity selction problem. Because we are limited by the valid ending time. So our finish time needs to be smaller than that.

Considering how similar this problem is to the previous activity selection, we try to sort them by the deadline. For example, [[5,5],[4,6],[2,6]], after sorted it would be [[5, 5], [4, 6], [2, 6]]. However, if we choose [5,5], we only get 1, where the right answer is 2 to choose the later two results. So, what should we do instead?

If we sort them by d , it failed as we saw before. But it kind of making sense that first deadline goes first.
If we sort by t , [2, 6], [3,3], [4, 4], we can only get [2, 6], instead we have [3,3][2,6] or [4,4], [2, 6] as our optimal. But, still, it makes some kind of sense that the smaller the duration is, the better, because it leaves more resources for others.
If we sort by d-t , [[5, 5], [4, 6], [2, 6]] will still gives us wrong answer.

Wow, this is indeed difficult. The ordering between our optimal solution does not matter. What if I first construct some good enough solution by sorting with d , and we convert our problem to finding the maximum number of courses in range d if our start time is 0. At first, at at time 5, our best solution is [5,5]. When we are at time 6 when we are looping at [4, 6], we know replacing [5,5] with [4, 6] is better because it leaves more space to fit in other activities at least at this stage. Therefore, our solution is to we keep track of all selected activities, and assume we have i items in the selected activities with fi, now, with i+1th activity with d(i+1)

if it does not overlap, we push it in because this is the optimal for d(i+1),
if not, we find one that has the maximum time t, if ti≥ t(i+1), then replace ti with t(i+1) will result in earlier finish time, and longer deadline in all, leaving more space for the remaining activities to fit in.

|___________|ti__fi__di
Because the ordering between these i intervals does not matter, we assume i is the one has the largest ti. If we have ti > ti+1, f(i-1)+t(i+1)<f(i-1)+ti<di<di+1.

The code is as:

def scheduleCourse(self, courses: List[List[int]]) -> int:
        courses.sort(key=lambda x: x[1])
        ans = 0
        f = 0
        q = []
        for t, d in courses:
            if f + t <= d:
                heapq.heappush(q, -t) #use -t to get use it as maxheap
                f += t
            elif q:
                t_max = -q[0] 
                if t_max >= t: #replace
                    heapq.heappop(q)
                    heapq.heappush(q, -t)
                    f=f+t-t_max
                    
        return len(q)

Code-wise, it can be optimized as:

def scheduleCourse(self, courses: List[List[int]]) -> int:
        courses.sort(key=lambda x: x[1])
        #print(courses)
        ans = 0
        f = 0
        q = []
        for t, d in courses:
            heapq.heappush(q, -t)
            f += t
            if f > d:
                f += heapq.heappop(q)
        return len(q)

Also, to implement the priority queue first pop out the largest number we can put the negative value in, the code above can be rewritten into:

def scheduleCourse(self, A):
    pq = []
    start = 0
    for t, end in sorted(A, key = lambda (t, end): end):
        start += t
        heapq.heappush(pq, -t)
        while start > end:
            start += heapq.heappop(pq)
    return len(pq)

3.2 Jumping

55. Jump Game

Given an array of non-negative integers, you are initially positioned at the first index of the array.

Each element in the array represents your maximum jump length at that position.

Determine if you are able to reach the last index.

Example 1:

Input: [2,3,1,1,4]
Output: true
Explanation: Jump 1 step from index 0 to 1, then 3 steps to the last index.

Example 2:

Input: [3,2,1,0,4]
Output: false
Explanation: You will always arrive at index 3 no matter what. Its maximum
             jump length is 0, which makes it impossible to reach the last index.

Answer: the naive solution is we use a memo to represent if we can get pos i. We check each location, and make all the positions that it can get to true. This includes two embedded for loops, which gives out O(n²) time complexity and O(n) space complexity. This actually use the coordinate type dynamic programming. We would get LTE from LeetCode

The code is as follow:

class Solution(object):
    def canJump(self, nums):
        """
        :type nums: List[int]
        :rtype: bool
        """
        f = [False]*len(nums)
        if not nums:
            return True
        f[0] = True #start from first pos
        for i in range(len(nums)):
            if f[i] == False: # if we cant get i, then we dont use this step
                continue
            for j in range(nums[i]+1): #mark all the positions that we can get from current pos i
                if i+j < len(nums):
                    f[i+j] = True
        return f[-1]

How to be greedy? Previously in the dynamic programming, at each step, we need to consider multiple choices. Here we consider the greedy one: the right most position from current index can get. We use a max to track the right most position in the whole process.

class Solution(object):
    def canJump(self, nums):
        """
        :type nums: List[int]
        :rtype: bool
        """
        if not nums:
            return True
        maxReachableDistance = 0
        for i in range(len(nums)):
            if i > maxReachableDistance: #means we cant get to current position
                break
            maxReachableDistance = max(maxReachableDistance, i+nums[i])
        return maxReachableDistance>=len(nums)-1

45. Jump Game II

Given an array of non-negative integers, you are initially positioned at the first index of the array.

Each element in the array represents your maximum jump length at that position.

Your goal is to reach the last index in the minimum number of jumps.

Example:

Input: [2,3,1,1,4]
Output: 2
Explanation: The minimum number of jumps to reach the last index is 2.
    Jump 1 step from index 0 to 1, then 3 steps to the last index.

Note:

You can assume that you can always reach the last index.

Answer: we follow the last problem, the difference is we get the minimum number of jumps, which is still a typical dynamic programming problem. We use dp to record the minimum number of jumps to get index i. Because the greedy algorithm is always tricky, so going for the dynamic programming should be the first choice.

With dp, now let use look at this example,

dp[0]= 0
i =0, nums[0]=2:  dp[1] = min(dp[1], dp[0]+1) = 1;  dp[2] = min(dp[2], dp[0]+1) = 1
i = 1, nums[1]=3: dp[2] = min(dp[2], dp[1]+1) = 1;  dp[3] = min(dp[3], dp[1]+1) = 2; dp[4] = 2
....

By doing a simple example, we can get the relation before i and j: dp[i+j+1] = min(dp[i+j+1], dp[i]+1). The explanation can be: we track the the min number of jumps taken for every location we can get starts from i (that is i+j+1) by comparing the previous value dp[i+j+1] with dp[i]+1. The python code is :

class Solution:
    def jump(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        if not nums or len(nums) < 2:
            return 0
        dp = [sys.maxsize]*(len(nums)) # the minimum steps taken to get to index i
        dp[0] = 0
        for i in range(len(nums)):
            for j in range(nums[i]): #step j+1
                if i+j+1 < len(nums):
                    dp[i+1+j] = min(dp[i+1+j], dp[i]+1)           
        return dp[-1]

Unfortunately, the above dp solution will get us LTE error. Now, how to solve it greedily? Keep the current maximum reach distance, and the number of steps to reach this current maximum distances, and keep another variable to record the next maximum reachable distance, which cost the current steps plus 1. The key idea behind is that, all the positions before the maximum reachable distance would be able to be reached! Then we linear scan the array to keep updating the current maximum and the next maximum as well as the number of steps. We can achieve the linear time algorithm. The key idea behind the linear algorithm is that instead of keeping to know every position is reachable by how many steps, we only need to keep a single maximum reachable distances and the steps needed. So every time we only need to update this single value in constant time rather than update a linear portion of positions.

class Solution:
    def jump(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
   
        if not nums or len(nums) < 2:
            return 0
        maxReachableDistance, maxNextAvailableDist, minSteps = 0, 0, 0
        for i in range(len(nums)-1):
            maxNextAvailableDist = max(maxNextAvailableDist, i+nums[i]) # track the max of next index
            if i == maxReachableDistance: # current index is the furthest we can get, have to do another step to get maxNext
                minSteps += 1
                maxReachableDistance = maxNextAvailableDist
        return minSteps

4. Exercise

860. Lemonade Change (Easy)

At a lemonade stand, each lemonade costs $5.

Customers are standing in a queue to buy from you, and order one at a time (in the order specified by bills).

Each customer will only buy one lemonade and pay with either a $5, $10, or $20 bill. You must provide the correct change to each customer, so that the net transaction is that the customer pays $5.

Note that you don’t have any change in hand at first.

Return true if and only if you can provide every customer with correct change.

Example 1:

Input: [5,5,5,10,20]
Output: true
Explanation: 
From the first 3 customers, we collect three $5 bills in order.
From the fourth customer, we collect a $10 bill and give back a $5.
From the fifth customer, we give a $10 bill and a $5 bill.
Since all customers got correct change, we output true.

Example 2:

Input: [5,5,10]
Output: true

Example 3:

Input: [10,10]
Output: false

Example 4:

Input: [5,5,10,10,20]
Output: false
Explanation: 
From the first two customers in order, we collect two $5 bills.
For the next two customers in order, we collect a $10 bill and give back a $5 bill.
For the last customer, we can't give change of $15 back because we only have two $10 bills.
Since not every customer received correct change, the answer is false.

Note:

0 <= bills.length <= 10000
bills[i] will be either 5, 10, or 20.

455. Assign Cookies (easy)

Assume you are an awesome parent and want to give your children some cookies. But, you should give each child at most one cookie. Each child i has a greed factor gi, which is the minimum size of a cookie that the child will be content with; and each cookie j has a size sj. If sj >= gi, we can assign the cookie j to the child i, and the child i will be content. Your goal is to maximize the number of your content children and output the maximum number.

Note:
You may assume the greed factor is always positive.
You cannot assign more than one cookie to one child.

Example 1:

Input: [1,2,3], [1,1]Output: 1Explanation: You have 3 children and 2 cookies. The greed factors of 3 children are 1, 2, 3. 
And even though you have 2 cookies, since their size is both 1, you could only make the child whose greed factor is 1 content.
You need to output 1.

Example 2:

Input: [1,2], [1,2,3]Output: 2Explanation: You have 2 children and 3 cookies. The greed factors of 2 children are 1, 2. 
You have 3 cookies and their sizes are big enough to gratify all of the children, 
You need to output 2.

874. Walking Robot Simulation

A robot on an infinite grid starts at point (0, 0) and faces north. The robot can receive one of three possible types of commands:

-2: turn left 90 degrees
-1: turn right 90 degrees
1 <= x <= 9: move forward x units

Some of the grid squares are obstacles.

The i-th obstacle is at grid point (obstacles[i][0], obstacles[i][1])

If the robot would try to move onto them, the robot stays on the previous grid square instead (but still continues following the rest of the route.)

Return the square of the maximum Euclidean distance that the robot will be from the origin.

Example 1:

Input: commands = [4,-1,3], obstacles = []
Output: 25
Explanation: robot will go to (3, 4)

Example 2:

Input: commands = [4,-1,4,-2,4], obstacles = [[2,4]]
Output: 65
Explanation: robot will be stuck at (1, 4) before turning left and going to (1, 8)

Note:

0 <= commands.length <= 10000
0 <= obstacles.length <= 10000
-30000 <= obstacle[i][0] <= 30000
-30000 <= obstacle[i][1] <= 30000
The answer is guaranteed to be less than 2 ^ 31.

861. Score After Flipping Matrix (Medium)

We have a two dimensional matrix A where each value is 0 or 1.

A move consists of choosing any row or column, and toggling each value in that row or column: changing all 0s to 1s, and all 1s to 0s.

After making any number of moves, every row of this matrix is interpreted as a binary number, and the score of the matrix is the sum of these numbers.

Return the highest possible score.

Example 1:

Input: [[0,0,1,1],[1,0,1,0],[1,1,0,0]]
Output: 39
Explanation:
Toggled to [[1,1,1,1],[1,0,0,1],[1,1,1,1]].
0b1111 + 0b1001 + 0b1111 = 15 + 9 + 15 = 39

Note:

1 <= A.length <= 20
1 <= A[0].length <= 20
A[i][j] is 0 or 1.

5. Answers

For the easy questions, we actually do not need to think too much about the algorithms or rules, we follow the instinct ^_^.

860. Lemonade Change (Easy)

class Solution:
    def lemonadeChange(self, bills):
        """
        :type bills: List[int]
        :rtype: bool
        """
        counters = {5:0, 10:0, 20:0}
        for bill in bills:
            counters[bill] += 1
            billChange = bill- 5 # charge 5, 0, 5, or 15
            if billChange == 15:
                if counters[10] > 0: #being greedy, because 5 is more useful than 10
                    counters[10] -= 1
                    billChange -= 10  #5
            counters[5] -= billChange/5if counters[5] < 0:
                return False
        return True

455. Assign Cookies (easy)

Between two sequences, find the maximum pairs that sj>=gi, the greedy choice is we assign the closest size of cookie to satisfy one kid, min |s_j — g_i|(for each j), if we sort these two lists, then we go through the g list, then the first element in S that is >= to the current one then it is good. We use two pointers each for each list, and the time complexity would only be O(n).

class Solution:
    def findContentChildren(self, g, s):
        """
        :type g: List[int]
        :type s: List[int]
        :rtype: int
        """
        if not g or not s:
            return 0
        g.sort()
        s.sort()
        i, j, counter = 0, 0, 0
        while i < len(g) and j < len(s):
            if s[j] >= g[i]:
                counter += 1
                i += 1
                j += 1
            else:
                j += 1
        return counter

874. Walking Robot Simulation

Answer: for this question, I think the most important and difficult is not about the algorithm itself, it is about how to implement the change of direction, and how to check the obstacle efficiently.

class Solution:
    def robotSim(self, commands, obstacles):
        """
        :type commands: List[int]
        :type obstacles: List[List[int]]
        :rtype: int
        """
        
        dx = [0, 1, 0, -1]
        dy = [1, 0, -1, 0]
        x = y = di = 0
        obstacleSet = set(map(tuple, obstacles))
        ans = 0for cmd in commands:
            if cmd == -2:  #left
                di = (di - 1) % 4
            elif cmd == -1:  #right
                di = (di + 1) % 4
            else:
                for _ in range(cmd): 
                    if (x+dx[di], y+dy[di]) not in obstacleSet:
                        x += dx[di] # each step we increase one
                        y += dy[di]
                        ans = max(ans, x*x + y*y)
                    else:
                        break
        return ans

Greedy Algorithm Explained using LeetCode Problems

1. What is Greedy Algorithm ?

2. Guideline of Greedy Algorithms

Types of Greedy Algorithm

Two Types of Proof Methods

3. LeetCode Examples

3.1 Activity-Selection

3.2 Jumping

4. Exercise

874. Walking Robot Simulation

5. Answers

874. Walking Robot Simulation

Written by Li Yin