My First Failed Algorithm —The Classic ‘Weighted Job Scheduling’ Problem
Participating in a coding interview has to be one of the most intimidating aspects of conducting a job search as a new software developer. You have to pull yourself away from the projects you are working on, and dedicate some of that time to stretching your problem solving skills, and understanding of data structures and algorithms, just to have a shot at an offer. Problem solving can be a lot of fun, and is a source of enjoyment for many in the field, but when you have a potential offer riding on your ability to develop an optimal solution, the pressure can be immense.
My first coding challenge was a couple months ago, just as I finished bootcamp and was trying to wrap my head around heaps and Dijkstra’s algorithm. The position was for new-graduates who would be placed in a rotational program in the software team for this social media company. I felt it was a great fit for my entry level skills, and was fortunate to have my resume accepted. Then came the coding challenge. I received an email containing a link to complete an automated test within the next 3 days. I spent the first 2 blowing through an algorithm and data structure course online, hoping any of the information would stick and help me solve the problems. I hadn’t put in the time I needed in order to feel prepared to tackle the challenges.
Once I started the test I was relieved to see the first 2 questions weren’t complicated. They involved using binary search and multiple pointers, fairly easy skills to pick up and understand. Then came the third and final problem. It was a version of the classic algorithm that I had yet to encounter, the ‘Weighted Job Scheduling’ problem. The problem goes, given an array of jobs, each containing a start time, and end time, and a profit, determine the maximum profit you can obtain from completing non-overlapping jobs. The problem can be arranged in many different ways. I have seen it described as ‘2 arrays, the first array, n, containing start times of presentations, and the other array, m, containing the length of presentations, where n[i] relates to m[i]’ and you are tasked to find the maximum presentations that can take place. The input could be one or two arrays, or an object, containing more arrays or objects with keys. For my problem, the input could look like the following:
jobs = [[1,2,15],[1,3,20],[2,4,40],[3,5,75],[4,5,50],[1,4,35]]
jobs[i] = start-time
jobs[i] = end-time
jobs[i] = profit
So, having not practiced much dynamic programming, I was naive of the complexity of the problem, and how to solve it. I tried to brute force my way to a solution, given the first test case, but each test case proved to break my solution as the input grew and became unsorted. Long story short, I failed the challenge and did not progress in the process.
This is a problem looking for an optimal solution, and it wasn’t until I understood how to use tabulation, that I was able to crack the solution. Looking back at that input, if we try to find the optimal solution while we go through the array, it can look like this:
jobs = [[1,2,15]]
optimal solution is jobs as its the only optionjobs = [[1,2,15],[1,3,20]]
optimal solution is jobs as jobs > jobs and they conflictjobs = [[1,2,15],[1,3,20],[2,4,40]]
now [jobs,jobs] is the optimal solution as they don't conflict, and generate a larger combined profit than jobs.jobs = [[1,2,15],[1,3,20],[2,4,40],[3,5,75]]
but now [jobs,jobs] are optimal
As you can see, every time we consider an additional input, we have to put that in context of what we have already seen, and reevaluate what is optimal, which seems to be an complex task when dealing with large data sets. How can we keep track of every potential solution as we take in more data? The answer is tabulation.
By storing the maximum optimal profit of the data we have evaluated each step of the way, we are able to grab that profit and add it to our next compatible job as we assess each job. All we need is an array and a helper function.
In our main function, we could accept an array, but for this example I hard code one in. We then sort that array by job start time. Where the start time is the same, we sort by end time, and where that is the same, we sort by profit. This also helps a lot with understanding how the function is working. After this sort function, our array would look like this:
We then pass in our array and its length. If the length is zero, we know there is no profit, and end the function. We then create our ‘table’ array for our tabulation, in order to store our rolling maximum profit. We make the length n+1, so we can store and update our return maximum profit in table[n]. We set the profit of table and table[n] to array, because at this point, that is currently our max as there is no other profit to consider. Now, since we have already considered our first job, we begin looping through our input from array. After grabbing our value, we run our helper function, and pass in our job array, the index we are on, and our table of maximum profits. This helper function, ‘findPreviousJob’, loops backwards through our job array from the index we were just at, and finds the most recent job whos end time does not conflict with our current job. Once it finds that job (jobs[j]), it returns that job’s associated maximum profit, table[j], as we know that our current job profit can be added to that number. If there is no previous compatible job, we return 0. As stated, we then add that returned 0 or previous compatible maximum profit to our variable storing the current job profit, which gives us the maximum optimal profit including our jobs[i]. We store this maximum profit in table[i], and if it is greater than our optimal profit, table[n], we update that value as well.
Again, our table array is storing rolling maximum profits for their respective job, where table[i] references the maximum profit that can be obtained up to and including job[i].
Once we have completed the table, we will have an array that stores the maximum profit obtainable when considering each job, and our last value, table[n] will be the maximum value from that array. With this given input from main(), our sorted job array looks like this:
[ 0, 1, 20 ],
[ 0, 4, 50 ],
[ 1, 2, 20 ],
[ 3, 4, 10 ],
[ 3, 5, 100 ],
[ 4, 5, 15 ]
Our completed table looks like this:
[20, 50, 40, 50, 140, 65, 140]
Lastly, our optimal profit is table[n] which is 140.
As you can see, while 65 was the maximum obtainable profit when considering the final job, it did not lead to our optimal solution because it was not compatible with the previous max profit, which is why it was important to keep track of each possibility, and store the optimal solution in table[n].
Tabulation and dynamic programming are powerful skills and it is important to be able to identify when those skills can be implemented. If you want a refresher specifically on these concepts, check out my blog on them here.
I am glad I was able to go back and crack the solution to this problem. I hope I presented it in a digestible way, and you are able to employ these concepts when challenged by a similar problem.