Preparing for my Amazon internship interview

Yotam Gafni
5 min readFeb 25, 2020

--

I’m in my second year of Masters degree so all the cool kids are scrambling for summer research internship positions in the big companies. This way you can enhance your research with practical problems and also earn some extra dollars. I was setup an interview at Amazon Haifa and was considering how to prepare when I saw the problem sets in the GeeksForGeeks website. I don’t know if it’s pure branding or real, but they have a large section of problems that are titled ‘job interview questions’ and are presumably extracted from job interviews in different companies. Some of this questions are really neat. I’m dedicating this post to the nice tricks and ideas I encountered in these problems (I uploaded my solutions to a github repository ). This was my first experience with what’s called ‘competitive programming’, so it took some time to get accustomed to the format and parsing the input. It is not very well documented in the site. I suggest you first solve 1–2 easy problems just to get used to it. You can see in my code how to do it in python.

I will give the problem with a link to its description, and then describe the interesting parts of my solution.

Minimum sum partition

In general, the problem as defined is NP-complete. If you can find the minimum difference, then if a 0-difference partition exists you will find it, which means you have a solution to the subset sum problem, which is NP-complete.

So by this simple reduction we understand that to find an efficient solution, there must be something unique in the parameters of the problem. The thing you have to notice here is that the sum of the array is bound by A[I]*N <= 50 * 50 = 2500. So we could try to use dynamic programming to mark all feasible partial sums in the array. This approach is similar to the one used in the textbook knapsack FPTAS. You keep an array for all possible partial sums. With each element, you take all the already feasible partial sums marked as 1 in the array, and you mark their sum with the extra element as 1. After you pass through all the elements, the table has all the feasible partial sums. Now look for the one closest to the half of the array sum.

Bonus — once you completed this functionality, you can easily solve https://practice.geeksforgeeks.org/problems/subset-sum-problem/0 and get extra points !

Number formation

I had a hard time with this one.

First, it’s not totally clear what values you want to compute and what recursive relation to use for it. At the end I took the following approach — I generate a table that keeps the sum of all numbers that contain exactly x 4 digits, y 5 digits and z 6 digits. To calculate this, fix some digit in the x+y+z digits the number will be comprised of, and count the number of combinations where it will appear in this spot. If the digit is 4, then it will appear there (x+y+z-1)!/((x-1)!y!z!), and similarly for 5 and 6 with appropriate changes. To account for all the different decimal spots, we multiply this number by 111….111 (x+y+z times).

But then, calculating these factorials everytime is very time consuming. So we do it recursively with dynamic programming and keep a table of it.

This is still not good enough — you run into horrible floating point errors when you start to work with the really large factorials. For this you can use the fact that you are only required to return the answer modulo some not too large prime number. Instead of doing arithmetics in Z (the natural numbers), you do arithmetics in Zp (modulo this prime). The only change this requires is that anytime you want to divide by some number, you instead multiply by its inverse in Zp. For this we preemptively prepare an array of all inverses in Zp, which we calculate using Fermat’s little theorem.

The last important note is that we generate all these tables in advance before we run any specific case. Because they are shared for all cases, we only need to do it once. This proves sufficient to pass the time tests.

Ugly numbers

Not a good name for the question — what’s so ugly about these numbers ? But leave that aside. We look for the first 10⁴ numbers that are only multiples of 2,3,5. The approach I went for is based on the fact that given some numbers are the first n of this form, the next one must be a multiple of some number in the set with 2,3,5. This basic observation allows us to maintain a list of candidates for the next step. For it to not blow up exponentially we need it to be a unique set, and we need to keep it ordered so we can easily pop the next number. I had some trouble finding a good data structure in native python (no exotic imports) that can implement both, so I used a set for uniqueness and heap for ordering. If you’re a fanatic, you can wrap them together in a class that gives the combined functionality.

The following two problems guarantees seemed imaginary — I initially thought there must be some mistake in the description. I ended up implementing the amazingly beautiful algorithms that solve them, I highly recommend looking into them as they are very elegant.

Longest Palindromic substring in linear time

In this problem, it’s obvious that you need Omega(n) (at least n operations), and also that it’s easy to achieve O(n³) (look at all possible starting places, all possible lengths, and then check each candidate by running through it). It’s also kind of obvious that there are many connections between candidates and you don’t need to really check all one by one, but Manacher’s algorithm really optimizes this to achieve a best possible asymptotic solution.

Merge without extra space

There is a 1998 paper that solves the general problem of merging an array in O(nlogn) time and O(1) space, so it sounds ludicrous that in a job interview someone will expect you to solve a problem that was only determined in an article not so long ago. But in our case the array is in fact ‘almost sorted’ in a sense and so there is a very elegant solution. What’s intriguing about the solution is that unlike regular shell sort, where you need to really sort each offsets subarray, In this case going over the subarray and substituting neighbors is enough to sort it. Try proving why — has to do with invariants that are kept from the initial state to the last.

And now for the funny bit — I arrived at the Amazon interview and they didn’t ask me any programming or algorithm questions. We just discussed research interests and expectations, so nothing of this sort. But I will have a second interview later this month, where all this practice may come in handy.

--

--

Yotam Gafni

After some years in the start up industry, I ~retired to do my Phd in game theory and fair allocation, which I’m currently pursuing.