Foobar Update and Adventures in Optimization

Published in

Atha Data Science

5 min readJul 14, 2020

June 11th, 2020

So I’m back, writing from the other side of my time inside Google’s foo.bar, and ready to reflect on what a transformative learning experience that was.

Ultimately, this go around, I made it through the first 2 (of 5) levels. And honestly, I have no hard feelings about that at all — I began teaching myself Python two months ago, and I was able to engage with and pass several of the hardest coding challenges I’ve ever encountered, published by Google, and had an absolute blast doing it. So let’s review!

Level 1 I described in my last post, but will elaborate on to say that my first (“brute force”) approach to the problem did not utilize list comprehension at first, but rather for-loops and iterative appends to lists. Indeed, I had only just been introduced to list comprehension and this challenge was my first real application of it. The problem itself had to do with generating and accessing a very long concatenated string of prime numbers. Version 1 was definitely slower, and the dramatic improvement of performance seen by implementing list comprehension was a really rewarding and tangible way of learning the benefit of this method.

In the second level, optimization would become integral. Level 2 consisted of 2 challenges, the first of which involved accepting as arguments numbers of any base from 1 to 9, performing specified operations on them until the output values ended up in a repeating loop, and determining how many repeating values in that loop the output was alternating between. I solved this problem in about 3 hours, and with great efficiency, and was quite proud of my solution and ability to pass all the test cases.

Level 2’s second challenge proved a great deal more difficult. It was an interesting and tricky take on creating a sort of inverted Floyd’s triangle, but incrementing by 1 along the diagonal instead of row by row — which causes all of the triangular numbers up to (I determined to be sufficient) two times value of the index you are trying to access to end up along the “x-axis”. Creating this triangle proved to be quite the mental challenge, and when it was said and done I had done it with 4 completely different approaches.

The key difficulty with the problem was that the challenge wanted me to access which number was located at certain “x” and “y” coordinates, meaning find what number when appending incrementally larger numbers on the diagonal ended up in a certain row and column. Essentially, it wanted a structure which could be indexed in 2 directions, with indices of up to 100,000,000 each. And in all of my iterative designs, I had it working! Except, it was each time failing just 2 hidden test cases, which I imagine must have been very large numbers, since every version of my program would time out with very large inputs. Why you ask? Well, that all has to do of course with optimization…

My various designs included: first creating a Floyd triangle comprised of lists of numbers, then indexing those lists in such a way to build the desired triangle by rearranging the original triangle. Appending values to a set of lists and then appending to a second set of lists, definitely rough on the computational time and power. But, this brute force method was at least passing all but those 2 test cases, so from this framework I went about optimizing.

Other designs to try to mitigate the time-out-inducing processing complexity included:

First appending to a dictionary with a for loop and then using dictionary comprehension to first make the Floyd’s and then the manipulated Floyd’s triangle, thinking that the use of both comprehension and dictionaries (must faster than lists since then can be accessed by keys instead of indexed!) would improve computation time
Using list comprehension followed by dictionary comprehension to do the same
Using only dictionaries to make the two consecutive triangles.
Trying to implement a generator to remember where in the first triangle creation the program was, to create the second from that place directly, without having to save all the numbers to memory (unsuccessful).
Using a double ended deque with a fixed length as a sliding window to try to do the same (unsuccessful).
Using numpy arrays (but I don’t really know anything about these yet and couldn’t get them to work).
Then finally shifting approaches completely and figuring out a mathematical formula to create the sequence of numbers in the inverted target triangle directly without first creating a Floyd triangle. The thinking was that I could hopefully cut the computation time in half (and it was a very tricky formula to derive!!) I created the inverted triangle directly, two ways: via two loops of appends to dictionaries, and another way with list comprehension.

They all failed the 2 hidden test cases.

I don’t even know for sure that it was because of computation time. I guess there’s no way of knowing what test cases I was failing as they were hidden, but none of the above would process fast enough to yield an answer for very large input numbers, so I assume it was still complexity issues.

But I’m not mad! I approached this problem so many different ways, and learned so much. In the future, I’m sure I’ll have more tricks up my sleeve that I can try, and get this to work.

Just after running out the clock on the 72 hour timer for the challenge, encountered this rather apt quote somewhere buried in a stack-exchange, on the subject of large data sets… “[they shouldn’t be a problem unless] naive use makes many small appends.” Well, call me naive — but only for now. I knew I needed to eliminate the repetitious appends which were sending my big(O) through the roof — I just didn’t know what to replace them with. And that’s exactly why I’m continuing to improve and build my skills.

I surprised myself with how very passionate about this I am — I was coding, continuously, about 12 to 15 hours a day, and doing so excitedly — which only affirms that I’m doing the right thing. And I proved to myself how many different ways I’m able to reshape and look at a problem, bringing in new creativity when I hit a wall. Last, I learned just how much new info I’m able to absorb and implement effectively on a pretty immediate basis. This was a huge moment of growth for me as a python developer.

I’ll be keeping the various entries on my github for reference: https://github.com/anterra/google-foobar/tree/master (don’t worry, Google randomly generates numerous versions for each applicant, so your ability to cheat off of these is very low. Plus, foobar is about the learning experience! Try it on your own 😉). I can’t wait to come back to the foo.bar someday in the future with more experience under my belt!

Foobar Update and Adventures in Optimization

June 11th, 2020

Written by Anterra Kennedy