Lessons learned from Google Hash Code

plapadoo
plapadoo
Published in
3 min readMar 4, 2018

“In programming, the hard part isn’t solving problems, but deciding what problems to solve.” — Paul Graham

As we had to find out at this year’s Google Hash Code, solving problems is also very hard.

Hash what?

A little context, first: Google Hash Code is a programming competition. You get a task from Google, along with a few input files. You’re supposed to write a program that reads an input file and produces an output file which solves the problem. The hard part: you have to do it in under 4 hours!

Cars

This year’s problem involved scheduling rides. Suppose you’re in a city. People want to get around. And now imagine you’re in the future! So instead of taking the bus, people tell their smart phone: “get me from A to B”. And some time after that, an autonomous vehicle shows up at the intersection you’re standing on and drives you to your destination.

That’s all fine and there’s no problem, really — except, you know, for designing the self-driving cars, but let’s suppose that’s solved. However, the city is big, and there are lots of people that want to get around. And there’s lots of self-driving cars. How do you tell the car where to go next?

And there is the problem. You have your routes (people tell you in advance when they want to go from where to where) and your cars, and you’re supposed to figure out which car takes which routes.

Our approach(es)

It was interesting to observe how we spent the 4 hours of work. The first few minutes we spent understanding the problem. Everybody carefully read the problem statement PDF file for themselves. The next 90 minutes (which, in hindsight, was too long) were spent discussion possible approaches. In this stage, you could really tell which team member studied which subjects in college. The systems scientist opted for an agent-based model, simulating each car as an agent in turn. The complex scheduling guy had an annealing approach in mind.

So as a result, we decide to implement two solutions, and see which one performed better. One clever, but a little complex agent-based approach, and a really simple one that’s based on simulated annealing.

In the end, only one approach got a result, the other, more clever one, didn’t finish in time. We implemented both in Python, since it offers lots of ready-made scientific libraries, and using libraries was explicitly allowed.

Us watching the final live stream

What we learned

There are a few takeaways from this competition:

  • Know your libraries, know when you’re out of your depth: We noticed pretty quickly that, although we knew the widely-used numpy library was a good way to write efficient array-based code in Python, and we thought we knew this library, that we spent lots of open browser-tabs googling functions we actually didn’t know. And in the end ditched numpy altogether.
  • Use the same tools you use in production code: When you don’t have much time, you might be tempted to omit verification tools and linters from your workflow, since those take some time to do their work, and you’re writing a “small” program anyway. But we noticed that there were lots of places where we’d have made really stupid mistakes, were it not for pylint and mypy to save our asses.
  • Check your clock: Programming can quickly take up your whole mind, letting you forget to eat, drink, and…watch the clock. Set timers for yourself. Discard things that take up too much time and replace them by simpler things.

--

--