Building Falcon — HackMIT@Battle of the Hacks 2016

For the past three years, Andreessen Horowitz (a16z) has hosted and run Battle of the Hacks, a 60-person, 24-hour hackathon for hackathon organizers. This year, I had the opportunity to attend as a representative of HackMIT, along with my teammates Anthony Liu, Anish Athalye, and Logan Engstrom, and we decided to build Falcon.

What is Falcon?

At Battle of the Hacks, we wanted to tackle a problem that has long remained completely unchecked: browser history. We had all had the experience of visiting a cool webpage, then wanting to revisit it a couple days later, only to be completely let down by our browser, computer, and Google search. The reasons for this can vary — sometimes, we remember a webpage by meaning, or a distinctive image, rather than by the URL or title — sometimes, we viewed the page on another device! Take, for instance, the following three examples:

The first example, titled Simple Contracts are Better Contracts: What We Can Learn from the Meltdown of The DAO, is about recent discoveries regarding vulnerabilities in the Ethereum platform. Now, if you skimmed this article, remembered only “Ethereum”, and wanted to revisit it a couple of days later, you were out of luck; the Chrome omnibox, the browser history, and a quick Google search with some more keywords all failed to turn up this article. With Falcon, it is as simple as searching “ethereum”. Falcon also has the huge advantage of being cross-platform: as you can see from the second screenshot, the second article was visited from an Android phone, but a quick search through Falcon for “air force” or anything else you remember will once again bring it right to your fingertips.

What we thought was the coolest feature of Falcon comes from the third example. As you can see, Anish’s friend has sent him a picture of someone riding a surfboard. Now, let’s say it’s a few days later, and Anish wants to show someone else this picture, but he can’t remember who sent it or which of the many messaging platforms he uses it arrived on. In comes Falcon:

Falcon auto-tagged this image with relevant content (The tag “a man riding a wave on top of a surfboard” is generated, not extracted from metadata — you’ll see how in the next section), and Anish is able to find his picture within seconds!

How does Falcon work?

There were four parts to Falcon: the clients, the agents, the server, and the tagger. The agents relay the user’s history back to the server; in our case, they were a Chrome extension and an Android app. When the user visits a page, the agent sends the page source, along with a base64-encoded screenshot of the page as the user sees it. When the server receives this data, it processes it in multiple steps. First, the readable text is extracted from the HTML (removing markup, comments, etc.) and loaded into Elasticsearch, which allows us to index and search the data very quickly. Next, the images are extracted from the page and fed through the tagger, a deep recurrent neural network (RNN); we chose to run NeuralTalk2, using the Torch library and the Lua programming language. We set up and ran the RNN on our own server so that we could write the captions directly into our database, making them searchable too. Finally, the client, which we implemented as both a native Mac client (written in Objective-C) and a Chrome omnibox extension, queries the server with either a fuzzy search (for text) or a tag search (for images) to find exactly what the user is looking for!
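To make the first two steps more concrete, here is a minimal Python sketch of what an agent payload and the server-side text extraction could look like. This is not Falcon's actual code: the function names, the class name, and the JSON field names (`url`, `source`, `screenshot`) are all made up for illustration, and it uses only the Python standard library rather than whatever we used at the hackathon.

```python
import base64
import json
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping markup, comments, scripts, and styles."""

    # Tags whose contents are never shown to the reader as page text.
    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Comments never reach this method; HTMLParser routes them to
        # handle_comment, which does nothing by default.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the readable text of an HTML page as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_visit_payload(url, page_source, screenshot_bytes):
    """Hypothetical agent payload: page source plus a base64 screenshot."""
    return json.dumps({
        "url": url,
        "source": page_source,
        "screenshot": base64.b64encode(screenshot_bytes).decode("ascii"),
    })
```

In a setup like this, the string returned by `extract_text` is what would get indexed for search, and the image captions from the tagger would be stored alongside it.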

What I learned

I would say that I learned two big lessons from Battle of the Hacks v3. The first, and more hackathon-specific, is to know what you are going to do beforehand, and know it well. Before Battle of the Hacks, we had planned out Falcon down to the API routes we were going to use — that way, even if you were working on the client (as I was), you didn’t need to wait for anything from the server; each one of us was writing relevant code within 10 minutes of the announcement to “Start hacking,” and people noticed. With our technique, we were able to finish pretty stress-free well within time, get a couple hours of sleep, prepare our presentation in the morning, and have fun while we were at it. Which brings me to the second thing I learned: hackathons can actually be a lot of fun! In fact, most of the time, it didn’t feel like I was building a product for $25K; I was just building a fun side project with some friends, blasting good music and eating good food — I’ve never really felt like that at a hackathon, but I hope I can start to.

All in all, Battle of the Hacks was a really great time! Falcon ended up winning the grand prize, and people seemed to relate pretty strongly to the problem we were solving. You can check out our splash page, falcon.kim (an odd homage to our HackMIT director by a sleep-deprived teammate), or the Falcon Devpost page.

A final note: While we were hacking, we made (what we think is) a pretty awesome playlist, and other hackers seemed to like it too! Check it out:

Originally published at http://andrewilyas.com/posts/falcon-bh3/.