Yves Mulkers
Jul 10, 2017 · 3 min read

The New Science of Building Successful Data-Driven Apps


A few years back, when Bit.ly needed to burn in its new Hadoop cluster, the company’s chief scientist at the time, Hilary Mason, and her team decided to use the cycles to analyze three years’ worth of Bit.ly traffic simply to determine whether cats or dogs were more popular.

Dogs won, it turns out. And Mason freely admitted that the project was “a massive waste of energy.” But it also offered a valuable lesson in what was possible.

“The idea that we had computation power so cheap that we could apply it to something so absolutely trivial really blew my mind,” she told the audience at the Anacondacon conference, held earlier this year in Austin, Texas. Something that wasn’t even feasible a few years back is now so easy to carry out that it can be applied to even the most trivial uses.

And that is where we are with data applications today, Mason explained.

Thanks to abstractions and frameworks, development that used to be done in weeks by a team now can be done in a single day, by a single coder. There’s a lot of good work going on with natural language design, unstructured data parsing and other data analysis techniques. And by this point, most businesses have a glut of data merely as a side effect of doing business. Why not put all these elements together?

This year’s conference, run by Continuum IO, addressed how the company’s distributions for Python and the R statistical language are increasingly being used for this sort of data science work. And Mason was the perfect person to address this emerging market. After leaving Bit.ly, Mason founded Fast Forward Labs, a Brooklyn-based consultancy that focuses on helping organizations design data-driven apps.

Mason defined data products as any app or service that relies on data to produce value of some sort for the user. Perhaps the best example of a successful data product is Google Maps, which relies on real-time location data and a set of prediction algorithms. Mason noted that Google Maps has been successful chiefly because it is “boring.” “Anyone in western society can look at this and they’d know how to read it,” she said.

“There’s a lot of computation here but you don’t need to know anything about it to use this product,” she said. “And yet it could not exist without that computation.”

In general, a successful data app addresses some need of high value to the populace at large: it executes a task we would otherwise pay a lot of money to complete, a task that is important to do well, or both. Google Maps excels at both.

Before you embark on building your own Google Maps, however, Mason advised bearing a number of issues in mind (“and there are a lot of them, unfortunately,” she noted).

Here is one to keep in mind, for those thinking of releasing an internal app for public distribution: The generic formulation of the problem you are trying to solve is 10 times more complicated than the specific problem you want to solve.

Posted on 7wData.be.

