Winning with data — your guide to the data-driven world.

Andrei Rebrov
Published in HackerNoon.com
5 min read · Oct 4, 2016


A couple of months ago I was at the AWS Summit and had a conversation with the Looker team. We talked a lot about data analytics, existing solutions, and approaches, and before I left, their representative gave me a book that I want to review today. The book is called ‘Winning with data’, and it’s great because it gives you a clear path for starting to work with the data you own but probably don’t know how to use properly.

That was another reason why I enjoyed this book so much: it came right on time, because at Scentbird we’ve reached the stage where we have a lot of actionable data, and it’s important to know how not to screw it up from the beginning. And that’s one of the problems the authors highlight at the start. To be precise, there are four significant problems with data:

  1. Data breadlines for the data-poor — a lot of departments suffer because it’s hard to access the data;
  2. Data obscurity — there is no easy way to understand what kind of data is right for you;
  3. Data fragmentation problem — big and medium organizations tend to create a lot of separated data sets that are not aligned;
  4. Data brawls — different teams treat the same data in a variety of ways.

These problems were the same across different organizations, which is why we can find similar solutions all over the internet:
- Google with Sawzall and Dremel
- Facebook with HiPal
- AWS Redshift
and many others.

But you don’t have to be at that scale to become data-driven. There are some examples from SMBs:
- The RealReal uses reports on a daily basis to understand what’s going on with their operations and revenue and how they should align their marketing and merchandise plan to improve it.
- ThredUp, named one of America’s Most Promising Companies, has very complex operations that require processing and cataloging thousands of items every day, so a data-driven approach helps their teams meet their KPIs, understand customer engagement, and see general trends.
And there are more examples from Zendesk, Warby Parker, HubSpot, and DonorsChoose.

So what does it take to become a data-driven company? Usually, there are some basic steps:

  1. Ask the Engineer — every time your business team wants something they ask one of the developers
  2. Access Raw Data — your dev team creates a simple solution that allows exporting raw data (for example in CSV)
  3. Bring Your Own BI (BYOBI) — an in-house solution for solving data problems (sounds like a good solution, but still requires technical and business teams to work closely together)
  4. Data Fabric — next-level BI that helps the business team access data in a more or less convenient way (like Looker, RJMetrics, Chartio, etc.)
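To make step 2 more concrete, here is a minimal sketch of a raw-data export in Python. It assumes the data lives in a SQLite database; the `orders` table, its columns, and the `export_table_to_csv` helper are hypothetical names made up for illustration, not something from the book.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to the file-like object `out` as CSV."""
    # Table names cannot be bound as SQL parameters, so validate the name
    # against the schema before interpolating it into the query.
    known = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)  # one CSV line per database row


# Hypothetical in-memory data standing in for a production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 14.95), (2, "bob", 29.90)],
)

buffer = io.StringIO()
export_table_to_csv(conn, "orders", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

An export like this removes the "ask the engineer" bottleneck for simple requests, but everyone still has to interpret the raw columns on their own, which is part of why the later steps exist.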

But even the best tool is just a tool, and you should be very careful to avoid data biases. The most common one is “survivorship bias.”

During World War II, the statistician Abraham Wald took survivorship bias into his calculations when considering how to minimize bomber losses to enemy fire. Researchers from the Center for Naval Analyses had conducted a study of the damage done to aircraft that had returned from missions and had recommended that armor be added to the areas that showed the most damage. Wald noted that the study only considered the aircraft that had survived their missions — the bombers that had been shot down were not present for the damage assessment. The holes in the returning aircraft, then, represented areas where a bomber could take damage and still return home safely. Wald proposed that the Navy instead reinforce the areas where the returning aircraft were unscathed, since those were the areas that, if hit, would cause the plane to be lost. (Wikipedia)
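The effect Wald described can be reproduced with a small simulation. The sketch below is only an illustration: the aircraft sections, the per-hit loss probabilities in `LETHALITY`, and the number of hits per plane are invented assumptions, not historical figures.

```python
import random
from collections import Counter

# Hypothetical probability that a single hit in a section downs the plane.
LETHALITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.05, "wings": 0.05}
SECTIONS = list(LETHALITY)


def simulate(n_planes=10_000, hits_per_plane=3, seed=42):
    """Return (hits on all planes, hits on returning planes) per section."""
    rng = random.Random(seed)
    all_hits, survivor_hits = Counter(), Counter()
    for _ in range(n_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if none of its hits downs it.
        survived = all(rng.random() > LETHALITY[s] for s in hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


def share(counter):
    """Convert hit counts into each section's fraction of all hits."""
    total = sum(counter.values())
    return {s: counter[s] / total for s in SECTIONS}


all_hits, survivor_hits = simulate()
all_share, survivor_share = share(all_hits), share(survivor_hits)
for s in SECTIONS:
    print(f"{s:9s} all planes: {all_share[s]:.2f}   returning: {survivor_share[s]:.2f}")
```

Hits are spread evenly across sections on the full fleet, yet the returning planes show relatively few engine hits. Anyone who only inspects the survivors would conclude that the engine rarely gets hit, which is exactly the wrong place to skip the armor.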

There is an interesting case of fighting “survivorship bias” at Facebook.

One approach to avoiding this is to train your whole team in the same way, as is done at Zendesk:

  • SQL — people learn the most basic language for asking data questions
  • Data architecture — where to find all the different data sets
  • Data dictionary — a review of key metrics and their definitions
  • Case studies — accounts of previous problems and how they were solved
  • Basic statistical concepts
  • Storytelling with data — how to construct an argument with data and visualization
  • Actionability — determining whether this data analysis will result in a tangible change in the way the company operates

When these steps are done, it finally comes to the most interesting part — asking the right questions.

https://logianalytics.com/definitiveguidetoembedded/the-future-of-embedded-analytics/

And that’s what we can find between steps 2 and 3 in the Gartner data sophistication journey. This missing step is called exploratory data analysis, described by John W. Tukey almost 40 years ago. One of the most important things about exploratory data analysis is that it’s easy to tell how good your question is: just answer “What decisions would that analysis inform?” So, no matter what type of analysis you perform, a proper action should always follow it.
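As a small taste of what exploratory data analysis looks like in practice, here is a sketch of a five-number summary (minimum, quartiles, maximum) plus the common 1.5 × IQR outlier check. The `daily_orders` numbers are made up, and the quartiles come from Python's `statistics.quantiles`, which is close to but not identical to the hinges Tukey used.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical daily order counts; not real data.
daily_orders = [12, 15, 11, 14, 90, 13, 16, 12, 15]

low, q1, median, q3, high = five_number_summary(daily_orders)
iqr = q3 - q1  # interquartile range: spread of the middle half of the data

# Flag points that fall more than 1.5 * IQR outside the quartiles.
outliers = [v for v in daily_orders if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print("summary:", (low, q1, median, q3, high))
print("outliers:", outliers)
```

The summary by itself decides nothing. It becomes useful once the flagged day leads to a question that informs a decision, for example whether that spike came from a campaign worth repeating.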

And last but not least — you should know how to present your data, so it’s easy to understand, it’s not boring, and it’s actionable. I plan to cover this topic in the next book review about data visualization.

“Winning with data” is a great, interesting book that I recommend reading.

P.S.: This book has a lot of links to useful information, some of which I want to share:


Andrei Rebrov

CTO & co-founder @ Scentbird, YC alumni. Passionate about tech, space, sci-fi and video games. Live in NYC.