Why big data will always need humans

So apparently big data is the new holy grail. As human beings start to collect data for just about everything from their heart rate to their spending habits, businesses track every single customer interaction whether online or offline, and governments continue to open up the vast reams of data they’ve collected over the years, it’s safe to say there isn’t a shortage of the stuff.

The growing adoption by your average human of hardware that’s capable of collecting data, be it health and wellness or even air quality and pollution, means obtaining data is no longer a problem. Sure, it’s not in the hands of billions of people, but what we are starting to see is exceptional concentrations of this technology in the cities of developed countries, collecting data to power the businesses of today and tomorrow.

The Jawbone UP3 is one of the most comprehensive trackers on the market

The real challenge with all this data is actually doing something with it. And that’s the part I think many companies are struggling to comprehend. Data is not knowledge, and data is not decisions. Which is where human beings come in.

But that’s the thing with humans — we seem to think technology is the answer, when the reality is it’s just our tool to ask questions. Technology will never possess genuine curiosity, and will never truly empathize with the human condition. Which is why human beings are such an important part of big data, and always will be.

For that reason, I don’t believe businesses should be working too hard right now to deliver insights based on what they have. I just don’t think many businesses have the data, the technology or the people to make it valuable or scalable. For context, when I think of insights, I think of a sequence of 3 questions:

What happened?

Why did it happen?

Therefore what?

As of today, technology is only capable of answering (without human support) the initial ‘what’, although the disparity of the technology, in particular hardware, means the consistency and quality of this data leave much to be desired. However, I do not believe technology is even close to helping us answer the ‘why’ and the ‘therefore’. In both instances human beings are required to spend time with this data in order to answer those questions. It just so happens that those human beings who are capable of mining genuine insights from data are anything but a dime a dozen (and command salaries involving several million dimes, I might add).

That said, even if you are lucky enough to snag yourself a world class data scientist, I really think we are a long way away from deducing radical insights from this data, for a few reasons. First is the vast amount of useless data. Just because you know something, it doesn’t mean it’s something worth knowing. We’re collecting a lot of data just because we can, but is it really useful? It is definitely a case of quality over quantity, but the reality is you need significant volumes of that quality data to make it worthwhile. And that doesn’t happen overnight. To boot, people have not yet formed lasting habits around this technology. Do you know anyone who has really stuck with their Jawbone UP, or with manual calorie counting, for example? Until the collection of this data becomes ingrained in who we are as people, it will remain unreliable.

Second is our old friend the ‘P’ word: privacy. The open data revolution is well underway, but that doesn’t mean privacy is forgotten. I’m actually of the controversial opinion that privacy will really not matter over time. Legislation will change to empower people to take control of their data, and opting out will always be an option, but I truly believe that as human beings start to see the value of sharing their data, they will start to care a lot less. Now that’s still a long way away, and in the meantime human beings are working hard to keep their data on lockdown, meaning the quality data is relatively inaccessible.

Now I don’t believe that no-one is bringing the human touch to data. In fact there is one company I believe has the data, technology and people to enable them to do this well. And that company is Foursquare. Over the past 6 years, it is fair to say Foursquare has had a turbulent journey. Whether it’s unbundling their product, or raising a down round, they’ve never been far from the headlines. But one thing they have consistently been able to do is build vast quantities of quality data.

Image credit: Adweek

And much of that is thanks to some proprietary tech they created called Pilgrim. This technology has built a phenomenally rich dataset related to where people are, when they’re moving, and when they are stationary. When you pair this data with data on the individual (what they like and don’t like based on what they’ve explicitly told Foursquare, or even just places they’ve been), you realize you’re dealing with the most powerful local recommendations tool on the planet. But it is how these recommendations are delivered that, for me, makes it so special.

  • They offer options: Human beings don’t like to be told what to do, and they like to feel like they’re making their own decisions. Giving overly definitive direction might not land so well. Foursquare gives options when you enter an area with places you might like, based on where you’ve been previously. I like to be able to decide for myself.
  • Always timely: I don’t know how they do this, but the timing of these recommendations is always impeccable. Many a time I’ve sat down for dinner at a new spot, and within a few minutes I get a push giving me a recommendation or two for what to have off the menu. Five minutes later and that recommendation is redundant. Timing isn’t easy, but Foursquare have it down to an art form.
  • It feels human: If I feel like it’s a computer talking to me, I tend not to listen. A push notification isn’t exactly significant real estate in which to play with copy, however Foursquare manages to make it feel like a human being is telling me what’s hot. Or in some instances, it actually uses a human being’s tip in the notification itself.
  • They’ve started simple: Foursquare’s data is pretty complex as it is, but it doesn’t try to overdo it. Personal preferences and location data, that’s it. It isn’t trying to convince you that because it’s a sunny day you should walk 3 miles for a good sandwich (although I’d be surprised if this isn’t something we see in the future). You’d be surprised by the insights you can mine from just the simple stuff.

Now it’s safe to say a bad Foursquare recommendation isn’t going to kill anyone. However, in the world of health and wellness tracking, I’d say that prospect is a little more real. Will tech ever really know enough about the human being it is measuring in order to provide legitimate insights on how to improve their health? Right now, almost certainly not. In fact, I think it can be quite dangerous.

And that’s why, for me, human-driven insights are so critical to capitalizing on big data. The curiosity and subjectivity of human beings are what make knowledge so powerful, and when you remove them from the equation, true personalisation and context go out the window.

This reinforces that, in their current form, insights driven by big data are not a scalable solution. The human component will always limit your capability. Nevertheless, building a solid data science outfit and pairing it with investment in artificial intelligence means a future where computers can answer the what, why and therefore questions is inevitable. But they will always need a little help from a human or two.