Big Data: Separating Hype from Reality

An interview with Matt Kuperholz about why so many people get big data wrong

One of the things I like most about being a part of Future Crunch is that I get to meet a lot of very smart people. There are plenty of them around in our hometown of Melbourne. Its unique mix of creative communities, economic prosperity and liveability makes it a pretty attractive destination for people who want to work on things that are new and interesting. One of those people is Matt Kuperholz, who I was introduced to through my housemate towards the end of last year. Matt is a partner at PwC Australia, and in charge of their data analytics team. It didn’t take long for the two of us to start talking about digital culture; at the time I was reading Jaron Lanier’s Who Owns the Future, and was doing a lot of thinking about open-source movements and digital currencies. I was immediately impressed by Matt’s ability to get to the heart of these issues. He has the rare ability to talk about the inner workings of the internet in a way that is accessible to a non-technical person like myself, without dumbing it down or losing the complexity.

I wonder if you could tell me a little more about your background?

I was born in 1972 in South Africa, and have been into technology for as long as I can remember. I was one of those children who took a toaster to bed instead of a teddy bear. I’ve been into computers from an early age, so my decision to work in the field was a deliberate choice. Computers weren’t everywhere in the 1970s like they are now, but I knew they were going to matter. I got a scholarship to become an actuary (because that’s what you did back then if you were good at maths) and studied at the University of Melbourne. But computer science was my passion, and it didn’t take me long to see that all the actuarial stuff I was doing could be made better with a higher degree of integration with technology.

So you were in the startup world — how did you end up working for larger businesses?

The beauty of being in a startup was that I was exposed to lots of different industries. Right from day one we were talking to potential clients about the value of an asset they all had: data. And of course the value of that asset changes depending on what you do with it, how you run it, how you twist it. I took those lessons and launched my own consulting business to teach companies how to manage their data. I did that from 2001 to 2011, during which time my main client was a top tier global consulting firm. They were ahead of most competitors because they took a bet on data early, and I helped them build a data mining practice that was the first of its kind in Australia and then went on to be a global leader. Right now I’m working with PwC, helping them do something similar but better.

OK, so what is big data?

A working definition is that it’s more information than you can handle with traditional approaches, which means you have to do something different. It’s not necessarily new though. For example, I helped a large retailer look at every advertisement in their catalogues back in the 1990s and align them all with their advertising revenues, and then with every single line item of every single sale over time. Does that mean we were doing big data 20 years ago? Another example: what about airlines in the 1970s and 1980s? They had to take data from 40 disparate systems to talk about the entire customer journey, from their marketing to ticket purchases, to check-in and then the flights themselves. That counts as big data by almost any definition you care to use. This stuff is not new.

Does that mean there’s a gap between what people say big data is and what it actually is?

Big data is mostly a marketer’s term. I’d love it if people dropped the term big data and just used analytics. People are drowning in data and aren’t sure what to do with it, and this has been the case for most of this century. For example, let’s say you’re a big manufacturing company with thousands of sensors installed in your factories that are taking readings. That’s easy these days because sensors are cheap. But even if you crunch all the data and use it to produce reports, you’re still not using it to extract maximum value; you’re just looking backwards. One of data’s big selling points is that you can use it to produce real-time information. But you have to ask yourself “how many organisations actually need that?” Sure, it’s useful for border patrol security, or air traffic controllers, or Amazon. But most ASX500 companies don’t need that much real-time information. They just need enough information to make decisions about what they’re going to do in a week’s or a month’s time.

It’s not just companies that are doing big data though, right?

Once you get out of the private sector it changes a little. Some parts of government are actually pretty good at big data because they need to be. The USA and Israel, for example, have huge security apparatuses that depend on getting their data analytics right. In some of the more classified areas it’s really bleeding edge stuff: things like real-time facial recognition, telephone conversation scanning or other intelligence flows. The reality is that private sector companies don’t need to track millions of entities across a wide variety of sectors like the government does. They just need to analyse one sector. And right now they don’t need to be perfect, they just need to be one step ahead of their competitors. Governments have a duty to protect and serve their citizens, whereas the private sector just needs to make sure it’s profitable.

Where do you see big data (sorry, analytics) headed in the future?

The IoT is going to be massive. Connected devices are going to create an even steeper curve. And they might force us into federated data models whereby you don’t own the data, you just have access to it. In my mind that’s really interesting for the greater good of society. Right now data is regarded as an asset, so if you’re in a commercial environment you don’t share it. The concept of federated data is that it’s semi-private: all the owners put their data into a common pool that everyone gets to use. A nice example of this is Sense-T in Tasmania, which is being used to monitor everything from oysters to air quality.

What’s the flipside?

As we’re now all too aware, we might end up throwing civil liberties out the window. Facial recognition is a technology that’s going to be ubiquitous in the future. We’re going to end up in a world where you really cannot hide. Also, are we aware of what all this electromagnetic radiation is doing? Will optimisation increase the disparity between rich and poor? What about warfare? History says we invent stuff and then make weapons out of it. And what about machines becoming self-aware? Like anything, this new technology has the power to bring people together but also to drive people apart. What if data is the currency that’s going to drive a wedge between us all?

So why should people care?

Any discussion of the future of big data reminds me of Hofstadter’s Law: everything always takes longer than you think, even when you factor in Hofstadter’s Law. In the very long term I think data is key to our evolution. Remember, we’ve already transcended space and time. We can create physical objects at a scale of nanometres; we can’t see anything at that level with the naked eye and yet there are things down there that we’ve created with intentional design. And thanks to our machines we can now think in gigahertz. We’re able to process thoughts or questions far faster than was physically possible just a few generations ago. Our brains operate in a space from millimetres to kilometres, and in time from a fraction of a second to a lifetime. But our machines create things at the atomic level, and operate in microseconds.

--

Angus Hervey
Future Crunch

From Melbourne and Cape Town, with love. Political economist and journalist, and co-founder of futurecrun.ch