Data Analysts Need to Turn Up the Heat

Keegan O'Shea
5 min read · Mar 12, 2018

I recently watched John Scheinfeld’s 2017 documentary about jazz legend John Coltrane. In it, famed philosopher Cornel West claimed that:

“John Coltrane was ahead of his time. I don’t think that his music is a thermometer, it’s a thermostat. See, a thermometer just reflects the climate, a thermostat shapes the climate”.

Coltrane’s seminal 1965 album A Love Supreme was a jazz record so different that it spurred the creation of ‘Spiritual Jazz’, a genre that is still shaping the sound of modern-day innovators like Kamasi Washington and Kendrick Lamar. It changed music.

John Coltrane

It was such a great analogy, and it got me thinking about how it would apply to my other interests. Inevitably, my attention was drawn to my field: data. Data means a lot of things to a lot of people: reporting metrics, personal information, the ‘raw stuff’ needed to build machine learning algorithms, and so much more. But when we talk about data, it’s typically in the context of what it’s measuring. We need to start talking about what it does.

Humans are collecting data in ever-growing volumes, measuring nearly every aspect of daily life. But data’s utility is directly connected to its ability to influence an outcome.

We’re getting very, very good at using data in software. Software automates and optimises repeatable flows of information for very specific outcomes. This is Artificial Narrow Intelligence, and it uses data to do very specific things very well. Looking at a single trip in an Uber, we can frame the language of data in two ways: in terms of the data points passed between parties, or in terms of the outcomes they influence.

A typical Uber experience

Option 1 ::: Datapoints
When you open the Uber app, data about available drivers appears. Data is crunched to determine how long it will take for a driver to get to you, given the number of drivers and other people in the area also looking for a car. When you book the car, its position, the license plate, the name of the driver, the driver’s rating, and several other details pop up on your screen. When you get in the car, you can share your location with others, you get an estimated time of arrival, as well as a recommended path. When you end the trip, you get a bill itemising your trip, and you get to submit even more data to rate your driver.

Option 2 ::: Decisions
When you open the app, you can find out if an Uber will arrive in an acceptable timeframe. When you book the car, you’ll learn when and how to find the driver. When you get in the car, you’ll know if you’ll make your destination in time. When you end the trip, you decide if a trip was worth the price, and make sure future passengers know how good the driver is.

The data points are only relevant in the context of the decisions. There’s a lot of data in the back-end that the people at Uber aren’t showing you, because it doesn’t add to the user’s view of the journey. Large tech businesses rely on exabytes of data to influence users, but may only surface one or two data points. They don’t let users see the thermometers, they just turn up the temperature. This is particularly true in online advertising, where billions of dollars are spent to make sure you see those shoes exactly when you need them.

Non-tech businesses have progressively built their automation capability. Data Science teams work with large datasets to optimise customer engagement. Manufacturing businesses use data signals from machines to predict and prevent failure, and HR teams apply text analysis to surveys to identify high performers. The list goes on, but each of these involves taking data and solving a very narrow, very well defined problem. That is: do we turn on the thermostat, or leave it off?

If nothing changes as a result of these processes, they’re ultimately pointless.

When the outcome is not well defined, data fails to maximise its utility. One area that highlights this painfully well is Data Insights.

A data analyst with his captivated audience.

Data Insights

When it comes to data, many organisations are filled with people hungry to get value out of available data. Business people have seen the tech giants use data at unprecedented scale to disrupt industries, and want to get in on the action. Capable Insights and Analytics teams earnestly work with these business people to solve these problems. Unfortunately, they’re not always tackling the problem in the right way.

It seems straightforward. I have a thing, I have lots of data about the thing, therefore someone should tell me something useful about the thing. But the issue here is — nobody’s highlighted a problem. Imagine this approach applied to the Uber example — “tell me something about the Ubers in the area” could result in a lot of data points being thrown at a user, but people don’t care about the data, they care about the problem it solves. “I need to get out of this date, how much longer do I have to hear about this guy’s Bitcoin investment?” is a problem with a mercifully clearer solution.

Analysts often fall down the rabbit hole, answering every possible question. When it comes time to discuss the analysis, they’re often met with “that’s all interesting, but here’s my big problem right now”. Data analysis as a vehicle for satisfying mild curiosity is counterproductive, as it perpetuates the myth that data is a ‘nice to have’ when making big decisions.

“20 years of schoolin’ and they put you on the day shift…”

The very same week A Love Supreme was released, Bob Dylan recorded Subterranean Homesick Blues, featuring the classic line “You don’t need a weatherman to know which way the wind blows”. This often applies to data insights. An analyst may collect data for a dashboard or a presentation about a subject, but without a defined problem it’s just a bunch of measurements the audience is probably already aware of.

To resolve this problem, analysts need to work much harder on showing others how to ask questions. It’s up to the analyst, because they’re ultimately the person who’ll need to work with the data. And if they don’t encourage a narrower line of thinking, they’ll be doomed to repeat the same mistakes: someone will ask a broad question, the analyst will attempt to answer it, and they’ll be hit with the age-old question, “So what?”

Nobody has time for a bunch of charts, tables, and numbers unless those tell the audience something useful, something that will influence an outcome. Something that will turn up the heat.
