Why are we stuck in a ‘slice-and-dice’ mindset in analytics?

Kirill Shmidt
Published in Wrike TechClub
May 26, 2022

One of the most popular analytics methods is slice-and-dice: take some metrics and break them down by category. It is so popular that people often mistake it for the analytics process itself. A typical example: look at retention or LTV by channel.

This is a really good method for solving simple problems. But, as often happens, many of the problems that are solved by this method have long been solved by other means.

But when the problem is a little more complex, the company and its analysts may come to a methodological dead end. For example, to answer the question “Which step in the product affects conversion?”, people reach for the familiar slice-and-dice method, only to find that it gives them no clear answers.

With slice-and-dice, it is hard to formulate anything: it’s not clear how to classify behavior into categories, how to account for the influence of past steps, or how to combine different factors with each other. As soon as you split a metric by eight yes/no attributes, you get 2^8 = 256 segments. Good luck looking at them in Tableau and searching for “insight”. You get the feeling that, to find an answer, you need a different approach, one that draws conclusions in bulk.
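To make the 256 figure concrete, here is a minimal Python sketch. It assumes the eight "categories" are eight yes/no attributes that are all crossed with one another. The attribute names and the user count are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical yes/no attributes of a user; the names are illustrative only.
ATTRIBUTES = [
    "paid_channel", "mobile", "invited_teammate", "used_template",
    "finished_onboarding", "enterprise_plan", "returning_user", "has_integration",
]


def count_segments(n_attributes: int, levels_per_attribute: int = 2) -> int:
    """Number of distinct slices when every attribute is crossed with every other."""
    return levels_per_attribute ** n_attributes


# Enumerate every combination explicitly to confirm the closed-form count.
segments = list(product([False, True], repeat=len(ATTRIBUTES)))

print(len(segments))       # 256 combinations for 8 binary attributes
print(count_segments(10))  # 1024 combinations once you add two more

# Assumed sample size: with 50,000 users, the average segment is already thin.
users = 50_000
print(users / len(segments))  # about 195 users per segment
```

The count grows exponentially with the number of attributes, while the data per segment shrinks just as fast. That is why reading every slice by eye stops being practical so quickly.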

How could one break any set of metrics into 1000 categories and find the right answers? How could you map the customer journey and find out which steps are important and which are not?

As soon as you ask such questions, you are immediately faced with the mother of all sciences: mathematics. Unfortunately, analysts do not often recognize that their problems can be solved with mathematics. I hear a lot of talk about AB tests, Tableau, and Python, but very little about regression, clustering, and causal inference.

There seems to be a feeling that the ultimate state-of-the-art solution is AB testing: running millions of tests for every possible scenario, with everyone trying to find answers to their questions. Meanwhile, a powerful method for finding dependencies, the natural experiment, is gathering dust. If you don’t have millions of observations and an AB test machine, you’re in slice-and-dice territory, guessing at answers to your difficult questions. And even if you do move to the AB test machine, it turns out that the problems have to be solved by brute force and enumeration of hypotheses.

When you have the ability to run relatively cheap tests, that’s good. You can make mistakes along the way and learn how to be more efficient. But if you’re wondering why these methods don’t work, here’s a tip: play Kerbal Space Program. This space flight simulation game lets you create and manage your own space program: building spacecraft, flying them, and helping your Kerbals conquer space. Run it and see what happens: you will reach orbit relatively easily. But flying to the in-game Mars will not work. You will find that you have a lot of hypotheses about what can be improved, but for some reason they have little effect on the result. You find that you don’t have enough propellant in orbit to get to Mars, so you increase the amount of fuel at the start. Then you find that the rocket cannot take off. You change engines, get to orbit, and find yourself low on fuel anyway.

Why? Because you do not understand the laws that connect your plans to your results. You seem to understand the direction, but completely misunderstand the scale. The problem is that the law is not linear and there are many factors to take into account; you need to be familiar with celestial mechanics.
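One well-known example of such a non-linear law is the ideal (Tsiolkovsky) rocket equation. The article does not name it, so treat the Python sketch below as an illustration rather than the author's own model. The rocket (10 t of structure and payload, 300 s specific impulse, 400 kN of thrust) is hypothetical, and the sketch ignores staging, drag, gravity losses, and the mass of extra tanks.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2


def delta_v(isp_s: float, dry_mass_t: float, fuel_mass_t: float) -> float:
    """Ideal rocket equation: delta-v = Isp * g0 * ln(start mass / end mass)."""
    wet_mass_t = dry_mass_t + fuel_mass_t
    return isp_s * G0 * math.log(wet_mass_t / dry_mass_t)


def thrust_to_weight(thrust_kn: float, dry_mass_t: float, fuel_mass_t: float) -> float:
    """Lift-off thrust-to-weight ratio; below 1 the rocket stays on the pad."""
    weight_kn = (dry_mass_t + fuel_mass_t) * G0  # tonnes * m/s^2 = kN
    return thrust_kn / weight_kn


# Hypothetical rocket: keep doubling the fuel and watch both numbers.
for fuel_t in (10, 20, 40, 80):
    dv = delta_v(isp_s=300, dry_mass_t=10, fuel_mass_t=fuel_t)
    twr = thrust_to_weight(thrust_kn=400, dry_mass_t=10, fuel_mass_t=fuel_t)
    print(f"fuel={fuel_t:3d} t  delta-v={dv:7.0f} m/s  TWR={twr:4.2f}")
```

Because delta-v grows with the logarithm of the mass ratio, each doubling of fuel adds less than the previous one. Meanwhile the thrust-to-weight ratio keeps falling until it drops below 1 and the rocket can no longer lift off. The direction of the change was right; the scale was not.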

In this sense, running a million AB tests is the same thing. You can test, but how do you find hypotheses? By instinct? By talking to 10 clients? How do you know before an AB test whether it can succeed?

But what else can we do? Nowadays, data science is focused on building accurate prediction models. We seem to have forgotten that models are also a method for describing observations and dependencies. Regression is exactly such a model: it explicitly gives you the significance of the factors used. We as analysts need to establish the “laws” of the firm, just as scientists establish the “laws of nature.”
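As a rough illustration of regression reporting the significance of each factor, here is a minimal ordinary least squares sketch in plain Python. Everything in it is an assumption made for the example: the data is synthetic, `step_a` and `step_b` are hypothetical product steps, and a linear probability model stands in for the logistic regression one would more commonly fit to a binary outcome.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix with Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X, y):
    """Ordinary least squares; returns (coefficients, t_statistics)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta, [b / s for b, s in zip(beta, se)]


# Synthetic users: by construction only step_a changes the conversion rate.
random.seed(0)
X, y = [], []
for _ in range(5000):
    step_a = random.random() < 0.5
    step_b = random.random() < 0.5
    p_convert = 0.10 + 0.20 * step_a  # step_b has no effect
    X.append([1.0, float(step_a), float(step_b)])
    y.append(1.0 if random.random() < p_convert else 0.0)

beta, t_stats = ols(X, y)
for name, b, t in zip(["intercept", "step_a", "step_b"], beta, t_stats):
    print(f"{name:10s} coef={b:+.3f}  t={t:+.1f}")
```

The coefficient on `step_a` should land near the 0.20 built into the data with a large t-statistic, while the coefficient on `step_b` should sit near zero. On real observational data such coefficients describe associations. Reading them as causal effects needs the extra care that causal inference methods provide.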

The importance of Newton’s equations does not lie in the fact that you will reach orbit with the parameters you entered, but in the fact that you understand which parameters determine the orbit. In this sense, super-complex neural networks* do not help us, because they do not give us laws; they give us predictions (*neural networks can be decomposed into factors, but that is a story for another day).

If you want to drive change through analytics and data, you need to know your “laws”. Only then can you understand what you need to change to reach your goals.

This is the true state-of-the-art solution: an analytics department that has discovered the “laws” of your product.

Photo by Campaign Creators on Unsplash
