Art of Analytics Modelling

Eram Khan
Published in Nerd For Tech
3 min read · May 5, 2024

In the modern world, almost all companies have access to massive amounts of data. Very rarely is there a genuine lack of data to solve a problem. More often, the real problem is the inability to turn that data into usable information. According to surveys by BaseOne and Oracle, only 20% of companies feel they have fully utilized the potential of data analytics, while 82% feel paralyzed in their decision-making process due to data complexity.

My experience as a data analyst has taught me that it is often possible to represent a problem statement in a mathematical model. Analytics modeling is all about converting a real-world problem to a mathematical equation. It is as much an art as a science.

Before writing a single line of code, it is essential to have absolute clarity on the following:

  • What is the current scenario?
  • What do we want to ultimately achieve?

For example, let’s consider a relatively new stockbroking app called StockBuy. It is generating leads, but its conversion rate (leads turning into paying customers) is lower than expected. The team suspects that the marketing budget is spread too thin across channels, leading to inefficiencies.

In the situation above, the current scenario is that the ROI of marketing is lower than expected. We want to achieve maximum conversion at the minimum marketing cost. The gap here is that StockBuy is not sure which mix of channels to select. We are looking to optimise budget allocation across different marketing channels to achieve the highest conversion.

Step 1: Represent Real-Life Situation in Math

Total customer conversion is a function of marketing budget allocation across all channels. Organic growth is also a factor; however, it should remain similar across different budget allocation options and can therefore be ignored for this discussion.

Total Conversion = f(budget allocated to TV Ads, Social Media Platforms, Digital Banners, etc.)

Step 2: Analyze the Math

This problem can be converted into a multiple linear regression model that estimates how spend on each channel affects conversion, which in turn guides how to distribute the budget across these channels.

Here, we have:

Dependent Variable y: Total Conversion (Customers Acquired / Total Cost)

Independent Variables:

X1: Spend on TV Ads

X2: Spend on Social Media Platforms

X3: Spend on Digital Banners
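To make the regression concrete, here is a minimal sketch that fits y = b0 + b1·X1 + b2·X2 + b3·X3 by ordinary least squares with NumPy. Everything in it is an assumption for illustration: the weekly spend figures are synthetic, the "true" effects exist only to generate data, and y is simplified to the number of customers acquired. None of it is real StockBuy data.

```python
import numpy as np

# Synthetic weekly spend per channel (hypothetical figures, not StockBuy data).
rng = np.random.default_rng(seed=0)
n_weeks = 52
spend = rng.uniform(low=10.0, high=100.0, size=(n_weeks, 3))  # columns: TV, social, banners

# Assumed "true" effects, used only to generate the synthetic conversions.
true_intercept = 50.0
true_coefs = np.array([0.5, 2.0, 1.0])
noise = rng.normal(loc=0.0, scale=1.0, size=n_weeks)
conversions = true_intercept + spend @ true_coefs + noise

# Fit y = b0 + b1*X1 + b2*X2 + b3*X3 by ordinary least squares.
design = np.column_stack([np.ones(n_weeks), spend])  # leading column of ones for b0
estimates, *_ = np.linalg.lstsq(design, conversions, rcond=None)
intercept, coefs = estimates[0], estimates[1:]

for name, coef in zip(["TV Ads", "Social Media Platforms", "Digital Banners"], coefs):
    print(f"{name}: {coef:.2f} extra conversions per unit of spend")
```

With enough data and modest noise, the estimated coefficients should land close to the effects used to generate the data. On real marketing data the fit is rarely this clean, so it is worth checking the residuals before trusting the coefficients.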

Step 3: Turn Math into Real Life

The actual budget allocation can now be decided based on the coefficients determined by the model. Each coefficient estimates the change in conversion for one additional unit of spend on that channel, holding the other channels constant, so a higher coefficient indicates a channel that returns more per unit spent. This is a simplified example; in reality, budget optimisation may involve multiple stages such as cross-channel marketing, customer segmentation, and vendor pricing.
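One simple way to turn coefficients into a budget is to split the total in proportion to each channel's positive coefficient. The sketch below is only a heuristic under stated assumptions, not a prescribed method: the function name `allocate_budget`, the coefficient values, and the budget figure are all hypothetical. A strictly linear model would actually put the entire budget into the single highest-coefficient channel, so a proportional split is a deliberately conservative choice that keeps some spend on every useful channel.

```python
def allocate_budget(total_budget, coefficients):
    """Split total_budget across channels in proportion to positive coefficients.

    Channels with a zero or negative estimated effect receive nothing.
    """
    positive = {channel: max(coef, 0.0) for channel, coef in coefficients.items()}
    weight_sum = sum(positive.values())
    if weight_sum == 0:
        raise ValueError("No channel has a positive estimated effect.")
    return {channel: total_budget * weight / weight_sum
            for channel, weight in positive.items()}


# Hypothetical coefficients, as if taken from a fitted regression model.
example_coefs = {"TV Ads": 0.5, "Social Media Platforms": 2.0, "Digital Banners": 1.0}
plan = allocate_budget(100_000, example_coefs)

for channel, amount in plan.items():
    print(f"{channel}: {amount:,.2f}")
```

In practice, channels saturate as spend increases, so the coefficients are only a reasonable guide within the range of spend seen in the data.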

The best part about data science is that there is more than one right answer, and you can build solutions of varying complexity based on time and resources. Model selection is a balancing act! There is no one-size-fits-all approach. However, there are some rules of thumb that can help.

Here is some information on popular models.

The flowchart below will guide you through selecting an appropriate model based on the type of problem you’re trying to solve.

  • Are you looking to classify data into certain categories? → Classification Models
  • Are you looking to make predictions or understand the impact of certain factors on continuous variables? → Regression Models

Note: This flowchart is a basic guide. Other factors like data size, computational resources, and desired accuracy can also influence model selection.
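The two questions in the flowchart can be sketched as a small helper that looks at the target variable. This is an illustrative heuristic only: the function name `suggest_model_family` and the `max_categories` threshold are hypothetical, and a real decision should also weigh the factors mentioned in the note above.

```python
from numbers import Real


def suggest_model_family(target_values, max_categories=10):
    """Suggest 'classification' or 'regression' from the target's values.

    Heuristic: non-numeric targets, or numeric targets with only a few
    distinct values, are treated as categories; everything else is
    treated as a continuous variable.
    """
    values = list(target_values)
    all_numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not all_numeric or len(set(values)) <= max_categories:
        return "classification"
    return "regression"


# Hypothetical targets: customer status labels vs. a continuous metric.
print(suggest_model_family(["converted", "lost", "converted"]))
print(suggest_model_family([i * 0.37 for i in range(50)]))
```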

Additional Options:

  • Bayesian Regression: A probabilistic approach to regression that can incorporate prior knowledge.
  • Neural Networks & Deep Learning: Powerful for complex relationships but can be computationally expensive and less interpretable.
  • Reinforcement Learning: Used for training models to make decisions in an environment through trial and error.
