Where are my sales? — Part 1 Contd.

Syed Misbah
Data Decoded
Published in
5 min readMay 15, 2019

This article is a continuation of the first part of a 4-part series on sales driver analysis. Read the first part here.

I will break down this article into five parts for easy readability.

  1. Secondary drivers which affect sales
  2. Explaining and isolating your sales — Models and the math
  3. Understanding the waterfall chart
  4. Demand Forecasting using a driver model

Secondary drivers of sales

Competitor Landscape

Competition, while secondary, is a major driving factor of your sales. There are three forms of competition that you need to take care of.

  1. Inter Competition — Products from other manufacturers which are similar to yours
  2. Intra Portfolio — Your own products (of a different brand name maybe) which are cannibalizing your portfolio.
  3. Cross Retailer — Your products, sometimes priced differently across stores of different channels, might also impact the sales significantly. For e.g., sales of your candy bar in a small retailer might be affected because Walmart priced the same product at 50% OFF in the same week.

When building a model, price of your competitor products is typically considered as the “driver”. However, you may want to account for some other behavior of your competitors according to your use case. For e.g., if you’re aware that your competition fights you on media ads rather than price, you can choose to use your competitors’s media ad rating in your model.

Media and event advertisements

Some manufactures may opt to give media ads and/or hand out product samples. Hence, it becomes important to account for these things.

The only caveat while using data for ads/sampling is that the effect of these factors is felt in the long term. Additionally, the data for these is usually at a monthly/quarterly aggregated level, while sales are at a weekly level. These two caveats make it difficult for a mathematical model to capture their effects correctly.

Macro-economic factors

Macro-economic factors like unemployment, inflation, wages etc. may also affect sales. It depends on the target demographic of your product. For e.g., if you’re selling staple food items, they may not be affected drastically by a slowing economy. However, if you’re selling luxury items, it might be impacted by the same.

Weather and natural events

Weather also does slightly impact the sales. Again, this is dependent highly on the type of product you’re selling. A cold weather throughout the year will definitely decrease the sales of soft drinks; however, rice might not be affected much by the same. Hence, keep in kind your specific use case before using these factors in your model.

Explaining and isolating your sales

This is where the math comes in. Statistical models are used to break down the sales — the most common method used is some form of linear regression.

You can read about the technicalities of this model here. In short however, it tries to form an equation of this form:

Sales = Baseline + Effect of price + Effect of seasonality + Effect of distribution + ….

or to put it more mathematically

Units = β0 + (βPrice x Price) + (βSeasonality x Seasonality) + (βDistribution x Distribution) + ….

The β of the individual factors is what represents the effect of that factor.

Types of models

  1. Additive

An additive model assumes that the effect of each factor is independent of each other. Hence, this is the best model when it comes to explaining sales.

Unfortunately however, in the real world, effects are seldom independent. A 50% OFF given during a normal week will have less effect than a 50% OFF during a Christmas week. Hence, most of the times, we need a multiplicative model to be better able to predict sales.

2. Multiplicative

Because of the dependent effects of each factor and the peculiar relationship of price and unit sales, we need to transform prices and units to the logarithmic scale. This transformation converts a additive model shown above to a multiplicative one.

A multiplicative model is the most accurate model when it comes to predicting sales.

After you’ve created a statistical model using any of the two methods above, you get the effect of each factor as coefficients(β — Betas).

Understanding the waterfall chart

Sample Waterfall Chart

After we’ve isolated all the drivers and gotten coefficients from the statistical model, we can use a waterfall chart to understand what the effect of each driver is.

In the sample chart shown above, we can clearly see that the baseline is 6000 units — with (a lower ) price in that week contributing 4000 units, low structural demand of the week reducing it by 2000 units and so on.

The exact calculation of the individual contribution of each of the factors is tricky and requires some mathematical workarounds. We will be discussing this in a separate post at the end of this series.

Issues and workarounds with a waterfall chart

One of the major issues with the waterfall chart is that because it’s modeled using a multiplicative equation (Sales = Effect of Factor A x Effect of Factor B x Effect of Factor C), isolating effects is tricky and not a mathematically pure method. Additionally, if using a multiplicative model, two different week waterfalls cannot be compared because of the way the waterfalls are calculated.

As a workaround, we can use a additive model(Sales = Effect of Factor A +Effect of Factor B +Effect of Factor C) separately just for waterfalls. As explained earlier, this model will have a low accuracy compared to a multiplicative model, but very high(and accurate) explain-ability.

Demand Forecasting

Historically, demand forecasting is seen only from a time dimension — with the assumption that your strategies/tactics remain the same in the future.

However, CPG manufacturers have a slightly different perspective. They want to understand and forecast demand — in the light of changing strategies. For e.g. in 2018, Product A had 13 weeks of 50% OFF offer, while 39 weeks of no offer. They want to understand how the demand would look like if they were to change their strategy to 26 weeks @ 50% OFF and 26 weeks of no offer.

Basically, they want a combination of a demand forecasting framework combined with a scenario planner.

A possible approach

One way of doing this is to forecast the levers you want to control feed it to the driver model as input. Sounds greek? Let’s break it down into steps

  1. Forecast/Set the expected values of the driver you want to influence. For e.g., let’s say you want to change your pricing strategy in 2020. In this case, you can set the prices you want/expect for the 52(n) weeks.
  2. Using the equation from the statistical model(preferably multiplicative-because of higher accuracy), you can predict the sales of your product for next 52(n) weeks

This is one way of approaching the lever-effect demand forecasting problem. However, keep in mind that if you just want to forecast — without influencing levers, you should go ahead with a time series based model, which is likely to be more accurate and robust.

In the second part, we will be covering the data treatment and variable design aspect in a typical sales driver analysis pipeline.

Please do let me know your thoughts, questions or suggestions in the comment section below.

--

--