Machine Learning for Sales Forecast

Nishant Rastogi
Published in Firing Neurons
5 min read · Jul 29, 2019

Sales forecasting is about estimating sales that are yet to happen. Accurate sales forecasts are essential for the success of any business. When done correctly, a sales forecast can be used as an input to optimize several other business functions such as manufacturing, inventory, supply chain and finance. The process involves looking at data from the past and making predictions about sales in the future.

While it appears simple on the surface, it really isn't. The circumstances that produced certain sales in the past will not necessarily stay the same in the future. It can be even more challenging if there is no historical data available, for example when forecasting sales for a new product lineup. In such cases, we use the sales of similar products, or of products in the same category, as a reference for predicting the sales of the new product. Another problem can be missing data or the presence of unreasonable values in past sales. Statistically, the latter are called outliers, and they are fairly common in realistic scenarios because hey, nothing always goes as planned.

In machine learning, the traditional way of looking at sales forecasting is as a time series problem, with trend, seasonality and randomness being the main components. Trend is the general direction in which something is heading, as when we say that the price of a certain stock is trending upwards or downwards. An upward or positive trend doesn't necessarily mean a daily increase in value; it describes the direction on average. Seasonality is about predictable changes that occur at regular intervals or in specific seasons. A change is considered seasonal when it repeats at a fixed interval of time. The last key component of a time series is randomness, which covers all the unpredictable changes. These come from external factors that affect the outcome. Since it is not practically possible to take every such factor into consideration while making predictions, they collectively contribute to the randomness.
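
To make the three components concrete, here is a minimal, dependency-free sketch of a classical additive decomposition. It assumes the series can be written as trend + seasonality + randomness, estimates the trend with a centered moving average, and takes the seasonal effect as the average detrended value at each position in the cycle. The function name `decompose_additive` and the quarterly numbers are made up for illustration; they are not from any real dataset or library.

```python
from statistics import mean


def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full seasonal cycles")
    half = period // 2

    # Trend: centered moving average over one full cycle.
    # For an even period the two end points of the window get half weight.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Seasonality: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) for b in buckets]
    centre = mean(raw)
    pattern = [s - centre for s in raw]  # shift so the pattern averages to zero
    seasonal = [pattern[t % period] for t in range(n)]

    # Randomness: whatever trend and seasonality do not explain.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Hypothetical quarterly sales: a steady upward trend plus a repeating yearly pattern.
quarterly_pattern = [10, -5, -10, 5]
sales = [100 + 2 * t + quarterly_pattern[t % 4] for t in range(12)]

trend, seasonal, residual = decompose_additive(sales, period=4)
print("trend:   ", trend)
print("seasonal:", seasonal[:4])
print("residual:", residual)
```

The trend is undefined (`None`) for the first and last half cycle because the moving average needs a full window on both sides. Real sales data would leave a non-zero residual; the toy series here has no noise, so the randomness part comes out as zero.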

The simplest way of using time series for sales prediction is a univariate time series analysis, where the only information required is the date of each sale and the sale itself. It is called univariate because there is only one time-dependent variable. While a univariate time series is not highly accurate, as it does not consider other factors that could be affecting sales, it is a quick way to get rough estimates. It is usually helpful when several years of historical data are available, which makes it likely that the events affecting sales have repeated several times in that time frame, so their essence can be captured through trend, seasonality and randomness.
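
As a rough illustration of how little a univariate forecast needs, the sketch below uses simple exponential smoothing on nothing but (date, sale) pairs. This is a swapped-in example technique, chosen only because it is short; it ignores trend and seasonality, which Holt-Winters style models add on top of the same idea. The function name, the `alpha` value and the monthly figures are all hypothetical.

```python
from datetime import date


def exponential_smoothing_forecast(sales, alpha=0.3):
    """Return the one-step-ahead forecast from simple exponential smoothing.

    `sales` must be in time order. `alpha` controls how quickly older
    observations are forgotten (closer to 1 means faster).
    """
    if not sales:
        raise ValueError("sales must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = sales[0]
    for value in sales[1:]:
        # New level = weighted mix of the latest sale and the previous level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales, deliberately out of order to show the sort step.
records = [
    (date(2019, 3, 1), 120),
    (date(2019, 1, 1), 100),
    (date(2019, 2, 1), 110),
]
records.sort(key=lambda r: r[0])          # the only use of the date: ordering
sales = [amount for _, amount in records]

forecast = exponential_smoothing_forecast(sales, alpha=0.5)
print("forecast for the next month:", forecast)
```

Note that the dates are used only to put the sales in order, which is exactly what makes the analysis univariate.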

A multivariate time series analysis has more than one time-dependent variable. The core belief behind it is that sales at any given point in time depend not only on past sales, but also on several other factors such as product demand, direct and indirect competition, marketing strategy, the relative state of the economy, customer base and weather. These are only some of the many possible factors that can have an impact on sales. The relevant factors also depend on the industry and sector to which a product belongs.

Several models are available today for time series analysis, including Holt-Winters, ARIMA, SARIMA, SARIMAX and GARCH.

Another way of looking at sales forecasting is as a regression problem rather than a time series problem. Regression algorithms detect patterns in the data and establish relationships between the independent variables (the factors affecting sales) and the target variable (sales), which can then be used to predict the target. While regression algorithms are pretty effective, they come with their own limitations. One of their key assumptions is that the patterns present in past data will repeat in the future.

There are a number of regression algorithms available. The simplest of them all is linear regression. It is easy to understand and explain, and works well when there is a linear relationship between the independent and dependent variables. Another highly interpretable algorithm is the decision tree, which works like a flow chart with a decision being made at each node in the tree. Model ensembles like random forest, gradient boosting or XGBoost often work great and make predictions with high accuracy. However, these are complex models and lose explicability.
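
To show why linear regression is so easy to explain, here is a minimal sketch that fits a straight line with ordinary least squares, using a single independent variable. The variable (marketing spend), the numbers and the function name are invented for illustration, and real sales data will not be this perfectly linear.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: marketing spend (independent) and sales (target).
marketing_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_simple_linear_regression(marketing_spend, sales)
predicted_sales = slope * 60 + intercept   # prediction for a spend of 60

print("slope:", slope, "intercept:", intercept)
print("predicted sales at spend 60:", predicted_sales)
```

The whole model is two numbers: the slope says how much sales change per unit of spend, and the intercept is the baseline. That is the kind of statement a business user can follow directly, which is harder to offer with an ensemble of hundreds of trees.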

Unlike time series models, which require a lot of data, regression models generalize, which helps in making predictions with significantly less historical data. This is usually useful when a new product or store is being launched. Another key difference between time series analysis and regression methods is the way the models are tested or validated. In regression methods, the training data is divided into k equal parts, of which k-1 parts are used for training and 1 part is used for validation. This process is repeated k times, with a different part of the data kept aside for validation each time. This is called k-fold cross validation, and it helps in estimating how well a model is going to perform on unseen data. Note that cross validation is not a performance-improving technique but a method of performance evaluation. Standard k-fold cross validation is not used with time series because of their sequential nature: a model would otherwise be trained on data from after the period it is validated on. Instead, training and validation sets are created by splitting the historical data into sequential subsets in order of time.
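
The difference between the two validation schemes is easiest to see on row indices. The sketch below generates plain k-fold splits and, for comparison, time-ordered splits in which the training window grows and the validation block always comes later. Both helpers are hypothetical, written from scratch for this illustration; the expanding-window layout is one common way of splitting in order of time, not the only one.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_idx, val_idx) for plain k-fold; folds ignore time order."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, val_idx) where validation always follows training in time."""
    fold = n_samples // (n_splits + 1)
    if n_splits < 1 or fold < 1:
        raise ValueError("not enough samples for the requested number of splits")
    for i in range(1, n_splits + 1):
        # Training ends where this split's validation block begins.
        train_end = n_samples - (n_splits - i + 1) * fold
        yield list(range(train_end)), list(range(train_end, train_end + fold))


# Rows are assumed to be sorted by date, so index 0 is the oldest observation.
print("k-fold:")
for train_idx, val_idx in k_fold_splits(6, 3):
    print("  train", train_idx, "validate", val_idx)

print("time-ordered:")
for train_idx, val_idx in expanding_window_splits(12, 3):
    print("  train", train_idx, "validate", val_idx)
```

In the k-fold output, some splits train on rows that come after the validation rows, which for sales data would mean using the future to predict the past. In the time-ordered output every training index is smaller than every validation index.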

To summarize, sales forecasting has traditionally been treated as a time series problem; however, regression approaches often give better results and require less data for model training. Models like linear regression and decision trees come with high interpretability and explicability, and the results achieved using them are easy to explain to business users. Model ensembles can be used in scenarios where prediction accuracy is more important than explicability.

Nishant Rastogi
Firing Neurons

Experienced Engineering Leader with nearly 20 years of expertise in developing and delivering data and analytics solutions for global organizations.