Time series forecasting (Part 1 of 3): Understanding the fundamentals

Yasmin Bokobza
Data Science at Microsoft
9 min read · Mar 24, 2021

By Yasmin Bokobza and Siddharth Kumar

Forecasting has fascinated people for thousands of years, since the time of prophets, seers, and oracles in different cultures who tried to predict the future to influence or lead people. In modern times, the importance of forecasting has continued to grow because almost every business needs to predict the future to make better decisions and allocate resources more effectively. Today, machine learning is often used for forecasting because prediction models can utilize a large amount of data and uncover previously unnoticed correlations.

In the Microsoft Cloud+AI Customer Growth Analytics (CGA) team, we need business forecasts for data with widely varying characteristics. Here are some examples:

  • Daily, weekly, or monthly observations, sometimes with less than a month of history
  • Multiple types of seasonality
  • Missing observations or large outliers
  • Historical trend changes, for instance due to customer growth
  • Low latency constraints

To address these challenges, as well as others, we developed a generic machine learning–based time series forecasting algorithm called “Adaptive Univariate Time Series” (AUTS). This algorithm can be tuned easily, has low latency, and provides completely automated forecasts. In addition, AUTS deals well with outliers, historical trend changes, lack of historical data, and varying seasonality such as weekly, monthly, and so on.

In the CGA team, we use AUTS to anticipate customer needs based on how they use our services so we can allocate resources for the future, help qualify customers for the best pricing, deliver the most value for their investment, and so on. Customers can also use these tools themselves to understand their usage, gauge future usage scenarios, and help manage their cloud computing budgets.

In this first article of a three-part series about forecasting, we explore how to choose a forecasting method, describe pre-processing steps that can simplify the forecasting task, summarize popular open-source Python packages, and provide guidance for you to consider in tackling your own business problems. In Part 2, we present our approach to algorithm selection by walking through the capabilities of the Univariate Forecast Engine that we developed, and we present how we enabled stakeholders, who are not necessarily data scientists, to move quickly on their time series problems and make high-quality forecasts. In Part 3, we discuss approaches to time series forecasting with an emphasis on what led us to develop the Adaptive Univariate Time Series (AUTS) algorithm for the forecasting tasks we have encountered, and delve into details of the AUTS methodology, including how we deployed the model using Microsoft Azure.

How to choose the right forecasting method

Time series forecasting operates in a well-defined problem space and extends across many domains. Producing high-quality forecasts is not an easy problem, but businesses that do it well have advantages over those that don’t. (See Forecasting: Principles and Practice for more.) To deal with the challenges of producing business forecasts at scale, a useful forecasting procedure must be easy to tune, relatively fast, and completely automated. In addition, such a procedure should easily handle common features of business time series, such as seasonality across multiple periods, shifts in trend, and outliers.

To handle the increasing variety and complexity of forecasting problems, many forecasting techniques have been developed over the years. Each one has its special use, so care must be taken to select the correct technique for a particular application. A deep understanding of the range of forecasting methods available increases the likelihood that the method chosen for a specific application will succeed.

The selection of a method depends on many factors: the context of the forecast, the relevance and availability of historical data, the degree of accuracy desired, the time period to be forecast, the cost and benefit of the forecast, and the time available for making the analysis. Here are four questions to ask to help you determine which forecasting method is best for your business:

1. Is historical data available? Because time series forecasting is quantitative, having data about the past is a must. The appropriate forecasting method depends largely on the type and amount of data available. Some methods, such as exponential smoothing or neural networks, require more historical data than others.

2. What is the time series type: univariate or multivariate? The time series type depends on the type of data available or desired for use in the forecast, and understanding it is important for choosing the forecasting method. There are two types of time series: 1) a univariate time series is one with a single forecast (dependent) variable and a single explanatory (independent) variable, and 2) a multivariate time series is one with a single forecast (dependent) variable and more than one explanatory (independent) variable. After understanding the time series type, we can focus on the relevant forecasting methods.

3. What is the purpose of the forecast, and how is it to be used? Defining the business problem to be solved requires an understanding of the way the forecasts will be used, who the stakeholders are, how the relevant market works, and the nature of the customer base. These factors, in turn, determine the accuracy and power required of the methods, and hence govern the selection. Methods vary in their costs, as well as in scope and accuracy. The tolerance for inaccuracy must be established so the forecaster can weigh the cost of a method against the value of its accuracy. The more sophisticated forecasting models often produce results with a smaller error, but the cost of implementing and maintaining them tends to be high. A decision on this trade-off between the choice of model and its cost must be reached, as shown in Figure 1 (from the article “How to Choose the Right Forecasting Technique”).

Figure 1: Cost of forecasting versus cost of inaccuracy (from “How to Choose the Right Forecasting Technique,” by John C. Chambers, Satinder K. Mullick, and Donald D. Smith, Harvard Business Review, July 1971)
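
The trade-off in Figure 1 can be made concrete with a small calculation. The sketch below is only an illustration: the candidate methods and every cost figure are invented (none of them come from this article or from Chambers et al.), and the function names are hypothetical. It adds the cost of producing each forecast to the expected cost of its inaccuracy and picks the method with the lowest total.

```python
# Hypothetical candidates: (name, cost of forecasting, expected cost of inaccuracy).
# All figures are invented for illustration, in arbitrary cost units.
candidates = [
    ("naive", 1.0, 20.0),
    ("exponential smoothing", 4.0, 9.0),
    ("neural network", 15.0, 6.0),
]


def total_cost(forecasting_cost, inaccuracy_cost):
    """Total cost = what the forecast costs to produce + what its errors cost."""
    return forecasting_cost + inaccuracy_cost


def cheapest_method(options):
    """Return the name of the option with the lowest total cost."""
    return min(options, key=lambda o: total_cost(o[1], o[2]))[0]


best = cheapest_method(candidates)
print(best)
```

With these made-up numbers, the most sophisticated method has the smallest cost of inaccuracy but not the smallest total cost, which is the point of the figure.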

4. What are the patterns of the time series? In order to fit the best forecasting method to our problem, we need to identify the time series patterns in the data and then choose a method that is able to capture the patterns properly. Time series patterns can be described in terms of three basic classes of components:

Trend: The long-term direction (increase or decrease) of a time series.

Seasonal: The regular pattern of variability within certain time periods, such as a year.

Irregular: The remaining variability that cannot be attributed to trend or seasonality, caused by irregular and unpredictable changes in a time series.

There are common methods to decompose the time series, like Seasonal and Trend decomposition using Loess (STL), that help to extract these components and uncover the underlying patterns of the time series. You can find more information on decomposition methods in the Hyndman book listed as recommended reading at the end of the article.

Because time series forecasting is quantitative, we must make sure that it is reasonable to assume that some aspects of the past patterns will continue into the future. Figure 2 presents an example of decomposing a time series into the three components:

Figure 2: Example of decomposing a time series into the three components: trend, seasonal, and irregular
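
To make the three components more tangible, here is a minimal sketch of a decomposition in plain Python. Note that it uses the simpler classical additive decomposition (a centered moving average for the trend, then per-position averages for the seasonal component), not STL, which relies on Loess smoothing and is generally the more robust choice. The function names and the synthetic series are assumptions made for illustration, and the series is assumed to cover at least two full seasonal cycles.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Estimate the trend; positions where the window does not fit stay None."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 1:
            trend[t] = mean(window)
        else:
            # Even period: the two end points of the window get half weight.
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def decompose_additive(series, period):
    """Split a series into trend, seasonal, and irregular parts (y = T + S + I)."""
    trend = centered_moving_average(series, period)
    detrended = [y - tr if tr is not None else None for y, tr in zip(series, trend)]
    # Average the detrended values for each position in the seasonal cycle.
    seasonal_means = [
        mean(d for i, d in enumerate(detrended) if d is not None and i % period == k)
        for k in range(period)
    ]
    # Center the seasonal effects so they sum to zero over one cycle.
    offset = mean(seasonal_means)
    seasonal_means = [s - offset for s in seasonal_means]
    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    irregular = [
        y - tr - s if tr is not None else None
        for y, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, irregular


# Synthetic example: a linear trend plus a repeating pattern of length 4.
pattern = [3, -1, -2, 0]
series = [10 + 2 * t + pattern[t % 4] for t in range(12)]
trend, seasonal, irregular = decompose_additive(series, period=4)
```

Because the synthetic series contains no noise, the irregular component comes out as zero wherever the trend is defined; on real data it holds whatever the trend and seasonal components fail to explain.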

How can forecasting be made simpler?

Exploring and adjusting the historical data can often lead to a simpler forecasting task and more accurate results. As a result, it is important to first plot the data. This may help to identify whether there are consistent patterns, the existence of outliers, the strength of relationships among the variables available for analysis, and so on. These insights will help in preparing the data for forecasting and in choosing the appropriate preprocessing steps.

Optional pre-processing steps can be divided into three main actions:

  1. Cutting historical data. In some cases, older data is less useful, for example because of structural changes in the system being forecast. In these cases we may choose to use only the most recent data. By reducing the amount of data to analyze, the entire process can be simplified. However, care should be taken to ensure that useful data is not thrown away unnecessarily. (See Forecasting: Principles and Practice for more information.)
  2. Making data adjustments. Certain data adjustments can simplify patterns in the historical data by removing known sources of variation or by making the pattern more consistent across the entire data set. Adjusting the historical data usually leads to more accurate forecasts. Examples include calendar adjustments, population adjustments, mathematical transformations, and more. You can find more information on data adjustments in the Hyndman book listed in the recommended reading list at the end of the article.
  3. Treating outliers. Outliers are observations that differ significantly from the majority of observations in the time series. Outliers, also known as discordant observations, introduce bias in the model parameter estimates and can strongly influence the prediction’s accuracy. In addition, the existence of outliers may complicate the forecasting task because there are forecasting methods that do not work well with outliers. They can be removed or replaced with an estimate that is more consistent with the majority of the data. But the approach to take depends on the source of the outliers, i.e., whether the outliers were accidentally caused, for example, by incorrect data entry, or whether they are truly unusual observations. This shows how becoming familiar with your data prior to performing an analysis is of vital importance. There are several widely used time series outlier-detection methods for removing global and local anomalies. The methods selected depend on the type of forecasting problem and the data itself. Below we provide recommended reading for outlier detection techniques in a time series context.
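
As an illustration of the third step, the sketch below replaces outliers with the local median using a Hampel-style filter. This is one common technique among many and is not necessarily what the authors use. The function name, the window size, the threshold, and the sample data are all assumptions chosen for the example. As noted above, whether replacement is appropriate at all depends on where the outliers come from.

```python
from statistics import median


def replace_outliers(series, window=3, n_sigmas=3.0):
    """Hampel-style filter: replace points far from the local median.

    For each point, look at up to `window` neighbors on each side, compute the
    median and the median absolute deviation (MAD) of that neighborhood, and
    replace the point with the median if it deviates by more than
    n_sigmas * 1.4826 * MAD.
    """
    cleaned = list(series)
    for i in range(len(series)):
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        neighborhood = series[lo:hi]
        local_median = median(neighborhood)
        mad = median(abs(x - local_median) for x in neighborhood)
        # 1.4826 scales the MAD to estimate the standard deviation of normal data.
        # If the MAD is zero (flat neighborhood), any deviation is flagged.
        threshold = n_sigmas * 1.4826 * mad
        if abs(series[i] - local_median) > threshold:
            cleaned[i] = local_median
    return cleaned


# Invented daily usage values with one obvious spike at index 4.
usage = [100, 102, 98, 101, 500, 99, 103, 100, 97, 102]
cleaned = replace_outliers(usage)
```

A rolling median is used rather than a rolling mean because a single extreme value barely moves the median, so the spike does not distort the baseline it is compared against.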

List of open-source Python packages for time series forecasting

There are multiple Python packages that implement various statistical methods and algorithms within the forecasting framework. There are a few popular R packages as well, such as Prophet by Facebook.

Below is a summary of a few popular open-source Python packages and toolboxes for forecasting and time series analysis. The table below summarizes the supported models or methods for each package or GitHub repository. Please be aware that the summary statistics represent a snapshot from February 2020 and are expected to change over time.

We advise you to consider how recently and how actively each package is maintained when choosing the most appropriate one for a given problem. Most of the toolkits are flexible and reusable across contexts and scenarios, and data scientists will find them easy to pick up even with limited or no background in time series forecasting.

Conclusions

In this article, we reviewed the fundamentals of time series forecasting and summarized a few popular Python forecasting packages to get started with. Many packages are super handy for those new to forecasting.

The next article in this series provides guidance around algorithm selection for your own problem. We explore the capabilities of the univariate forecast engine that we have developed and use repeatedly for all our univariate prediction problems. This engine helps us evaluate the accuracy of new forecasting models and choose the most accurate ones. In addition, we describe how we use the engine to serve our stakeholders — who are not necessarily data scientists — so they can move quickly on their time series problems and make high-quality forecasts.

We would like to thank Casey Doyle for helping review the work.

Recommended reading

Want to know more about how you can set up your Machine Learning projects for success? Read our recent article:
