Time Series Analysis 102

Prashant Bangar
Nov 8, 2020 · 8 min read

A deep dive into Forecasting

https://redesulconsorcios.com.br/previsao-para-o-cenario-economico-em-2019/

In Part 1, forecasting was briefly introduced as a useful technique for establishing future baselines for time series data. Although reading simple forecasts seems straightforward, there are a number of practical nuances that need to be understood in order to consume these forecasts correctly. This post covers some of the most important caveats of the forecasting process. Following the precedent set by the last post, this discussion will also be light on code and focus more on clarifying the concepts.

How does forecasting actually work?

Let’s try to understand what happens behind the scenes using an example.

We have the monthly sales data for a store for the past 2 years. Let X = [x1, x2, x3, x4, …, x24] represent our monthly sales values. We need to forecast the sales for the next 6 months.

Training Phase — Involves learning a function F(X) that outputs a future value given all past sales values. This function can simply be thought of as the forecasting model.

x(t+1) = F(x(t), x(t-1), x(t-2), …, x1)

Prediction Phase — The learned function can then be used to calculate future sales values as follows,
x25 = F(x24, x23, …, x1)
x26 = F(x25, x24, …, x1) and so on, where x25 is itself the forecast from the previous step.

The training phase is a classic optimization problem where we learn the model representing our time series.
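To make the two phases concrete, here is a minimal sketch in Python. The model is deliberately a toy and is an assumption made for illustration: F simply adds the average month-to-month change seen in training to the last known value. The names `train` and `forecast` and the sales figures are made up.

```python
from statistics import mean

def train(history):
    """Training phase: learn a toy F (last value + average past change)."""
    steps = [b - a for a, b in zip(history, history[1:])]
    drift = mean(steps)  # the single parameter "learned" from the data

    def F(values):
        return values[-1] + drift

    return F

def forecast(history, horizon):
    """Prediction phase: apply F repeatedly, feeding forecasts back in."""
    F = train(history)
    values = list(history)
    for _ in range(horizon):
        values.append(F(values))
    return values[len(history):]

# 24 months of made-up sales figures (not real data)
sales = [100 + 2 * month for month in range(24)]
next_six = forecast(sales, 6)
print(next_six)
```

Note how the forecast for month 26 is computed from the forecast for month 25, exactly as in the equations above. Real models learn a much richer F, but the loop stays the same.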

How is the forecasting accuracy measured?

The historical data is split into a training set and a test set covering a later period. The training set is used to learn the model. The learned model is then used to forecast the test set period, and the accuracy is calculated by comparing the actual and forecasted values for the test set. Many scores are in use, but for the sake of understanding, the accuracy can be calculated using the Mean Absolute Percentage Error (MAPE) as follows,

MAPE = (100 / n) × Σ |(actual(t) - forecast(t)) / actual(t)|, summed over the n points in the test set
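MAPE is the average of the absolute errors, each expressed as a percentage of the actual value. A minimal sketch of the calculation follows; the function name `mape` and the numbers are made up for illustration.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    # The score divides by each actual value, so zeros are rejected up front.
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

actual_test = [100, 200, 400]    # made-up test-set values
forecast_test = [110, 180, 400]  # made-up forecasts for the same period
error_pct = mape(actual_test, forecast_test)
print(round(error_pct, 2))
```

A lower value means the forecasts were closer to the actual values on the test set.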

The accuracy calculated on unseen data gives an idea of how the model would perform in the real world on future data. The plot below shows the forecasting process for the Airline Sales dataset. The black line represents the actual data, the orange line represents the learned model, and the red line represents the forecasts for the test period.

[Figure: actual data, learned model and test-period forecasts for the Airline Sales dataset]

What is the forecast Prediction Interval?

Recall the bike-sharing forecasts from the last post: the blue shaded region around the orange forecasts represented the 95% prediction interval. Based on the learned model, each actual future value is expected to fall within the shaded region around the forecast with 95% probability.

[Figure: bike-sharing forecasts with the 95% prediction interval shaded]
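One simple way to approximate such an interval, assuming the forecast errors are roughly normal with a constant spread, is to take the forecast plus or minus 1.96 standard deviations of the model's past errors. The sketch below uses that assumption. It is not how any particular library computes its intervals, and real intervals usually widen as the horizon grows. The function name and the residuals are made up.

```python
from statistics import stdev

def prediction_interval(point_forecast, residuals, z=1.96):
    """Return (lower, upper) bounds around a one-step point forecast.

    Assumes residuals are roughly normal with constant variance;
    z=1.96 corresponds to about 95% coverage under that assumption.
    """
    spread = z * stdev(residuals)
    return point_forecast - spread, point_forecast + spread

# Made-up residuals (actual minus fitted) from a hypothetical model
residuals = [-2.0, 1.0, 0.0, -1.0, 2.0]
lower, upper = prediction_interval(50.0, residuals)
print(round(lower, 2), round(upper, 2))
```

The noisier the model's past errors, the wider the band around each forecast.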

What are the different kinds of models used for forecasting?

Exponential Smoothing

[Equation: simple exponential smoothing]
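In its simplest textbook form, exponential smoothing forecasts the next value as a weighted average of the latest observation and the previous smoothed value: new level = α × latest value + (1 - α) × previous level. Here is a minimal sketch of that recursion, with α supplied by hand and the first observation used as the starting level (both are assumptions of this sketch). The data is made up.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the smoothed level with the first value
    for value in series[1:]:
        # Recent observations get weight alpha, older history gets the rest.
        level = alpha * value + (1 - alpha) * level
    return level

demand = [10, 12, 11, 13]  # made-up observations
next_value = simple_exponential_smoothing(demand, alpha=0.5)
print(next_value)
```

A larger α makes the forecast react faster to recent values, while a smaller α gives a smoother, slower-moving forecast.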

This is a very simple form of the model, meant for developing intuition. In real life the data will be more complex and we will use more elaborate smoothing equations. For instance, even Excel provides a variation of smoothing in its forecast functionality. Let’s look at the more specific Triple Exponential Smoothing, or Holt Winters seasonal method.

The Holt Winters method consists of a forecast equation with 3 components, one each for the level ℓt, the trend bt and the seasonality st. The model has the following form,

[Equation: Holt Winters forecast equation combining level ℓt, trend bt and seasonality st]

Without going into the details of the actual smoothing equations and their coefficients, the key takeaway is that this method learns the level, trend and seasonality separately and then combines them to generate the forecast. One point to note is that each component has a corresponding smoothing coefficient (α, β and γ) in its equation. The smoothing coefficients are learnt by the model during the training process and need not be provided. The following is a simple use case of this technique for predicting sales for airline passengers.

[Figure: Holt Winters forecast for the airline passengers data]

The Holt Winters method is generally a fast and simple choice for forecasting whenever your data has a trend, seasonality or other patterns, and there isn't enough data for other techniques like ARIMA to work well.
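For the curious, the additive flavour of the method can be sketched in a few lines. Several things here are assumptions made for illustration: the additive form itself (a multiplicative form also exists), the crude starting values taken from the first two seasons, and the coefficients α, β and γ being passed in by hand instead of being learnt by minimising the forecast error. The function name and the data are made up.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt Winters forecast with fixed smoothing coefficients."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first = series[:period]
    second = series[period:2 * period]
    # Crude initial estimates taken from the first two seasons.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [value - level for value in first]

    for t, value in enumerate(series):
        season = seasonals[t % period]
        prev_level, prev_trend = level, trend
        # Each component is smoothed separately with its own coefficient.
        level = alpha * (value - season) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % period] = (gamma * (value - prev_level - prev_trend)
                                 + (1 - gamma) * season)

    n = len(series)
    # Forecast = level + h steps of trend + the matching seasonal offset.
    return [level + h * trend + seasonals[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Made-up series: a flat level of 100 with a repeating 4-step pattern
pattern = [5, -5, 3, -3]
series = [100 + p for _ in range(3) for p in pattern]
forecasts = holt_winters_additive(series, period=4, alpha=0.3, beta=0.1,
                                  gamma=0.2, horizon=4)
print([round(f, 2) for f in forecasts])
```

On this toy series the forecasts simply repeat the seasonal pattern around the level of 100, which is what we would expect from data with no trend.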

ARIMA

AR model: Auto Regression (order p)

I: Integrated, i.e. differencing (order d)

MA model: Moving Average (order q)

Thus the ARIMA model is described by these three orders (p, d, q). For the seasonal variant of the model, the same concepts are extended to the seasonal period, which gives the seasonal order of the model (P, D, Q).

What is Differencing?

A time series is called stationary if its statistical properties, such as its mean and variance, do not depend on time. For example, random noise can be called stationary.

On the other hand, any series with a trend or seasonality is non-stationary, and its values are said to have auto-correlation. For example, if we consider the temperature values for a city, today's value is correlated in some way with yesterday's value, and so on.

We can remove auto-correlation and make a time series stationary by differencing, in which we subtract a past value of the time series from the current value.
There are statistical tests to determine the order d of differencing (how many times differencing needs to be applied to make the series stationary).
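Differencing itself is a one-liner. The sketch below uses made-up series to show how one round of differencing removes a linear trend, and how a second round (d = 2) is needed for a quadratic one. The helper name `difference` is just for illustration.

```python
def difference(series, lag=1):
    """Subtract the value `lag` steps earlier from each value."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

trend_series = [3, 5, 7, 9, 11]            # made-up series with a linear trend
once = difference(trend_series)            # [2, 2, 2, 2]: the trend is gone

quadratic = [t * t for t in range(6)]      # 0, 1, 4, 9, 16, 25
twice = difference(difference(quadratic))  # d = 2 gives [2, 2, 2, 2]
print(once, twice)
```

Each round of differencing shortens the series by `lag` points, which is one reason to difference only as many times as needed.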

The ACF plot is used to visualize the auto-correlation between the values of a time series at different lags. The following figure shows the auto-correlations for Google stock prices. The first plot shows that the auto-correlation of the original series is high and decreases gradually as the lag grows. The second plot shows the auto-correlation after differencing: it is much lower, and the series is almost stationary. To choose the lag at which to difference, we generally pick the lag with the highest auto-correlation, which is 1 in the first plot and 7 in the second.

[Figure: ACF plots for Google stock prices, before and after differencing]
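The numbers behind an ACF plot are easy to compute by hand. The sketch below builds a made-up random walk (a rough stand-in for a price series, not the Google data) and compares its auto-correlation with that of its differenced version. The function name and the fixed seed are choices made for this illustration.

```python
import random

def autocorrelation(series, lag):
    """Sample auto-correlation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[t] - mean) * (series[t - lag] - mean)
                     for t in range(lag, n))
    return covariance / variance

random.seed(0)  # fixed seed so the sketch is repeatable
walk, total = [], 0.0
for _ in range(500):
    total += random.gauss(0, 1)  # each step is independent noise
    walk.append(total)

diffs = [b - a for a, b in zip(walk, walk[1:])]  # first-order differencing

acf_walk = [autocorrelation(walk, lag) for lag in range(1, 6)]
acf_diffs = [autocorrelation(diffs, lag) for lag in range(1, 6)]
print([round(r, 2) for r in acf_walk])
print([round(r, 2) for r in acf_diffs])
```

The first list stays close to 1 and decays slowly, while the second hovers around 0, mirroring the before and after plots described above.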

What is Auto-Regression?

Auto-regression is regression that predicts a future value using the past values of the stationary time series itself. The current value of the series is treated as the target and its lagged past values as the regressors. The regression equation has the following form, and the number of past values included is called the order p of the model.

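y(t) = c + φ1·y(t-1) + φ2·y(t-2) + … + φp·y(t-p) + ε(t)

where c is a constant, φ1 … φp are the learned coefficients and ε(t) is the error term.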
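For order p = 1 the regression reduces to fitting a straight line of each value against the one before it. Here is a minimal least-squares sketch. The function name `fit_ar1` is made up, and the series is generated without noise from c = 2 and φ = 0.5 so that the fit can be checked by eye.

```python
def fit_ar1(series):
    """Least-squares fit of y(t) = c + phi * y(t-1); returns (c, phi)."""
    x = series[:-1]  # lagged values act as the regressor
    y = series[1:]   # current values act as the target
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

# Made-up noise-free series generated with c = 2 and phi = 0.5
series = [10.0]
for _ in range(9):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
next_value = c + phi * series[-1]  # one-step-ahead forecast
print(round(c, 3), round(phi, 3), round(next_value, 3))
```

With real, noisy data the recovered coefficients would only approximate the underlying ones, and higher orders need a multiple regression instead of a single line.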

What is Moving Average?

In the moving average model, a regression-like model is built on the forecast errors of the past values of the stationary time series instead of the values themselves. Its equation is given below: each value y(t) in the time series is expressed as a weighted moving average of the past forecast errors. The number of error terms included in the model determines the order q.

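y(t) = c + ε(t) + θ1·ε(t-1) + θ2·ε(t-2) + … + θq·ε(t-q)

where c is a constant, ε(t) is the forecast error at time t and θ1 … θq are the learned weights.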
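Because the errors are not observed directly, an MA model has to reconstruct them step by step. The sketch below does this for order q = 1, with c and θ supplied by hand. Estimating them from data needs an iterative optimiser, which is left out here. The function names, the starting assumption that the error before the first point is 0, and the data are all made up for illustration.

```python
def ma1_residuals(series, c, theta):
    """Recover the error terms of y(t) = c + e(t) + theta * e(t-1)."""
    errors = []
    previous_error = 0.0  # assumption: the error before the first point is 0
    for value in series:
        error = value - c - theta * previous_error
        errors.append(error)
        previous_error = error
    return errors

def ma1_forecast(series, c, theta):
    """One-step forecast; the unknown next error is replaced by 0."""
    return c + theta * ma1_residuals(series, c, theta)[-1]

# Build a made-up MA(1) series from known errors so the result can be checked
true_errors = [0.5, -1.0, 0.2, 0.8]
c, theta = 10.0, 0.6
series, prev = [], 0.0
for e in true_errors:
    series.append(c + e + theta * prev)
    prev = e

recovered = ma1_residuals(series, c, theta)
forecast = ma1_forecast(series, c, theta)
print([round(e, 2) for e in recovered], round(forecast, 2))
```

The forecast only carries information one step ahead: beyond q steps, an MA(q) model falls back to the constant c.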

The following code snippet uses a basic auto-forecast project based on the ARIMA family of models to produce the bike-sharing forecasts seen above.
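Independently of that project, the pieces discussed so far can also be combined by hand. The sketch below is a hypothetical, hand-rolled ARIMA(1, 1, 0): it differences once (d = 1), fits an AR(1) on the differences (p = 1), uses no moving average terms (q = 0), and then adds the forecast differences back onto the last value. The function name and the data are made up, and this is not the auto-forecast project's code or API.

```python
def arima_110_forecast(series, horizon):
    """Toy ARIMA(1, 1, 0): difference once, fit AR(1), integrate back."""
    # I: first-order differencing to remove the trend
    diffs = [b - a for a, b in zip(series, series[1:])]

    # AR: least-squares line of each difference against the previous one
    x, y = diffs[:-1], diffs[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    c = mean_y - phi * mean_x

    forecasts = []
    last_value, last_diff = series[-1], diffs[-1]
    for _ in range(horizon):
        last_diff = c + phi * last_diff  # forecast the next difference
        last_value += last_diff          # undo the differencing
        forecasts.append(last_value)
    return forecasts

# Made-up series whose differences follow d(t) = 1 + 0.5 * d(t-1) exactly
series, diff = [0.0], 4.0
for _ in range(6):
    series.append(series[-1] + diff)
    diff = 1 + 0.5 * diff

next_two = arima_110_forecast(series, 2)
print(next_two)
```

A production implementation would also choose the orders, estimate the MA terms and produce prediction intervals, which is where a dedicated library earns its keep.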

Transformation

By going through some widely used techniques and the internals of the forecasting process, this post aims to make consuming forecasts a little easier. However, a lot of details have been skipped to keep the discussion short.

Analytics Vidhya


Prashant Bangar

Written by

Data Science | Machine Learning | Time Series Analysis

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
