Time series forecasting

Santhosh Ketha · Published in Analytics Vidhya · Dec 21, 2020 · 9 min read


In this article, we are going to see:

What is Time series data?

What is Time series forecasting?

What are the applications of time series forecasting?

How do we do time series forecasting using different statistical forecasting techniques?

Note: I am going to discuss this topic more from a machine learning perspective than a statistical point of view.

What is Time series data?

“Time-series data” is a sequence of data points, measuring the same thing over time, stored in time order. The time interval at which data is recorded depends on the task at hand.

For example: 1. Stock market movements are recorded at millisecond intervals. 2. Data from sensors in autonomous vehicles needs to be collected and processed with near-zero latency. 3. Heartbeat data may be recorded every second. What I mean to say is that time series data is dependent on time, but the time interval at which the data is captured varies with different use cases.

Generally, a time series has equal spacing between two measurements that follow each other, and each time unit within the time interval has at most one data point. If data is recorded at irregular time intervals, it may not be directly usable; we should make the data equally spaced in order to predict future outcomes.
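As a quick sketch of how that equal spacing might be enforced in practice (pandas and the made-up readings below are my assumption; the article itself doesn't prescribe a tool), irregular timestamps can be resampled onto a fixed grid:

```python
import pandas as pd

# Hypothetical sensor readings recorded at irregular timestamps.
ts = pd.Series(
    [21.0, 21.4, 22.1, 21.8],
    index=pd.to_datetime([
        "2020-01-01 00:00:03",
        "2020-01-01 00:00:58",
        "2020-01-01 00:02:05",
        "2020-01-01 00:02:59",
    ]),
)

# Resample onto a fixed 1-minute grid, averaging readings that fall in
# the same bin and interpolating bins that received no reading at all.
regular = ts.resample("1min").mean().interpolate()
print(regular)
```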

Types of time series data:

  1. Univariate time series data
  2. Multivariate time series data

Univariate time series data

The term “univariate time series” refers to a time series that consists of single (scalar) observations recorded sequentially over equal time increments.

Example: Temperature is a single variable recorded over time.

Multivariate time series data

Multiple variables are recorded at each time step.

Example: Open, High, Low, and Close values in stock market data.

What is Time series data forecasting?

Time series forecasting can be defined as predicting upcoming values by looking at previously recorded values at successive time intervals.

Forecasting with time-series analysis comprises fitting a suitable model to known past outcomes and using it to predict future values.

Let's say we want to predict Y at time (t), which by the definition of time series forecasting depends on Y at (t-1), Y at (t-2), and so on, so a simple model equation becomes

Y(t) = (Y(t-1) + Y(t-2)) / 2

In the above equation, Y depends on the average of its own previous two values, so this can be called univariate time series forecasting. Since we only took the previous 2 values to predict the future value, our window size here is 2. Window size can be considered a hyperparameter: we try different values with the model, and whichever value gives the best predictions is chosen as the window size.
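As a rough illustration of this idea (the helper functions and toy numbers below are my own, not from any particular library), the windowed-average model and the window-size search could look like this:

```python
import numpy as np

def window_average_forecast(series, window=2):
    """Predict the next value as the mean of the last `window` values."""
    return float(np.mean(series[-window:]))

def one_step_error(series, window):
    """Average absolute error of one-step-ahead forecasts over the history."""
    errors = [abs(series[t] - np.mean(series[t - window:t]))
              for t in range(window, len(series))]
    return float(np.mean(errors))

y = [10.0, 12.0, 11.0, 13.0, 12.5]           # toy series
print(window_average_forecast(y, window=2))  # (13.0 + 12.5) / 2 = 12.75

# Treat the window size as a hyperparameter: try several values and
# keep whichever gives the smallest historical forecast error.
best = min(range(1, 4), key=lambda w: one_step_error(y, w))
print("best window size:", best)
```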

There are certain terms we need to know before going further in this article. They are:

  1. Trends
  2. Seasonality
  3. Cyclical Patterns

Every time series, when plotted graphically, will exhibit some kind of pattern. Patterns may be of different types, and as a practitioner one needs to understand them in order to get meaningful insights from the data.

Trends

A trend is a movement to relatively higher highs or lower lows over a long period of time. Higher highs that follow higher lows define an uptrend, while lower lows followed by lower highs define a downtrend. If the movement doesn't really produce any higher highs or lower lows, we call that a horizontal trend. If we have, for example, an uptrend, statistically the trend is more likely to continue.

Seasonality

A time series that shows a repeating pattern at fixed intervals of time, within a one-year period, is said to exhibit a seasonal pattern or seasonality. Seasonal patterns can be spotted across many kinds of time series. An example is that your heating costs fall in the summer and rise in the winter. Companies need to understand seasonality to properly manage their inventory, staff, and many other important things.

[Graph: a seasonal series in which the same pattern is observed every year.]
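One common way to look at trend and seasonality separately is a seasonal decomposition. A minimal sketch, assuming statsmodels is available and using synthetic monthly data of my own invention:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series: an upward trend plus a repeating yearly cycle.
idx = pd.date_range("2015-01-01", periods=48, freq="MS")
values = 0.5 * np.arange(48) + 10 * np.sin(2 * np.pi * np.arange(48) / 12)
series = pd.Series(values, index=idx)

# Split the series into trend, seasonal, and residual components.
result = seasonal_decompose(series, model="additive", period=12)
result.plot()  # panels: observed, trend, seasonal, residual
plt.show()
```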

Cyclical Patterns

A pattern within data that includes rises and falls that are not of a fixed period is called a cyclical pattern. A pattern that doesn't repeat within the same calendar year is probably cyclical. These patterns last several years and don't have a repeating pattern within each year.

What are the applications of time series forecasting?

There are lots of applications where time series analysis is used. These are a few of the most important use cases of time series forecasting:

  1. Financial analysis
  2. Weather analysis
  3. Retail
  4. Speech generation
  5. Machine translation

You name it: some form of time series prediction or forecasting will have a use case in it.

How do we do time series forecasting using different statistical forecasting techniques?

There are several statistical and machine learning techniques available for forecasting time series data, but as mentioned, I will stick to basic statistical models in this blog.

Autoregressive (AR) model

Before applying any statistical model, we need our data to be stationary.

Stationary Data

A stationary time series is one whose properties do not depend on the time at which the series is observed. Thus, time series with trends or with seasonality are not stationary: the trend and seasonality will affect the value of the time series at different times.

The model will only work properly if certain properties (mean, variance, covariance) are identical in any chosen time frame. If there's a trend in your dataset, you must remove it (either by differencing or by subtracting out the trend) before continuing.

Stationary data properties

  • have a constant mean
  • have a constant variance
  • have a covariance that does not depend on time
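A minimal sketch of checking these properties, assuming statsmodels. The Augmented Dickey-Fuller test is one standard stationarity test, and the trending series below is made up for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def is_stationary(series, alpha=0.05):
    """ADF test: a small p-value rejects the unit-root null hypothesis,
    so we treat the series as stationary."""
    p_value = adfuller(series.dropna())[1]
    return p_value < alpha

# A series with a trend is non-stationary; first differencing removes it.
trend = pd.Series(np.arange(200, dtype=float) + np.random.normal(0, 1, 200))
print(is_stationary(trend))          # likely False: the mean grows over time
print(is_stationary(trend.diff()))   # likely True after differencing
```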

Autocorrelation

An autoregression model makes the assumption that observations at previous time steps are useful for predicting the value at the next time step. This relationship between variables is called correlation.

If both variables change in the same direction, it is called positive correlation; if they change in opposite directions, it is called negative correlation. We can use statistical measures to calculate the correlation between the output variable and the values at previous time steps at various lags. The stronger the correlation between the output variable and a specific lagged variable, the more weight the autoregression model can put on that variable when modeling.

Lag: this is the value of the time gap being considered. A lag-1 autocorrelation is the correlation between values that are one time period apart. More generally, a lag-k autocorrelation is the correlation between values that are k time periods apart.
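For a concrete calculation, pandas exposes exactly this as Series.autocorr (the toy numbers below are invented):

```python
import pandas as pd

y = pd.Series([10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.2, 15.1])

print(y.autocorr(lag=1))  # correlation between y(t) and y(t-1)
print(y.autocorr(lag=2))  # correlation between y(t) and y(t-2)
```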

ACF:

ACF is the (complete) auto-correlation function, which gives us the values of the auto-correlation of a series with its lagged values. We plot these values along with the confidence band and tada! We have an ACF plot. In simple terms, it describes how well the present value of the series is related to its past values. A time series can have components like trend, seasonality, cyclic behavior, and residual. ACF considers all these components while finding correlations, hence it's a 'complete auto-correlation plot'.

PACF:

PACF is the partial auto-correlation function. Instead of finding correlations of the present with lags like ACF does, it finds the correlation of the residuals (what remains after removing the effects already explained by earlier lags) with the next lag value, hence 'partial' rather than 'complete': we remove already-found variations before we find the next correlation. So if there is any hidden information in the residual that can be modeled by the next lag, we might get a good correlation, and we will keep that next lag as a feature while modeling. Remember, while modeling we don't want to keep too many correlated features, as that can create multicollinearity issues; hence we need to retain only the relevant ones.
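A minimal sketch of producing both plots with statsmodels; the AR(1)-like series here is synthetic, but any stationary series works:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate an AR(1)-like series so the plots have structure to show.
rng = np.random.default_rng(0)
series = np.zeros(200)
for t in range(1, 200):
    series[t] = 0.7 * series[t - 1] + rng.normal()

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(series, lags=20, ax=axes[0])    # complete autocorrelation, with confidence band
plot_pacf(series, lags=20, ax=axes[1])   # partial autocorrelation, earlier lags removed
plt.show()
```

For a series like this, the ACF tails off gradually while the PACF cuts off sharply after lag 1; that cut-off point is a common way to choose the AR order p.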

Autoregressive model

An autoregressive model uses a linear combination of past values of the target to make forecasts. Of course, the regression is made against the target itself.

Mathematically, an AR(p) model is expressed as:

Yt = c + ϕ1*Yt-1 + ϕ2*Yt-2 + … + ϕp*Yt-p + ϵt

Here, Yt is the future output and Yt-1, Yt-2, …, Yt-p are its previous p values.

  • p: the order (how many past values are used)
  • c: a constant
  • ϵ (epsilon): white noise

White noise: It plays a crucial role in time series data. If a time series is white noise, it is a sequence of random numbers and cannot be predicted. If the series of forecast errors is not white noise, it suggests that improvements could be made to the predictive model.
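A minimal sketch of fitting an AR(p) model, assuming statsmodels' AutoReg and a synthetic series of my own (in practice you would pass your stationary data):

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulate a stationary AR(1) process to stand in for real data.
rng = np.random.default_rng(1)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + rng.normal()

# Fit an AR(p) model with p = 2 and forecast the next 5 steps.
model = AutoReg(y, lags=2).fit()
print(model.params)                                 # c, phi_1, phi_2
print(model.predict(start=len(y), end=len(y) + 4))  # out-of-sample forecasts
```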

Moving average (MA):

In time-series, we sometimes observe similarities between past errors and present values. That’s because certain unpredictable events happen, and they need to be accounted for.

In other words, by knowing how far off our estimation yesterday was compared to the actual value, we can tweak our model so that it responds accordingly.

Let’s suppose that “r” is some time-series variable, like returns. Then, a simple Moving Average (MA) model looks like this:

rt = c + θ1 ϵt-1 + ϵt

What is rt?

For starters, rt represents the value of “r” in the current period, t. In terms of returns, it's what we're estimating the returns for today will be.

What is c?

The first thing we see on the right side of the model is “c” — this stands for a constant factor. Of course, this is just the general representation and we would substitute this with a numeric value when we’re actually modeling data.

What is θ1?

Next, θ1 is a numeric coefficient for the value associated with the 1st lag. We prefer not to use ϕ1 like in the Autoregressive model, to avoid confusion.

What is ϵt and ϵt-1?

Then come ϵt and ϵt-1 which represent the residuals for the current and the previous period, respectively.

For anybody not familiar with the term, a residual is the same as an error term: it expresses the difference between the observed value for a variable and our estimation. In this specific case: ϵt-1 = rt-1 − r̂t-1, where r̂t-1 represents our estimation for the previous period.

So, how do we generate these residuals?

It's quite simple. We start from the beginning of the dataset, r1, and try to predict each value (r̂2, r̂3, etc.). Depending on how far off we were each time, we get a residual ϵt = rt − r̂t. Therefore, we generate these residuals as we go through the set and create the ϵ variable as we move through time (from period 1 all the way up to the current period).
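Because the residuals are generated as the model walks through the data, an MA model can't be fit by ordinary regression; libraries estimate it iteratively. A minimal sketch, assuming statsmodels (an MA(1) is an ARIMA with order (0, 0, 1); the simulated returns are invented):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate a true MA(1) process: r(t) = c + theta_1 * eps(t-1) + eps(t).
rng = np.random.default_rng(2)
eps = rng.normal(size=300)
r = 0.1 + eps[1:] + 0.5 * eps[:-1]

model = ARIMA(r, order=(0, 0, 1)).fit()
print(model.params)      # estimates of c, theta_1, and the noise variance
print(model.resid[:5])   # the residuals eps(t) recovered during fitting
```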

Autoregressive moving average (ARMA):

The ARMA model is simply the merger of the AR(p) and MA(q) models:

  • AR(p) models try to explain the momentum and mean reversion effects often observed in trading markets (market participant effects).
  • MA(q) models try to capture the shock effects observed in the white noise terms. These shock effects could be thought of as unexpected events affecting the observation process e.g. Surprise earnings, wars, attacks, etc.

The ARMA model attempts to capture both of these aspects when modeling financial time series. It does not take into account volatility clustering, a key empirical phenomenon in many financial time series.

The ARMA(1,1) model is:

x(t) = a*x(t-1) + b*e(t-1) + e(t)

where e(t) is white noise with E[e(t)] = 0.
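A minimal sketch of estimating this model, again assuming statsmodels (an ARMA(1,1) is an ARIMA with order (1, 0, 1); the simulated data uses made-up coefficients):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate x(t) = a*x(t-1) + b*e(t-1) + e(t) with known a and b.
rng = np.random.default_rng(3)
a, b = 0.6, 0.4
e = rng.normal(size=500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = a * x[t - 1] + b * e[t - 1] + e[t]

model = ARIMA(x, order=(1, 0, 1)).fit()
print(model.params)             # estimated a, b (plus a constant and noise variance)
print(model.forecast(steps=5))  # next 5 predicted values
```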

Conclusion

Time series analysis is basically the recording of data at regular intervals of time, which can lead to well-informed decisions crucial for trade, and so it has multiple applications such as stock market and trend analysis, financial forecasting, inventory analysis, census analysis, yield prediction, sales forecasting, etc. There are more complex deep learning techniques, like LSTMs, which can perform much better if we have lots of data. But simple AR and MA models can outperform LSTMs when the data is limited.
