Time Series Analysis & Forecasting Models

Manthan Bhikadiya 💡 · Published in CodeX · 8 min read · Apr 16, 2023

Time series analysis allows us to analyze and understand data that varies over time. Such data appears in many fields, including economics, finance, engineering, medicine, and the social sciences. By identifying trends, forecasting future values, and accounting for seasonality, time series analysis can provide valuable insights in all of them.

Time series forecasting uses past data points to make predictions about future data points. Time series data is one of the main kinds of real-world data, alongside images, audio, and text. Although it may look similar to tabular data, it cannot be handled the same way: the ordering of observations matters, and nearby observations are usually correlated.

Basics

Time series can be Stationary or Non-stationary.

Stationary Time Series: A time series is stationary if it satisfies the following conditions.
(1) The mean is constant over time.
(2) The standard deviation (variance) is constant over time.
(3) The autocovariance between two observations depends only on the lag between them, not on where they occur in time.

Non-Stationary Time Series: If a time series does not meet the conditions of stationarity, it is considered non-stationary. Data with a strong trend or seasonality is non-stationary.

Trend: A trend is a long-term upward (positive trend) or downward (negative trend) movement in a time series.

Seasonality: When patterns repeat on a regular cycle, the time series has a seasonal component. For example, ice cream sales follow a seasonal pattern: in June and July, sales are higher than in other months because it is summer.

White Noise: A special type of time series that is centered around zero, has constant variance, and has no autocorrelation between data points.
White noise is always stationary, but a stationary time series is not necessarily white noise.
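Those three properties are easy to verify empirically. A small sketch using only `numpy` (the sample size is arbitrary, chosen large so the estimates are stable):

```python
import numpy as np

rng = np.random.default_rng(42)
noise = rng.normal(loc=0.0, scale=1.0, size=10_000)  # white noise

# Lag-1 autocorrelation: correlation of the series with itself shifted by one step.
lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]
print(f"mean={noise.mean():.3f}  std={noise.std():.3f}  lag-1 autocorr={lag1:.3f}")
```

The mean is close to 0, the standard deviation close to 1, and the lag-1 autocorrelation close to 0, exactly the white-noise conditions above.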

Now let’s understand the most basic forecasting model called The Auto-Regressive Model.

The Auto-Regressive (AR) Model

In a time series, we assume that previous values have an impact on the current value, and we base our forecasts of future values on that assumption. A model is auto-regressive if it uses previous values of the series to forecast future ones. A model that uses only the single previous value to predict the next one is the simple AR(1) model. The lag is the number of earlier values the model takes into account.

The AR model is appropriate when the time series data exhibits autocorrelation, meaning that past values of the series can help predict future values.

[Image: 365 Data Science]

This model performs well on stationary time series and poorly on non-stationary ones.

The Moving Average (MA) Model

It has also been found that taking residuals (error terms) into account, instead of past values of the series, can also improve predictions of future values. This is the strategy used in the MA model: rather than using past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model.

The MA model is appropriate when the time series data exhibits moving average behavior, meaning that past forecast errors can help predict future values.

[Image: 365 Data Science]

This model performs well on stationary time series and poorly on non-stationary ones.

Adding more lags makes the model more complex and requires estimating more coefficients, and beyond a point the extra lags stop being significant (the predicted values carry high errors). Determining the number of lags is the most crucial phase in building a time series model. To identify a significant number of lags, we can use the ACF (Auto-Correlation Function) and PACF (Partial Auto-Correlation Function).

ACF (Auto Correlation Function)

The ACF measures the similarity between a time series and a lagged version of itself. It starts at lag zero, where the series is correlated with itself, so the value is always 1. The ACF at a given lag takes all intermediate correlations into account: the correlation at lag 2 also reflects the correlation at lag 1.

[Image: Stack Exchange]

Typically, we calculate the ACF for a maximum of 40 lags. The shaded blue band marks the region that is not statistically significant, and we ignore ACF values that fall inside it. In the example above, the ACF values for lags 1, 2, 3, and 5 are significant, so we could use those lags to build forecasting models.
The order (lag value) of the Moving Average (MA) model is chosen from the ACF.

PACF (Partial Auto Correlation Function)

The partial autocorrelation function is similar to the ACF, except that it shows only the correlation between two observations that the shorter lags between them do not explain. For example, the partial autocorrelation at lag 3 is only the correlation that lags 1 and 2 do not explain. We can think of the PACF as the direct relationship between a time series and its lagged version.

[Image: Stack Exchange]

As with the ACF, we calculate the PACF for a maximum of 40 lags. In the example above, the PACF values at lags 1 and 5 are significant, so we could use those lags to build forecasting models.
The order (lag value) of the Auto-Regressive (AR) model is chosen from the PACF.

The Auto Regressive Moving Average (ARMA) Model

If we combine the Auto-Regressive (AR) and Moving Average (MA) models, the resultant model is called the ARMA model. It combines autoregression and moving averages, making it the appropriate choice for time series data that exhibits both autocorrelation and moving-average behavior.

[Image: 365 Data Science]

ARMA (Autoregressive Moving Average) models are generally used for stationary time series and do not perform well on non-stationary data. We must first transform a non-stationary series into a stationary one; alternatively, we can use the extended version of the ARMA model, the ARIMA model.

The Auto-Regressive Integrated Moving Average (ARIMA) Model

The “Integrated” component of the model refers to the differencing step used to transform a non-stationary time series into a stationary one. Compared with the ARMA model, the ARIMA model adds a step that takes the difference between consecutive observations before fitting the AR and MA parts.

[Image: 365 Data Science]

The Auto-Regressive Integrated Moving Average with Explanatory Variables (ARIMAX) Model

ARIMAX is a time series model that incorporates additional explanatory variables, also referred to as “covariates” or “regressors”, into the ARIMA model; the “X” refers to their inclusion. These variables let the model use outside information to make more accurate predictions.

[Image: 365 Data Science]

Models with Seasonal Components

SARIMA (Seasonal Autoregressive Integrated Moving Average) and SARIMAX (Seasonal Autoregressive Integrated Moving Average with Exogenous Variables) are time series models that extend the functionality of ARIMA and ARIMAX models, respectively, by incorporating additional seasonal components.

Seasonality refers to the presence of repeating patterns or trends in the data that occur at regular intervals, such as daily, weekly, monthly, or yearly cycles. These patterns can have a significant impact on forecasting accuracy, and failing to account for seasonality can lead to incorrect predictions.

By including these additional seasonal components, SARIMA and SARIMAX models can more accurately capture the patterns and trends in time series data that follow a seasonal pattern.

The Autoregressive Conditional Heteroskedasticity (ARCH) Model

ARCH models are used to capture the volatility or uncertainty in time series data that may not be constant over time. The ARCH model assumes that the variance of a time series is conditional on the past values of the series itself. ARCH models are typically used when there is evidence of heteroskedasticity (the variance of the error term changes over time) in the data.

[Image: 365 Data Science]

It’s important to note that while ARCH models are useful for modeling heteroskedasticity in time series data, they are not appropriate for all situations. If the data is already homoskedastic (the variance of the errors is constant over time), an ARCH model adds complexity without benefit, and a simpler model should be preferred.
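A sketch of the ARCH(1) mechanism in plain `numpy` (the third-party `arch` package would be used to fit such a model to real data): today's variance is a function of yesterday's squared shock, which produces volatility clustering. The returns themselves stay uncorrelated, but their squares do not:

```python
import numpy as np

rng = np.random.default_rng(0)
n, omega, alpha = 2000, 0.2, 0.5    # alpha < 1 keeps the process stable
e = np.zeros(n)
sigma2 = np.zeros(n)
sigma2[0] = omega / (1 - alpha)     # start at the unconditional variance
for t in range(1, n):
    sigma2[t] = omega + alpha * e[t - 1] ** 2   # variance conditional on the last shock
    e[t] = np.sqrt(sigma2[t]) * rng.normal()

sq = e ** 2
ac_e = np.corrcoef(e[:-1], e[1:])[0, 1]      # returns: no autocorrelation
ac_sq = np.corrcoef(sq[:-1], sq[1:])[0, 1]   # squared returns: clearly autocorrelated
print(f"lag-1 autocorr of series: {ac_e:.2f}, of squared series: {ac_sq:.2f}")
```

Autocorrelated squared returns with uncorrelated returns is exactly the heteroskedasticity signature ARCH is designed for.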

The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) Model

The ARCH model's main limitation is that it only captures the short-term persistence of volatility: the impact of past shocks on current volatility decays quickly over time. GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models extend ARCH models to address this limitation by allowing for long-term persistence of volatility.
GARCH models introduce an additional term that carries past variance forward, giving a more accurate representation of how volatility persists over time.

[Image: 365 Data Science]
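Extending the ARCH sketch above to GARCH(1,1): the extra `beta` term is the one that carries past variance forward, so volatility stays elevated long after a shock. A `numpy`-only simulation (again, the `arch` package would be the usual tool for fitting):

```python
import numpy as np

rng = np.random.default_rng(0)
n, omega, alpha, beta = 3000, 0.05, 0.1, 0.85   # alpha + beta < 1 for stationarity
e = np.zeros(n)
sigma2 = np.full(n, omega / (1 - alpha - beta))  # start at the unconditional variance
for t in range(1, n):
    # Tomorrow's variance remembers both the last shock AND the last variance level.
    sigma2[t] = omega + alpha * e[t - 1] ** 2 + beta * sigma2[t - 1]
    e[t] = np.sqrt(sigma2[t]) * rng.normal()

sq = e ** 2
ac1 = np.corrcoef(sq[:-1], sq[1:])[0, 1]     # squared-series autocorrelation, lag 1
ac10 = np.corrcoef(sq[:-10], sq[10:])[0, 1]  # ... still positive at lag 10
print(f"squared-series autocorr: lag 1 = {ac1:.2f}, lag 10 = {ac10:.2f}")
```

Under a pure ARCH(1) the autocorrelation of squared returns would be negligible by lag 10; the `beta * sigma2[t-1]` term is what keeps it alive here.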

Conclusion

This was just an overview. If you want to learn these models in depth and get some hands-on practice in Python, I recommend checking out the Time Series in Python course by 365 Data Science.
You can also check out the Time Series Analysis playlist by Rivikmath on YouTube.

Thank you for taking the time to read this article! If you found it valuable, please show your appreciation by clicking the clap 👏 button as many times as you can. Your support means a lot to me.
