Published in featurepreneur

# What is Auto-ARIMA?

`auto_arima`, a routine from IMSL, automates the configuration of the autoregressive integrated moving average (ARIMA) model. Before getting into it, we need a fair understanding of ARIMA models.

## What is ARIMA:

An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time-series data to better understand the data set or predict future trends. A statistical model is autoregressive if it predicts future values based on past values. For example, an ARIMA model might seek to predict a stock’s future prices based on its past performance or forecast a company’s earnings based on past periods.

We can split the ARIMA term into three components, AR, I, and MA:

• AR(p) stands for the autoregressive model; the parameter p is an integer that specifies how many lagged values of the series are used to forecast periods ahead.

• I(d) is the differencing part; the parameter d tells how many orders of differencing are applied to make the series stationary.

• MA(q) stands for the moving average model; the parameter q is the number of lagged forecast error terms in the prediction equation.

SARIMA is a seasonal ARIMA, used for time series with seasonality.
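As a minimal sketch of the I(d) component (the series and its values here are hypothetical), first-order differencing turns a trending series into the sequence of period-to-period changes, which is what the AR and MA terms then model:

```python
import numpy as np

# Hypothetical non-stationary series with an upward trend.
series = np.array([10.0, 12.0, 14.5, 16.0, 19.0, 21.5])

# I(d): first-order differencing (d=1) removes the trend in the mean,
# leaving the period-to-period changes.
diff1 = np.diff(series, n=1)
print(diff1)

# AR(p): with p=2, each forecast would be a linear combination of the
# two most recent (differenced) observations plus an error term.
# MA(q): with q=1, the forecast would also use the most recent forecast error.
```

If one round of differencing is not enough to make the series stationary, `np.diff(series, n=2)` applies a second order of differencing (d=2).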

## Auto ARIMA:

So far, we have been manually fitting different models and deciding which one is best. Auto-ARIMA automates that process: it takes the data, fits many models with different orders, and compares their characteristics. However, processing time increases substantially when we try to fit complex models.
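To make the "fit many models, keep the best" loop concrete, here is a heavily simplified sketch. It is not the IMSL routine: it searches only over the AR order p of a plain AR(p) model fitted by ordinary least squares, and scores each candidate with a Gaussian AIC. A real auto-ARIMA search would also vary d and q (and seasonal orders):

```python
import numpy as np

def fit_ar(series, p):
    """Fit an AR(p) model by ordinary least squares; return (params, aic)."""
    y = series[p:]
    # Design matrix: an intercept column plus the p lagged values.
    lags = np.column_stack(
        [series[p - i:len(series) - i] for i in range(1, p + 1)]
    )
    X = np.column_stack([np.ones(len(y)), lags])
    params, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ params
    n = len(y)
    sigma2 = np.mean(resid ** 2)
    # Gaussian log-likelihood of the residuals.
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = p + 2  # AR coefficients + intercept + error variance
    return params, 2 * k - 2 * loglik

# Simulate an AR(2) process so the search has a known structure to recover.
rng = np.random.default_rng(0)
e = rng.normal(size=500)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + e[t]

# Auto-ARIMA-style loop: try each candidate order, keep the lowest AIC.
best_p, best_aic = min(
    ((p, fit_ar(y, p)[1]) for p in range(1, 6)), key=lambda t: t[1]
)
print(best_p, round(best_aic, 1))
```

The cost warning in the text shows up directly here: each extra candidate order means another full model fit, so widening the search grid (or adding d and q dimensions) multiplies the work.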

## How does it work:

In time series analysis, a well-fitting model should yield the highest log-likelihood and the lowest AIC among the candidates.

The AIC and BIC of a model both depend on the log-likelihood: each criterion rewards a high log-likelihood but adds a penalty for the number of estimated parameters, so a lower criterion value indicates a better trade-off between fit and complexity. Auto-ARIMA therefore cycles through models with varying orders and specifications and returns the one with the lowest criterion value.
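The two criteria have standard closed forms, AIC = 2k − 2·ln(L̂) and BIC = k·ln(n) − 2·ln(L̂), where L̂ is the maximized likelihood, k the number of estimated parameters, and n the sample size. This sketch computes both from hypothetical numbers for two candidate models:

```python
import math

def aic(loglik, k):
    """Akaike Information Criterion: each parameter costs a flat penalty of 2."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian Information Criterion: the per-parameter penalty grows with n."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits on the same 100 observations:
# model B has a higher log-likelihood but needs twice the parameters.
aic_a, aic_b = aic(-120.0, 3), aic(-118.5, 6)
print(aic_a, aic_b)  # 246.0 vs 249.0 -> the simpler model A wins on AIC
```

Because both formulas subtract 2·ln(L̂), a higher log-likelihood always lowers the score, while the k terms push back against overfitting; this is exactly the trade-off auto-ARIMA's search loop is ranking.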

## Pros:

• It saves an enormous amount of time.
• It eliminates the need for deep knowledge of the statistics and theory behind model selection.
• It reduces the risk of human error and the mistakes caused by incorrect interpretation of results.

## Cons:

• We blindly put our faith in a single criterion.
• We never see how well the other candidate models perform.
• We lose the topic expertise that manual model selection builds.

Check out my Time-Series-101, which covers all the major concepts in Time Series Analysis. Keep Exploring!

