Published in featurepreneur

What is Auto-ARIMA?

Auto_arima, a routine from IMSL, automates the configuration of the autoregressive integrated moving average (ARIMA) model. Before getting into it, we need a fair understanding of ARIMA models.

What is ARIMA:

An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time-series data to better understand the data set or predict future trends. A statistical model is autoregressive if it predicts future values based on past values. For example, an ARIMA model might seek to predict a stock’s future prices based on its past performance or forecast a company’s earnings based on past periods.

We can split the term ARIMA into three parts: AR, I, and MA.

AR(p) stands for the autoregressive model. The p parameter is an integer that specifies how many lagged values of the series are used to forecast the periods ahead.

I(d) is the differencing part. The d parameter specifies how many times the series is differenced to make it stationary.

MA(q) stands for the moving average model. The q parameter is the number of lagged forecast error terms in the prediction equation.

SARIMA is a seasonal ARIMA, used for time series with seasonality.
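To make the roles of p, d, and q concrete, here is a minimal sketch in plain Python. The helper names `difference` and `arma_one_step` are hypothetical and written only for illustration, and the coefficients in the example are made up rather than estimated from data.

```python
def difference(series, d=1):
    """Apply first differencing d times (the I(d) part)."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def arma_one_step(values, errors, ar_coefs, ma_coefs, const=0.0):
    """One-step-ahead forecast on an already differenced series.

    ar_coefs[0] multiplies the most recent value, ar_coefs[1] the one
    before it, and so on (p = len(ar_coefs)). ma_coefs works the same
    way on past forecast errors (q = len(ma_coefs)).
    """
    ar_part = sum(phi * values[-i] for i, phi in enumerate(ar_coefs, start=1))
    ma_part = sum(theta * errors[-i] for i, theta in enumerate(ma_coefs, start=1))
    return const + ar_part + ma_part


# A quadratic trend becomes constant after differencing twice (d = 2).
trend = [1, 4, 9, 16, 25]
print(difference(trend, d=1))  # [3, 5, 7, 9]
print(difference(trend, d=2))  # [2, 2, 2]

# p = 2 lagged values and q = 1 lagged error, with made-up coefficients.
forecast = arma_one_step(values=[1.0, 2.0], errors=[0.5],
                         ar_coefs=[0.5, 0.25], ma_coefs=[0.4])
print(forecast)
```

In a real model the coefficients are estimated from the data; the sketch only shows which past values and past errors each parameter brings into the forecast.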

Auto ARIMA:

So far, we have been manually fitting different models and deciding which one is best. Auto ARIMA automates that process: it takes the data, fits many models of different orders, and then compares their characteristics. However, the processing time increases substantially when we try to fit complex models.
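A rough way to see why the processing time grows is to count the candidate models. The sketch below assumes an exhaustive grid over the orders, which is only one possible search strategy (an implementation may prune or step through the grid instead), and `candidate_orders` is a hypothetical helper.

```python
from itertools import product


def candidate_orders(max_p, max_d, max_q):
    """All (p, d, q) combinations up to the given maxima, inclusive."""
    return list(product(range(max_p + 1), range(max_d + 1), range(max_q + 1)))


non_seasonal = candidate_orders(max_p=2, max_d=1, max_q=2)

# A seasonal model adds a second (P, D, Q) triple, so the two grids multiply.
seasonal = list(product(non_seasonal, candidate_orders(2, 1, 2)))

print(len(non_seasonal), len(seasonal))  # 18 324
```

Even with small maxima, adding seasonal terms turns 18 candidates into 324, and each candidate is a separate model fit.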

How does it work:

In time series analysis, a good model should yield a high log-likelihood and a low AIC.

Both AIC and BIC depend on the log-likelihood; to be more precise, they use it in their formulas, together with a penalty for the number of parameters. A low AIC or BIC therefore indirectly implies a high log-likelihood. So this method cycles through models with varying orders and specifications and returns the one with the lowest value of the chosen criterion.
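The selection loop can be sketched in plain Python. This is a simplified illustration, not how any particular auto ARIMA routine is implemented: it assumes pure AR(p) candidates fitted by least squares with Gaussian errors, because MA terms need iterative likelihood maximisation. All function names are hypothetical, and the series is simulated.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ar(series, p, start):
    """Least-squares AR(p) with intercept, fitted on series[start:]."""
    rows = [[1.0] + [series[t - i] for i in range(1, p + 1)]
            for t in range(start, len(series))]
    targets = series[start:]
    k = p + 1  # intercept plus p lag coefficients
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    coefs = solve_linear(xtx, xty)
    residuals = [y - sum(c * v for c, v in zip(coefs, r))
                 for r, y in zip(rows, targets)]
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    # Gaussian log-likelihood evaluated at the fitted coefficients.
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    n_params = k + 1  # the error variance is estimated too
    return {"p": p, "log_lik": log_lik,
            "aic": 2 * n_params - 2 * log_lik,
            "bic": n_params * math.log(n) - 2 * log_lik}


def search_ar_order(series, max_p=4, criterion="aic"):
    """Fit every order up to max_p on the same sample; best model first."""
    results = [fit_ar(series, p, start=max_p) for p in range(max_p + 1)]
    return sorted(results, key=lambda r: r[criterion])


# Simulated AR(2) data, so we know what the search should roughly find.
rng = random.Random(0)
series = [0.0, 0.0]
for _ in range(500):
    series.append(0.6 * series[-1] - 0.3 * series[-2] + rng.gauss(0.0, 1.0))

ranking = search_ar_order(series, max_p=4)
for row in ranking:
    print(row["p"], round(row["log_lik"], 1), round(row["aic"], 1), round(row["bic"], 1))
```

Returning the full ranking rather than only the winner keeps the runner-up models visible. Every candidate is fitted on the same observations, since information criteria are only comparable across models fitted to the same data.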

Formula of Log-Likelihood
Formula of AIC
Formula of BIC
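
For reference, the three quantities have the following standard forms. The notation is an assumption, since the symbols are not defined in the text: L is the maximised likelihood, k the number of estimated parameters, n the number of observations, and the log-likelihood is written for Gaussian errors ε_t with variance σ².

```latex
\ln L = -\frac{n}{2}\ln\!\left(2\pi\sigma^{2}\right)
        - \frac{1}{2\sigma^{2}}\sum_{t=1}^{n}\varepsilon_{t}^{2}

\mathrm{AIC} = 2k - 2\ln L

\mathrm{BIC} = k\ln n - 2\ln L
```

Both criteria reward a higher log-likelihood and penalise extra parameters. The BIC penalty grows with the sample size, so it tends to favour smaller models than AIC.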

Pros:

  • It saves an enormous amount of time.
  • It eliminates the need to understand the statistics and theory behind model selection.
  • It reduces the risk of human error and of mistakes caused by misinterpreting the results.

Cons:

  • We blindly put our faith in a single criterion.
  • We never really see how well the other models perform.
  • Topic expertise is left out: the automated search cannot use domain knowledge about the data.

Check out my Time-Series-101, which covers all the major concepts in Time Series Analysis. Keep Exploring!



Eswara Prasad

