Predicting Annual Water Usage in Baltimore using ARIMA

Time Series Forecasting Walkthrough

Data Analysis

Augmented Dickey-Fuller Statistic

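Before modeling, we need to make sure the time series is stationary, meaning its mean and variance do not change over time. One way to check for stationarity is to apply an Augmented Dickey-Fuller (ADF) test. This is a hypothesis test where the null hypothesis states that the time series is not stationary (it has a unit root) and the alternative hypothesis states that it is stationary. A small p-value, commonly below 0.05, lets us reject the null hypothesis and treat the series as stationary.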

Differencing

Differencing is when we subtract the previous value from each value in the series, leaving the change from one time step to the next. This is a useful technique because it removes much of the trend, and differencing at the seasonal lag can also remove seasonality.
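
To make the arithmetic concrete, here is a minimal sketch of differencing in plain Python. The function name `difference` and the numbers are made up for illustration; in practice pandas offers the same operation as `Series.diff()`.

```python
def difference(values, lag=1):
    """Return values[i] - values[i - lag] for every i >= lag."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# Made-up numbers with an accelerating upward trend (not real water usage).
usage = [10, 12, 15, 19, 24]

first_diff = difference(usage)        # change between consecutive values
second_diff = difference(first_diff)  # differencing a second time

print(first_diff)   # [2, 3, 4, 5]
print(second_diff)  # [1, 1, 1]
```

The original values trend upward, the first difference still trends, and the second difference is flat. The number of differencing passes needed to reach a stable series is what the d parameter of ARIMA counts.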

ARIMA Model

Now that the data analysis is done, we can start modeling. One of the most popular models for time series is ARIMA, an acronym for AutoRegressive Integrated Moving Average. Running the model is simple. The main parameters in ARIMA are p, d, and q. The p is the number of lag observations included in the model (the autoregressive order). The d is the number of times that the data is differenced. The q is the size of the moving average window, that is, the number of lagged forecast errors the model uses.
