Stock Price Forecasting
Part 1— Creating a Time Series Model and Analysing it
Stock price behaviour is a classic example of a stochastic process. Although it looks largely random, with careful modelling (and a bit of luck) you may be able to predict the price of a stock with reasonable accuracy; that's basically how markets function anyway.
However, before we delve into forecasting, we first need to understand how the price of a given stock behaves in reality and identify as many of the factors it depends on as we can.
Here is an example of a stock price time series over a span of ~2 years:
Definitely not an easy plot to fit a curve to. Let's see how we can make it easier.
Any time series comprises the following components:
- Trend: the systematic component which increases or decreases over time.
- Seasonality: the systematic component which repeats over time.
- Noise: the non-systematic component in the data.
We can model any time series as either an additive or a multiplicative combination of these components.
An additive model → y(t) = Trend + Seasonality + Noise
A multiplicative model → y(t) = Trend * Seasonality * Noise
Noise is present in every real-life time series, but the other two components are optional; let's look at how to decompose a time series to check whether they exist.
But how do we decide between an additive and a multiplicative model?
- It's better to opt for the multiplicative model when the seasonal pattern increases or decreases as the data values increase or decrease.
- An additive model is better when the seasonal pattern stays more or less constant as the data values increase or decrease.
- However, sometimes the pattern is not that obvious; in that case we try both and choose the more accurate one.
Here is a code snippet I used to decompose the time series into its components:
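A minimal sketch of what such a decomposition can look like with statsmodels' seasonal_decompose (the Series name close, the hypothetical CSV file and the weekly period of 5 trading days are assumptions for illustration, not details from the original analysis):

```python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical input: daily closing prices indexed by date.
close = pd.read_csv('prices.csv', index_col='Date', parse_dates=True)['Close']

# Multiplicative decomposition with a weekly period of 5 trading days
# (the period is an assumption, not a detail from the original analysis).
decomposition = seasonal_decompose(close, model='multiplicative', period=5)

# Plot the observed series alongside its trend, seasonal and residual parts.
fig = decomposition.plot()
fig.set_size_inches(10, 8)
plt.show()
```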
We opt for multiplicative decomposition as the seasonality of the series in the above diagram increases over time.
As we can clearly see, the series is heavily influenced by an upward trend. Other than that, there is no clear seasonality, and the residual (which represents the noise) seems to vary a bit more as time goes on.
Stationarity Analysis:
A time series is said to be stationary if its statistical measures, such as the mean and variance, remain constant over time. Forecasting is easier on a stationary series because its constant statistical properties let you attach a higher probability to an expected value.
Below are the results of some of the classic tests for stationarity detection:
- Rolling mean:
Again, this plot suggests an upward trend, although the variation seems to stay within a fairly constant range.
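A sketch of how such rolling statistics can be computed and plotted, reusing the close Series assumed in the decomposition sketch above (the 30-day window is also an assumption):

```python
import matplotlib.pyplot as plt

# Rolling mean and standard deviation over a 30-day window.
rolling_mean = close.rolling(window=30).mean()
rolling_std = close.rolling(window=30).std()

plt.figure(figsize=(10, 5))
plt.plot(close, label='Closing price')
plt.plot(rolling_mean, label='30-day rolling mean')
plt.plot(rolling_std, label='30-day rolling std')
plt.legend()
plt.show()
```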
- ADF test results:
The Augmented Dickey-Fuller (ADF) test checks for the presence of a unit root in a series. A unit root implies that while the mean of the series may be constant, its variance keeps changing over time without any fixed pattern; hence we can't fit the curve to any simple equation, or in other words we have a series that is quite difficult to forecast.
- H0 (null hypothesis): the time series has a unit root.
- H1 (alternative hypothesis): the time series is stationary.
As the test statistic is less than the critical value at all confidence levels, we reject the null hypothesis that the series has a unit root.
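A sketch of running the ADF test with statsmodels, again on the assumed close Series:

```python
from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller test.
# H0: the series has a unit root (is non-stationary).
adf_stat, p_value, used_lags, n_obs, critical_values, _ = adfuller(close, autolag='AIC')

print(f'ADF statistic: {adf_stat:.4f}')
print(f'p-value:       {p_value:.4f}')
for level, value in critical_values.items():
    print(f'Critical value ({level}): {value:.4f}')
```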
- KPSS test results:
The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test checks whether a series is stationary around a deterministic level or trend. Note that its hypotheses are framed in the opposite direction to those of the ADF test:
- H0 (null hypothesis): the data is stationary.
- H1 (alternative hypothesis): the data is not stationary.
As the test statistic is greater than the critical value at all confidence levels, we reject the null hypothesis that the series is stationary.
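And a corresponding sketch for the KPSS test:

```python
from statsmodels.tsa.stattools import kpss

# KPSS test for level stationarity.
# H0: the series is stationary.
kpss_stat, p_value, n_lags, critical_values = kpss(close, regression='c', nlags='auto')

print(f'KPSS statistic: {kpss_stat:.4f}')
print(f'p-value:        {p_value:.4f}')
for level, value in critical_values.items():
    print(f'Critical value ({level}): {value:.4f}')
```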
Final Inference:
- ADF test: reject H0, and thus the series is stationary.
- KPSS test: reject H0, and thus the series is not stationary.
- As KPSS suggests the series is not stationary while ADF suggests the opposite, this combination typically indicates a difference-stationary series, so we can apply differencing to make it stationary.
Making a time series stationary by applying differencing:
Here are the results after applying different differencing strategies to our time series data:
First difference:
Differencing by weekly seasonality:
Log seasonal difference:
As there is not much difference between the seasonal and the log-seasonal difference plots, we shall use the weekly seasonal differencing for forecasting.
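For reference, here is a sketch of how these three differencing strategies can be computed on the assumed close Series (treating a trading week as 5 days):

```python
import numpy as np

# First difference: day-over-day change.
first_diff = close.diff().dropna()

# Weekly seasonal difference: change versus 5 trading days earlier.
weekly_diff = close.diff(5).dropna()

# Log seasonal difference: weekly difference of the log-transformed prices.
log_weekly_diff = np.log(close).diff(5).dropna()
```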
For the forecasting itself, stay tuned for part 2 of this tutorial!