# A gentle start in this series: time series and its analysis

## Articles in this series
1. The first one
2. A gentle start in this series
3. Approximating a time series

This article and the attached notebook are a gentle start to a series covering time-series prerequisites, from the introduction, imports, visualization, decomposition, and random walks to a fair level of modeling using AR/MA/ARMA/ARIMA/VAR. The aim is to brush up on time-series concepts before we get into some topics I really want to cover.

For me, a time series is simply "data whose structure carries additional, relevant information about time."

To give a more precise definition, a time series is a series of data having a:
1. Start,
2. End, and
3. Frequency
of the data points themselves.

Given this structure, we get a time series, and the best way to understand a time series is to first read it.

# But before we read our time series, let's first list some very basic libraries for time series work

## NumPy

Numerical Python is a library used for scientific computing.

## Pandas

This library provides highly efficient and easy-to-use data structures such as Series and DataFrames.

## SciPy

Scientific Python is a library used for scientific and technical computing.

## Scikit Learn

This library is a SciPy toolkit widely used for statistical modeling and machine learning, as it contains various customizable regression, classification, and clustering models.

## Statsmodels

Like Scikit Learn, this library is used for statistical data exploration and statistical modeling.

## Matplotlib

This library is used for data visualization in various formats such as line plots, bar graphs, heat maps, scatter plots, histograms, etc.

## Datetime

This standard-library module, together with the calendar module, provides all the necessary functionality for reading, formatting, and manipulating dates and times.
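To make the list concrete, here are the conventional import aliases plus a minimal series built with them (the dates and values below are just a made-up example):

```python
import numpy as np
import pandas as pd
from datetime import datetime
# SciPy, Scikit-Learn, Statsmodels, and Matplotlib are imported the same
# way, e.g. `import scipy`, `import statsmodels.api as sm`.

# A tiny time series with an explicit start, end, and frequency.
idx = pd.date_range(start="2024-01-01", end="2024-01-10", freq="D")
ts = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# The datetime module gives us plain point-in-time objects.
some_day = datetime(2024, 1, 5)
```

Note how start, end, and frequency fully determine the index — exactly the three ingredients from the definition above.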

# Let’s talk a little about timestamps and periods.

Recall that a time series has a start, an end, and a frequency, and these three together define its timestamps. A timestamp represents a distinguishable point in time. A period is the distance, or interval, between two timestamps.
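In pandas terms, the distinction looks like this (a small illustrative sketch, dates chosen arbitrarily):

```python
import pandas as pd

# A Timestamp is a single point in time; a Period is an interval.
stamp = pd.Timestamp("2024-03-15 09:30")
month = pd.Period("2024-03", freq="M")   # the whole month of March 2024

# A period has a start and an end, and it can contain a timestamp.
inside = month.start_time <= stamp <= month.end_time
```

`month.start_time` is midnight on March 1st and `month.end_time` is the last instant of March 31st, so `inside` is True here.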

## The notebook introduces basic statistics for time series

- percentage change
- absolute change in successive rows
- comparing two time series
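A quick sketch of those three statistics with pandas, using two made-up series:

```python
import pandas as pd

a = pd.Series([100.0, 110.0, 99.0, 121.0])
b = pd.Series([100.0, 105.0, 102.0, 130.0])

pct = a.pct_change()   # percentage change between successive rows
chg = a.diff()         # absolute change between successive rows
corr = a.corr(b)       # one simple way to compare two series
```

The first element of `pct` and `chg` is NaN, since there is no previous row to compare against.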

## Windows

A window is a fixed interval of time. Window functions are used to identify sub-intervals of a selected interval size. Using prior time steps to predict the next time step is called the sliding window method. Variations of windows:

1. Rolling — Same size and sliding
2. Expanding — Contains all prior values

In simple words, a window works like this: we have pt1, pt2, pt3, pt4, pt5, …. Say we choose a window of 3 points. We select the first three points (pt1, pt2, pt3) and take their mean as the new time-series point n_pt1. We then move the window by one, select the next three points (pt2, pt3, pt4), and take their mean as n_pt2, and so on, to get n_pt1, n_pt2, n_pt3, ….
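The same idea in pandas, using the toy points 1 through 5:

```python
import pandas as pd

pts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Rolling: same-size window that slides along the series.
# mean(1,2,3) = 2.0, mean(2,3,4) = 3.0, mean(3,4,5) = 4.0
rolling_mean = pts.rolling(window=3).mean()

# Expanding: the window contains all prior values.
# 1.0, 1.5, 2.0, 2.5, 3.0
expanding_mean = pts.expanding().mean()
```

The first two rolling values are NaN because fewer than 3 points have been seen at those positions.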

# Let’s introduce some charts!

## Open-high-low-close Charts

Time series is in fact quite a common type of data in the sales and manufacturing industries, but the available ts-modelling tutorials and widespread buzz are in the financial industry. In that spirit, open-high-low-close charts (or OHLC charts) are used as a trading tool to visualize and analyse price changes over time for securities, currencies, stocks, bonds, commodities, etc.

## Candlestick Charts

Another chart specific to analyzing price movements, more or less like the box plot, is the candlestick chart.

1. Each symbol represents the compressed trading activity for a single time period (a minute, hour, day, month, etc).
2. Each Candlestick symbol is plotted along a time scale on the x-axis, to show the trading activity over time.
3. The main rectangle in the symbol is known as the real body, which is used to display the range between the open and close price of that time period, while the lines extending from the bottom and top of the real body are known as the lower and upper shadows (or wicks).
4. When the market is bullish (the closing price is higher than it opened), the body is typically colored white or green. When the market is bearish (the closing price is lower than it opened), the body is usually colored black or red.

I hope now you can relate to all those financial-analysis black screens with green and red boxes. Please bear in mind that candlestick charts don’t express the events taking place between the open and close price, only the relationship between the two prices.
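With pandas you can compress raw prices into the OHLC bars these charts draw; here is a sketch on synthetic 6-hourly prices (the data is random, only the mechanics matter):

```python
import numpy as np
import pandas as pd

# A hypothetical price every 6 hours across 4 days.
idx = pd.date_range("2024-01-01", periods=16, freq="6h")
rng = np.random.default_rng(0)
price = pd.Series(100 + rng.standard_normal(16).cumsum(), index=idx)

# Compress each day's trading activity into open/high/low/close.
ohlc = price.resample("1D").ohlc()
```

Each row of `ohlc` is one candlestick: the real body spans open to close, and the wicks reach out to high and low.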

# Let’s look a little more into correlation

Correlation is a statistical technique that can show whether and how strongly pairs of variables are related.

Auto-correlation, also known as serial correlation, is the correlation of a time-series with a delayed copy of itself.

Partial auto-correlation — the partial auto-correlation function can be interpreted as a regression of the series against its past lags, with the influence of the intermediate lags removed.
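To see auto-correlation in action, here is a small simulation (the series and its 0.9 persistence parameter are made up for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n, phi = 500, 0.9

# A persistent series: each value carries over 90% of the previous one.
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

s = pd.Series(x)
lag1 = s.autocorr(lag=1)   # high: the series remembers its recent past
```

For a series with no memory (pure noise), `autocorr(lag=1)` would sit near zero instead.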

A time series has 4 components, as given below:

1. Level − the mean value around which the series varies.
2. Trend − the increasing or decreasing behavior of a variable with time.
3. Seasonality − the cyclic behavior of a time series, i.e., a clear periodic pattern (like a sine function).
4. Noise − the error in the observations added due to environmental factors, e.g., outliers or missing values.

# A little more on Seasonality…

Though we said seasonality is the cyclic behavior of a time series, i.e., a clear periodic pattern (like a sine function), time series come in two kinds with respect to it. Decomposing a time series means separating it into its constituent components, which are usually a trend component and an irregular component and, if it is a seasonal time series, a seasonal component.

## Non-Seasonal Data

A non-seasonal time series consists of a trend component and an irregular component. It has a pattern but it is not seasonal.

## Seasonal Data

A seasonal time series consists of a trend component, a seasonal component and an irregular component. It has a seasonal pattern, like sales of umbrellas spiking in the rainy season.
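A classical additive decomposition can be sketched by hand with pandas (the monthly series below is synthetic; statsmodels' seasonal_decompose automates the same idea):

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series: trend + yearly seasonality + noise.
n, period = 120, 12
rng = np.random.default_rng(1)
t = np.arange(n)
y = pd.Series(0.5 * t
              + 10 * np.sin(2 * np.pi * t / period)
              + rng.standard_normal(n))

# Trend: centered moving average over one full season.
trend = y.rolling(window=period, center=True).mean()

# Seasonal: average detrended value at each position in the cycle.
detrended = y - trend
seasonal = detrended.groupby(t % period).transform("mean")

# Irregular (noise): whatever is left over.
irregular = y - trend - seasonal
```

The seasonal estimate recovers the sine wave of amplitude ~10, and the irregular part is left with roughly unit-variance noise.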

# A little more on Noise…

Though we said noise is the error in the observations added due to environmental factors, e.g., outliers or missing values, it is characterized as white noise and is a bit different from the outliers or missing values found in traditional datasets.

## White noise has…

- Constant mean
- Constant variance
- Zero auto-correlation at all lags
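All three properties are easy to check on simulated white noise:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
wn = pd.Series(rng.standard_normal(1000))

mean = wn.mean()             # close to 0, at every point in time
var = wn.var()               # close to 1, not changing over time
lag1 = wn.autocorr(lag=1)    # close to 0: no memory at all
```

Compare this with the persistent series from the auto-correlation section, whose lag-1 correlation was near 0.9.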

# Random walk….

A random walk… Though not directly related, have you read A Random Walk Down Wall Street? It is a beautiful read for every naive person trying to predict. I call it so because I was the most naive, and the book taught me a thing or two which I hold very important and which formed the basis of my master’s thesis.

Some random walk…

A random walk assumes that in each period the variable takes a random step away from its previous value, and the steps are independently and identically distributed in size.

A random walk is a mathematical object, known as a stochastic or random process, that describes a path that consists of a succession of random steps on some mathematical space such as the integers.

In general, if we talk about stocks: today’s price = yesterday’s price + noise.

Pt = Pt-1 + εt

Random walks can’t be forecasted because, well, noise is random.
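Simulating a random walk is a one-liner with NumPy's cumulative sum (the starting price of 100 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)

# P_t = P_{t-1} + eps_t, starting from an initial price of 100.
prices = 100 + np.cumsum(noise)

# The best forecast of tomorrow's price is simply today's price.
forecast = prices[-1]
```

That last line is the whole punchline: since the next step is pure noise, nothing beats "tomorrow looks like today".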

Random walk with drift (a constant drift μ added on top of the zero-mean noise):

Pt − Pt-1 = μ + εt

Regression test for a random walk:

Pt = α + βPt-1 + εt, equivalent to Pt − Pt-1 = α + (β − 1)Pt-1 + εt

Test:

H0: β = 1 (this is a random walk)
H1: β < 1 (this is not a random walk)

Dickey-Fuller test (stated on the differenced regression, so here β denotes the coefficient on Pt-1 in that form):

H0: β = 0 (this is a random walk)
H1: β < 0 (this is not a random walk)

Augmented Dickey-Fuller test: an augmented Dickey–Fuller (ADF) test tests the null hypothesis that a unit root is present in a time series sample. It is basically the Dickey-Fuller test with more lagged changes on the RHS.
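Rather than calling a library, here is the bare regression the test is built on, estimated with plain OLS (statsmodels' adfuller wraps a refined version of this idea; the two series and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

def df_slope(p):
    """OLS slope beta in: p_t - p_{t-1} = alpha + beta * p_{t-1} + eps_t."""
    dp = np.diff(p)
    X = np.column_stack([np.ones(len(dp)), p[:-1]])
    coef, *_ = np.linalg.lstsq(X, dp, rcond=None)
    return coef[1]

walk = np.cumsum(rng.standard_normal(1000))   # unit root present
flat = rng.standard_normal(1000)              # clearly stationary

# beta near 0: cannot reject the random walk; strongly negative: stationary.
beta_walk, beta_flat = df_slope(walk), df_slope(flat)
```

In practice you would use the proper ADF critical values rather than eyeballing the slope, but the sign pattern is exactly this.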

# A little on stationary time series…

A stationary time series is one whose statistical properties such as mean, variance, auto-correlation, etc. are all constant over time.

1. Strong stationarity: a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time.
2. Weak stationarity: a process whose mean, variance, and auto-correlation are constant throughout time.

Stationarity is important because non-stationary series that depend on time have too many parameters to account for when modelling. The diff() method can easily convert many non-stationary series to stationary ones.
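A quick demonstration on a simulated random walk, where a single diff() recovers the stationary steps:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
walk = pd.Series(np.cumsum(rng.standard_normal(500)))  # non-stationary

# First difference: recovers the i.i.d. steps, which are stationary.
returns = walk.diff().dropna()
```

The differenced series has a constant mean near 0 and variance near 1, while the walk itself wanders far from any fixed level.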

We will try to decompose the seasonal component of the decomposed time series above.

This article and the attached notebook covered and gave a gentle start to the series, taking us through time-series prerequisites from the introduction, imports, visualization, and decomposition to random walks.

Now it’s time for a fair intermediate-to-advanced level of modeling using AR/MA/ARMA/ARIMA/VAR.

# AR models

A statistical model is auto-regressive if it predicts future values based on past values. For example, an auto-regressive model might seek to predict a stock’s future prices based on its past performance.

The auto-regressive (AR) model is arguably the most widely used time series model. It shares the very familiar interpretation of a simple linear regression, but here each observation is regressed on the previous observation. The AR model also includes the white noise (WN) and random walk (RW) models examined earlier as special cases.

An auto-regressive (AR) model is a representation of a type of random process; as such, it is used to describe certain time-varying processes in nature, economics, etc. The auto-regressive model specifies that the output variable depends linearly on its own previous values and on a stochastic term (an imperfectly predictable term); thus the model is in the form of a stochastic difference equation.

1. An AR(1) auto-regressive process is one in which the current value is based on the immediately preceding value,
2. while an AR(2) process is one in which the current value is based on the previous two values.
3. An AR(0) process is used for white noise and has no dependence between the terms.

AR(1) model
Rt = μ + ϕRt-1 + εt

As the RHS has only one lagged value (Rt-1), this is an AR model of order 1, where μ is the mean and εt is the noise at time t.
If ϕ = 1, it is a random walk. If ϕ = 0, it is white noise. If -1 < ϕ < 1, it is stationary. If ϕ is negative, there is mean reversion; if ϕ is positive, there is momentum.

AR(2) model
Rt = μ + ϕ1Rt-1 + ϕ2Rt-2 + εt

AR(3) model
Rt = μ + ϕ1Rt-1 + ϕ2Rt-2 + ϕ3Rt-3 + εt
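Here is a sketch that simulates an AR(1) and recovers μ and ϕ by regressing Rt on Rt-1 with OLS (the "true" values below are made up; in practice statsmodels' AutoReg or ARIMA does the fitting for you):

```python
import numpy as np

rng = np.random.default_rng(11)
mu, phi, n = 2.0, 0.6, 2000

# Simulate R_t = mu + phi * R_{t-1} + eps_t.
r = np.zeros(n)
for t in range(1, n):
    r[t] = mu + phi * r[t - 1] + rng.standard_normal()

# Estimate mu and phi by regressing R_t on R_{t-1} (OLS).
X = np.column_stack([np.ones(n - 1), r[:-1]])
coef, *_ = np.linalg.lstsq(X, r[1:], rcond=None)
mu_hat, phi_hat = coef
```

Since 0 < ϕ < 1, the simulated series is stationary with momentum, and the estimates land close to the true μ = 2.0 and ϕ = 0.6.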

# Let’s look at some implementation of MA models

The moving-average (MA) model is a common approach for modeling uni-variate time series. The moving-average model specifies that the output variable depends linearly on the current and various past values of a stochastic (imperfectly predictable) term.

An MA (moving average) model is usually used to model a time series that shows short-term dependencies between successive observations. Intuitively, it makes good sense that an MA model can describe, say, the irregular component in the time series of ages at death of English kings: we might expect the age at death of a particular king to have some effect on the ages at death of the next king or two, but not much effect on the ages at death of kings that reign much later.


MA(1) model: Rt = μ + ϵt + θϵt-1

It translates to Today’s returns = mean + today’s noise + yesterday’s noise

As there is only one lagged noise term on the RHS, it is an MA model of order 1.
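A simulated MA(1) makes the "short memory" visible: the lag-1 auto-correlation lands near θ/(1 + θ²), and everything beyond lag 1 sits near zero (θ = 0.8 here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(21)
mu, theta, n = 0.0, 0.8, 5000
eps = rng.standard_normal(n + 1)

# R_t = mu + eps_t + theta * eps_{t-1}
r = mu + eps[1:] + theta * eps[:-1]

# Sample auto-correlations at lags 1 and 2.
ac1 = np.corrcoef(r[:-1], r[1:])[0, 1]   # near theta/(1+theta**2) ~ 0.49
ac2 = np.corrcoef(r[:-2], r[2:])[0, 1]   # near 0: memory stops at lag 1
```

This abrupt cutoff in the auto-correlation after lag q is the classic signature used to identify MA(q) models.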

Here’s the difference between AR and MA models:

Pure AR models — depend on the lagged values of the series you are modeling to make forecasts.

Pure MA models — depend on the errors (residuals) of the previous forecasts you made to make current forecasts.

Mixed ARMA models — take both of the above factors into account when making predictions.

# Let’s look at some implementation of ARMA models

The ARMA model is simply the merger of the AR(p) and MA(q) models.

Autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression and the second for the moving average. It’s the fusion of AR and MA models.

ARMA(1,1) model: Rt = μ + ϕRt-1 + ϵt + θϵt-1

Basically: today’s return = mean + ϕ × yesterday’s return + today’s noise + θ × yesterday’s noise.

The ARIMA model often shows much better results than plain AR and MA models.

# Let’s look at some implementation of ARIMA models

What Is an Auto-regressive Integrated Moving Average? An auto-regressive integrated moving average, or ARIMA, is a statistical analysis model that uses time series data to either better understand the data set or to predict future trends.

Understanding Auto-regressive Integrated Moving Average (ARIMA) An auto-regressive integrated moving average model is a form of regression analysis that gauges the strength of one dependent variable relative to other changing variables. The model’s goal is to predict future securities or financial market moves by examining the differences between values in the series instead of through actual values.

An ARIMA model can be understood by outlining each of its components as follows:

- Auto-regression (AR) refers to a model in which the changing variable regresses on its own lagged, or prior, values.
- Integrated (I) represents the differencing of raw observations to allow the time series to become stationary, i.e., data values are replaced by the differences between the data values and the previous values.
- Moving average (MA) incorporates the dependency between an observation and a residual error from a moving-average model applied to lagged observations.

Each component functions as a parameter with a standard notation: ARIMA(p, d, q), where integer values substitute for the parameters to indicate the type of ARIMA model used. The parameters can be defined as:

- p: the number of lag observations in the model; also known as the lag order.
- d: the number of times that the raw observations are differenced; also known as the degree of differencing.
- q: the size of the moving-average window; also known as the order of the moving average.

A value of 0 for any parameter means that component is not used, so an ARIMA model can be constructed to perform the function of an ARMA model, or even of simple AR, I, or MA models.

An autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both of these models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting). ARIMA models are applied in some cases where data show evidence of non-stationarity, where an initial differencing step (corresponding to the “integrated” part of the model) can be applied one or more times to eliminate the non-stationarity. The ARIMA model is of the form ARIMA(p, d, q): p is the AR order, d is the degree of differencing, and q is the MA order.

ARIMA(1,0,0): yt = a1yt-1 + ϵt

ARIMA(1,0,1): yt = a1yt-1 + ϵt + b1ϵt-1

ARIMA(1,1,1): Δyt = a1Δyt-1 + ϵt + b1ϵt-1, where Δyt = yt − yt-1
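The "integrated" part is just bookkeeping: an ARIMA(1,1,1) series is an ARMA(1,1) series that has been cumulatively summed, so differencing undoes it exactly (the coefficients below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
a1, b1, n = 0.5, 0.3, 1000
eps = rng.standard_normal(n + 1)

# The ARMA(1,1) part: Delta_y_t = a1*Delta_y_{t-1} + eps_t + b1*eps_{t-1}
dy = np.zeros(n)
for t in range(1, n):
    dy[t] = a1 * dy[t - 1] + eps[t + 1] + b1 * eps[t]

# "Integrated": undo the d = 1 differencing by cumulative summing.
y = np.cumsum(dy)

# Differencing y recovers the stationary ARMA series exactly.
recovered = np.diff(y)
```

This is why d = 1 in the notation: one difference maps the non-stationary y back onto a stationary ARMA(1,1) process.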

# Let’s look at some implementation of VAR models

VAR models (vector autoregressive models) are used for multivariate time series. The structure is that each variable is a linear function of past lags of itself and past lags of the other variables. As an example suppose that we measure three different time series variables, denoted by xt,1, xt,2, and xt,3.

The vector autoregressive model of order 1, denoted as VAR(1), is as follows:

xt,1=α1+ϕ11xt−1,1+ϕ12xt−1,2+ϕ13xt−1,3+wt,1

xt,2=α2+ϕ21xt−1,1+ϕ22xt−1,2+ϕ23xt−1,3+wt,2

xt,3=α3+ϕ31xt−1,1+ϕ32xt−1,2+ϕ33xt−1,3+wt,3

Each variable is a linear function of the lag 1 values for all variables in the set.
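The VAR(1) recursion above translates directly into a NumPy loop (the coefficient matrix Φ and the intercepts α are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)

alpha = np.array([0.1, 0.0, -0.1])       # intercepts, one per variable
Phi = np.array([[0.5, 0.1, 0.0],
                [0.0, 0.4, 0.2],
                [0.1, 0.0, 0.3]])        # hypothetical lag-1 coefficients

n = 500
x = np.zeros((n, 3))
for t in range(1, n):
    # x_t = alpha + Phi @ x_{t-1} + w_t: each variable is a linear
    # function of the lag-1 values of ALL variables.
    x[t] = alpha + Phi @ x[t - 1] + rng.standard_normal(3)

# The VAR(1) is stationary when every eigenvalue of Phi lies
# inside the unit circle.
stable = np.all(np.abs(np.linalg.eigvals(Phi)) < 1)
```

In practice statsmodels' VAR class estimates Φ and α from data; the loop here just shows what the fitted equations generate.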

# Credits

The series has references to several data sources, packages, research papers, blogs, books, vlogs, practical advice, industrial work, and personal experiences. I thank everyone in the field, and especially those whose work I was exposed to, for enabling me to put this together. All credits to the smart folks out there!

Also, this series is the first deliberate effort at establishing self-proven ground without expectations of appreciation on a career ladder. While everyone lives amid unknown struggle and the discovery of purpose, I personally think it is okay to sneak a moment of a smile and express gratitude for all the right or wrong life choices taken in varied circumstances, as you couldn’t have chosen better otherwise.

Written by

## Data Driven Investor

#### from confusion to clarity, not insanity
