Cryptocurrency (Litecoin) prediction using Time series algorithm
Time series algorithms are popular because they are useful for forecasting sales across industries, where accurate forecasts can save a lot of money. They are also used in finance, for example for stock market predictions, and in the cryptocurrency market.
Litecoin is one of the popular cryptocurrencies bought by investors these days. I will show you a step-by-step process to predict the High price of Litecoin from its previous prices using a time series algorithm, ARIMA.
ARIMA stands for Autoregressive Integrated Moving Average. Let’s dive into the steps.
Step 1: Obtain source data
One year of Litecoin market price data was obtained from www.coinmarketcap.com. The data was converted into CSV format and can be downloaded here.
Step 2: Python code
Note: The Python code below was run in a Jupyter Notebook. Please follow this link to download Python and Anaconda for accessing Jupyter Notebooks.
The commands below import the necessary libraries.
import warnings
import itertools
import numpy as np
import matplotlib.pyplot as plt
warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
import pandas as pd
import statsmodels.api as sm
import matplotlib
matplotlib.rcParams['axes.labelsize'] = 14
matplotlib.rcParams['xtick.labelsize'] = 12
matplotlib.rcParams['ytick.labelsize'] = 12
matplotlib.rcParams['text.color'] = 'k'
Import the litecoin.csv file from the location where you downloaded it:
df = pd.read_csv("litecoin.csv")
Convert the ‘Date’ column to datetime (YYYY-MM-DD format)
df['Date'] = df['Date'].astype('datetime64[ns]')
Data Preprocessing
Remove unwanted columns other than ‘High Price’ and ‘Date’
cols = ['Open', 'Low', 'Close', 'Volume', 'Market Cap']
df.drop(cols, axis=1, inplace=True)
df = df.sort_values('Date') # Sort by 'Date' column if it's not sorted already
df.isnull().sum() # check if there are any NULLs in the data
Set ‘Date’ field as the Index column for time series analysis
df = df.set_index('Date') #setting the Index to date
df.index
Data Visualization
df.plot(figsize=(15, 6))
plt.show()
Time series decomposition into trend, seasonality and noise
from pylab import rcParams
rcParams['figure.figsize'] = 18, 8
decomposition = sm.tsa.seasonal_decompose(df, model='additive')
fig = decomposition.plot()
plt.show()
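To make the idea of an additive decomposition more concrete, here is a minimal pure-Python sketch. It is a simplified illustration and not the exact algorithm used by seasonal_decompose: the trend is a centered moving average (an odd period is assumed), the seasonal part is the average detrended value at each position in the cycle, and whatever is left over is the noise. The function names and the sample series are made up for this example.

```python
def moving_average_trend(values, period):
    """Centered moving average; None where the window does not fit.

    Assumes an odd period so the window is symmetric around each point.
    """
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        trend[i] = sum(window) / len(window)
    return trend


def additive_decompose(values, period):
    """Split a series into trend + seasonal + residual (simplified)."""
    trend = moving_average_trend(values, period)

    # Group the detrended values by their position within the cycle
    buckets = [[] for _ in range(period)]
    for i, (value, t) in enumerate(zip(values, trend)):
        if t is not None:
            buckets[i % period].append(value - t)
    seasonal_means = [sum(b) / len(b) for b in buckets]

    # Centre the seasonal component so that it averages to zero
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - offset for i in range(len(values))]

    # The residual is what remains after removing trend and seasonality
    residual = [
        None if t is None else value - t - s
        for value, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Made-up series: a straight-line trend plus a repeating 3-step pattern
pattern = [1, 0, -1]
series = [i + pattern[i % 3] for i in range(12)]
trend, seasonal, residual = additive_decompose(series, period=3)
print(seasonal[:3])
```

On this toy series the seasonal component recovers the repeating pattern and the residual is zero, because the series contains no noise. Real price data will leave a non-trivial residual.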
Time series forecasting with ARIMA (Autoregressive Integrated Moving Average)
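ARIMA models are denoted with the notation ARIMA(p, d, q), where p is the autoregressive order, d is the degree of differencing, and q is the moving-average order. The seasonal variant used here adds a second set of parameters (P, D, Q, s), where s is the length of the seasonal cycle (12 in the code below):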
p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]
print('Examples of parameter combinations for Seasonal ARIMA...')
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[1]))
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[2]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[3]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[4]))
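As a small illustration of what the differencing parameter does, the sketch below applies differences to a list of numbers in plain Python. The helper name difference and the sample values are made up for this example; with lag=1 it corresponds to ordinary differencing (the d parameter), and with a larger lag it mimics seasonal differencing.

```python
def difference(values, lag=1, order=1):
    """Subtract the value `lag` steps earlier, repeated `order` times."""
    result = list(values)
    for _ in range(order):
        result = [result[i] - result[i - lag] for i in range(lag, len(result))]
    return result


prices = [10, 12, 15, 19, 24]  # made-up values, not Litecoin data

print(difference(prices, order=1))  # [2, 3, 4, 5]
print(difference(prices, order=2))  # [1, 1, 1]
print(difference([1, 2, 3, 4, 5, 6], lag=2))  # [2, 2, 2, 2]
```

Each round of differencing shortens the series and removes one level of trend, which is why a steadily rising series becomes flat after one or two passes.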
Parameter selection for ARIMA using grid search
In order to find the best parameters for the model, we use the grid search method.
for param in pdq:
    print(param)
    for param_seasonal in seasonal_pdq:
        try:
            mod = sm.tsa.statespace.SARIMAX(df,
                                            order=param,
                                            seasonal_order=param_seasonal,
                                            enforce_stationarity=False,
                                            enforce_invertibility=False)
            results = mod.fit()
            print('ARIMA{}x{}12 - AIC:{}'.format(param, param_seasonal, results.aic))
        except Exception:
            # Skip parameter combinations that fail to fit
            continue
The combination with the lowest AIC value should be selected as the model parameters. On executing the above code, the lowest AIC value appeared for the combination below:
ARIMA(1, 1, 1)x(1, 1, 1, 12)12 - AIC:1861.6476619963505
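Instead of reading the lowest AIC off the printed output, the results can be collected in a list and the minimum picked programmatically. The sketch below shows one way to do that. The helper best_by_aic is made up for this example, and apart from the (1, 1, 1)x(1, 1, 1, 12) entry the AIC values are placeholders rather than output from the grid search.

```python
import math


def best_by_aic(candidates):
    """Return the (order, seasonal_order, aic) tuple with the lowest AIC.

    Entries with a NaN or infinite AIC are ignored, since a failed or
    degenerate fit can produce such values.
    """
    valid = [c for c in candidates if math.isfinite(c[2])]
    if not valid:
        raise ValueError("no candidate has a finite AIC")
    return min(valid, key=lambda c: c[2])


# Placeholder AIC values for illustration only
candidates = [
    ((0, 0, 0), (0, 0, 0, 12), 2100.0),
    ((1, 1, 1), (1, 1, 1, 12), 1861.65),
    ((0, 1, 1), (0, 1, 1, 12), float("nan")),
]

order, seasonal_order, aic = best_by_aic(candidates)
print(order, seasonal_order, aic)
```

In the grid search loop, appending (param, param_seasonal, results.aic) to a list after each successful fit would produce input in this shape.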
Fitting ARIMA Model
The model is fit using the parameters obtained from the above step.
mod = sm.tsa.statespace.SARIMAX(df,
                                order=(1, 1, 1),
                                seasonal_order=(1, 1, 1, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()
print(results.summary().tables[1])
Run Model Diagnostics
This step is optional, but it helps reveal discrepancies in the model.
results.plot_diagnostics(figsize=(16, 8)) #always run model diagnostics to investigate any unusual behavior.
plt.show()
Model validation
The following commands validate the model by comparing its one-step-ahead predictions from July 2019 onward with the observed prices.
pred = results.get_prediction(start=pd.to_datetime('2019-07-01'), dynamic=False)
pred_ci = pred.conf_int()
ax = df['2018':].plot(label='observed')
pred.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.7, figsize=(14, 7))
ax.fill_between(pred_ci.index,
pred_ci.iloc[:, 0],
pred_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('High Price')
plt.legend()
plt.show()
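The plot gives a visual check; a single error number can complement it. One common choice is the root mean squared error (RMSE) between the observed and predicted values. The sketch below is a plain-Python version with made-up numbers; the function name rmse and the sample values are not part of the original notebook.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must not be empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration, not Litecoin prices
observed = [80.0, 82.0, 79.0]
forecast = [81.0, 80.0, 79.0]
print(round(rmse(observed, forecast), 3))
```

With the objects from this article, the inputs would be the observed High prices from 2019-07-01 onward and pred.predicted_mean, both converted to lists, assuming the remaining price column is named 'High'.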
Producing and visualizing forecasts
pred_uc = results.get_forecast(steps=15)
pred_ci = pred_uc.conf_int()
ax = df.plot(label='observed', figsize=(14, 7))
pred_uc.predicted_mean.plot(ax=ax, label='Forecast')
ax.fill_between(pred_ci.index,
pred_ci.iloc[:, 0],
pred_ci.iloc[:, 1], color='k', alpha=.25)
ax.set_xlabel('Date')
ax.set_ylabel('High Price')
plt.legend()
plt.show()
The forecast shows that the Litecoin price will remain around 80 in October 2019. The grey shaded area around the ‘Forecast’ line is wide, which means that the price might fluctuate in the future because of the variations observed in the past. Predicting the price of cryptocurrencies is generally difficult because it can fluctuate due to various factors such as investor sentiment, seasonality, and financial news.
By now, you should have a basic idea of how to apply ARIMA to time series data. You could try the same model for other cryptocurrencies or stocks.
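To get a feel for why the shaded band can be so wide, here is a rough sketch that uses a plain random walk in place of the fitted SARIMA model. Under that simplifying assumption, the forecast stays at the last observed value and the uncertainty grows with the square root of the number of steps ahead. The function name, the starting value of 80, and the daily standard deviation of 2 are all assumptions chosen for illustration.

```python
import math


def random_walk_band(last_value, sigma, steps, z=1.96):
    """Approximate 95% band for a random walk, one tuple per step ahead.

    sigma is the assumed standard deviation of a single step's change;
    z=1.96 corresponds to a 95% interval under a normal assumption.
    """
    band = []
    for h in range(1, steps + 1):
        half_width = z * sigma * math.sqrt(h)
        band.append((last_value - half_width, last_value + half_width))
    return band


# Hypothetical inputs: last price 80, daily standard deviation 2
for day, (low, high) in enumerate(random_walk_band(80.0, 2.0, steps=4), start=1):
    print(day, round(low, 2), round(high, 2))
```

The band widens with every step, so the more volatile the past prices and the further ahead the forecast, the wider the interval becomes.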
Happy coding! 😃