Naive Persistence Model: A Baseline Forecasting Technique in Quantitative Finance

Andreas
Coinmonks
4 min read · Oct 5, 2023

Implementing and Evaluating a Naive Persistence Model Using Bitcoin Prices.

Accurately predicting the prices of stocks or cryptocurrencies is a coveted skill. When dealing with time series forecasting, it is essential to have a baseline model that serves as a benchmark against which we can compare the performance of more sophisticated models.

In a previous article, we covered the random binary prediction model as a baseline for predicting the stock price of Dassault Systèmes. This model, rooted in random chance, helped us evaluate the efficacy of more advanced predictive models.

Today we will look at another fundamental baseline model: the naive persistence model. In finance and trading, this model predicts that the future price of an asset will be the same as its most recent price. Unlike the random binary prediction model, the naive persistence model does not rely on randomness, which is why our implementation omits np.random.seed(): no random numbers are generated.

This article will guide you step by step in implementing and evaluating a naive persistence model. We will focus on the price of Bitcoin.

Outline:

  • Part 1. Implementing the naive persistence model
  • Part 2. Understanding and evaluating the naive persistence model

Although we’ll use real market data, please keep in mind that this article is a tutorial, not financial advice, and aims to enhance our finance knowledge and Python skills.

Part 1. Implementing the naive persistence model

The naive persistence model is straightforward and does not account for market dynamics or historical trends. It assumes that future prices will mirror current prices: tomorrow’s Bitcoin price will be the same as today’s!

Where the random binary prediction model relies on random chance to predict price movements, the naive persistence model predicts… persistence in price.

import pandas as pd
import yfinance as yf
from sklearn.metrics import mean_squared_error
from math import sqrt

# Retrieving BTC prices
data = yf.download("BTC-USD", start="2017-01-01", end="2023-01-01")
btc_prices = data['Close'].values

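# Hold out the last 10 closing prices as the test set; everything before is the training set
train, test = btc_prices[0:-10], btc_prices[-10:]

# Naive persistence model: reuse the last 10 training prices as the forecasts for the test set
predictions = train[-10:]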

# Evaluating the model
rmse = sqrt(mean_squared_error(test, predictions))
print('RMSE: %.3f' % rmse)

Part 2. Understanding and evaluating the naive persistence model

We first fetch Bitcoin prices from Yahoo Finance. The data is then split into training and testing sets.

Here we have:

  • train that contains all the price data except for the last 10 closing prices.
  • test that contains those last 10 closing prices.
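To make the slicing concrete, here is a minimal sketch of the same split on a short list. The prices and the name n_test are made up for illustration (the article's script hard-codes 10 and uses real BTC data).

```python
# Made-up closing prices, for illustration only (not real BTC data)
prices = [100, 102, 101, 105, 107, 110, 108, 111, 115, 117, 120, 119]

n_test = 3  # hypothetical hold-out size; the article's script uses 10

train = prices[0:-n_test]  # everything except the last n_test values
test = prices[-n_test:]    # only the last n_test values

print(train)  # [100, 102, 101, 105, 107, 110, 108, 111, 115]
print(test)   # [117, 120, 119]
```

The two slices never overlap, and together they cover the whole series.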

The naive persistence model then “predicts” that the next 10 prices will be the same as the last 10 observed prices in our training set: the persistence assumption in action! In our code, the last 10 closing prices of train are taken as predictions.
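The script in Part 1 reuses a block of 10 training prices as its forecasts. A stricter reading of “tomorrow’s price equals today’s” is a one-step-ahead forecast, where each prediction is the price observed one step earlier. The sketch below contrasts the two on made-up prices. It is a variant shown for comparison, not the article's method, and the helper rmse and the name n_test are assumptions introduced here.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up closing prices, for illustration only (not real BTC data)
prices = [100, 102, 101, 105, 107, 110, 108, 111]
n_test = 3  # hypothetical hold-out size

train, test = prices[:-n_test], prices[-n_test:]

# Block version (as in the article's script): reuse the last n_test training prices
block_predictions = train[-n_test:]

# One-step-ahead version: each forecast is the price observed one step earlier,
# so later forecasts use test prices that would already be known by then
one_step_predictions = prices[-n_test - 1:-1]

print(block_predictions)     # [101, 105, 107]
print(one_step_predictions)  # [107, 110, 108]
print(rmse(test, block_predictions))
print(rmse(test, one_step_predictions))
```

On this toy series the one-step-ahead forecasts are closer to the actual values, simply because they use more recent information. The two variants answer different questions, so their RMSE values should not be mixed when benchmarking.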

Evaluating the naive persistence model

To recap, we have the following arrays:

  • test that contains the actual values (last 10 true prices).
  • predictions that contains the predicted values (last 10 prices from the training set).

We evaluate the model with the root mean squared error (RMSE), which is simply the square root of the mean squared error (MSE).

The MSE is the average of the squared errors, i.e., the average squared difference between the estimated values (the predicted prices) and the actual values (the true prices).
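The definition can be followed step by step in plain Python. This is a minimal sketch on made-up numbers, and it should give the same kind of result as sqrt(mean_squared_error(...)) in the script from Part 1.

```python
from math import sqrt

# Made-up values, for illustration only
actual = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 11.0]

errors = [a - p for a, p in zip(actual, predicted)]  # [-1.0, 0.0, 3.0]
squared_errors = [e ** 2 for e in errors]            # [1.0, 0.0, 9.0]

mse = sum(squared_errors) / len(squared_errors)  # mean of the squared errors
rmse = sqrt(mse)                                 # back to the original units

print(f"MSE: {mse:.3f}")    # MSE: 3.333
print(f"RMSE: {rmse:.3f}")  # RMSE: 1.826
```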

Diving into RMSE

The RMSE is a performance indicator. It estimates how accurate a model’s predictions are compared with the actual values. Computed for a baseline model, it gives the reference value that other models have to beat.

The lower the RMSE, the better the model’s predictions. A higher RMSE means that the predictions deviate more from the actual values. The RMSE is expressed in the same units as the target (here, USD), so its value depends on the scale of the prices being predicted.

Note that the RMSE gives higher weight to larger errors, which makes it useful when large errors are undesirable. When predicting the price of Bitcoin, we are particularly interested in minimizing large prediction errors because of their potentially significant impact on trading decisions.
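A small sketch can illustrate this weighting. It compares the RMSE with the mean absolute error (MAE), a metric not used in this article and introduced here only for contrast, on two made-up error sets with the same total absolute error.

```python
from math import sqrt


def rmse_from_errors(errors):
    """Square root of the mean of the squared errors."""
    return sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae_from_errors(errors):
    """Mean of the absolute errors (shown only for comparison)."""
    return sum(abs(e) for e in errors) / len(errors)


# Made-up prediction errors: same total absolute error (20), different spread
even_errors = [5, 5, 5, 5]    # four moderate misses
spiky_errors = [0, 0, 0, 20]  # three perfect forecasts and one large miss

print(mae_from_errors(even_errors), rmse_from_errors(even_errors))    # 5.0 5.0
print(mae_from_errors(spiky_errors), rmse_from_errors(spiky_errors))  # 5.0 10.0
```

The MAE is identical in both cases, while the RMSE doubles when the error is concentrated in a single large miss.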

Keep in mind that a “good” or “acceptable” RMSE depends on the context and the specific application. Sometimes a model with a relatively high RMSE might provide valuable insights.

Interpretation

We get an RMSE (Root Mean Squared Error) of 514.714. Because the RMSE is expressed in the same units as the target, this means that over the 10 test days the model’s predictions are typically about 515 USD away from the actual Bitcoin prices. Since errors are squared before averaging, this is not exactly the average absolute error: large misses weigh more.

RMSE: 514.714

We now have our benchmark metric. In the future, more sophisticated models should aim to have a lower RMSE to improve predictive accuracy.
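One way to use the benchmark is to score a candidate model on the same test set and report its RMSE reduction relative to the baseline. The sketch below assumes this workflow. The function improvement_over_baseline, the candidate predictions, and all numbers are hypothetical and do not come from any real model.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def improvement_over_baseline(actual, baseline_pred, candidate_pred):
    """Relative RMSE reduction versus the baseline (positive means better).

    Assumes the baseline RMSE is non-zero.
    """
    baseline_rmse = rmse(actual, baseline_pred)
    candidate_rmse = rmse(actual, candidate_pred)
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up values, for illustration only
actual = [100, 104, 103]
baseline_pred = [98, 100, 104]   # e.g. forecasts from the persistence baseline
candidate_pred = [99, 103, 104]  # forecasts from a hypothetical candidate model

gain = improvement_over_baseline(actual, baseline_pred, candidate_pred)
print(f"RMSE reduction vs baseline: {gain:.1%}")  # RMSE reduction vs baseline: 62.2%
```

A negative value would mean the candidate does worse than simply repeating past prices.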

Remember, the RMSE provides valuable insights into a model’s performance but decisions should be made holistically!

Moving Forward

To sum up, the naive persistence model can be used as an essential baseline for forecasting. The model assumes that future prices will mirror current ones and offers a starting point against which we can evaluate more complex models. Always refer back to it to ensure that newer models provide added predictive value!

Resources

“Mean Squared Error (MSE) | Lesson 9.1.5.” ProbabilityCourse.com, https://www.probabilitycourse.com/chapter9/9_1_5_mean_squared_error_MSE.php. Accessed 05/10/2023.

Andreas
Coinmonks

Data Scientist at a digital asset hedge fund. Formerly in M&A. Data science for in-depth research on asset allocations and strategies.