Applying Linear Regression on Bitcoin’s historical data
Hi folks! In this article I am going to share another learning experience on my path towards Artificial Intelligence: how I used a linear regression model to try to predict Bitcoin’s price based on its historical data.
Do you think it is possible? Let’s see in the article.
The general approach
First things first: there’s a process (a set of questions) that I use as a kind of “general approach” to Machine Learning problems. I learned it in Codecademy’s course “Build a Machine Learning Model With Python”. Here is the deal:
- What do we want to answer / accomplish? Predict Bitcoin’s price tomorrow based on its historical data.
- What data are relevant to help us answer this question? BTC historical data provided by Yahoo Finance, such as open price, close price, trading volume, etc. The plan is to use this data to try to predict what Bitcoin’s value will be the next day.
- What data cleaning and feature engineering can be done? Removal of empty values, feature normalization, maybe adding quadratic features.
- Which model best fits the problem? I’m going to use a Linear Regression model because it is my object of study.
- What is our success metric? Are we looking for accuracy? Precision? How much? I don’t know yet... 😎 I guess I will use the Mean Absolute Error to evaluate the model’s predictions, and anything below 100 bucks on average would be considered a success.
- Use the model and present the results. Ok, let’s code!
Project Repository
You can check out the project at https://github.com/marciojmo/stock-price-predictor.git
Getting Started
Okay, so the first thing I did was actually get the data from Yahoo Finance. I used the pandas_datareader library to pull the data straight from the internet (so cool); we can also change the ticker and grab any stock data we want. After that, I used the pandas DataFrame .head() method to see what we got.
The Goal
Since my goal was to use just this data to predict Bitcoin’s price tomorrow, I added a new column named “Prediction” that is a copy of the “Close” column shifted one position up. This way every row in the dataset has an array of features (including the close price) mapped to Bitcoin’s closing price on the next day.
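Here is a minimal sketch of that shift, assuming a pandas DataFrame with a “Close” column. The numbers are made up for illustration and are not real Bitcoin prices.

```python
import pandas as pd

# Tiny stand-in for the Yahoo Finance data (values are invented for illustration).
df = pd.DataFrame(
    {
        "Open": [100.0, 102.0, 101.0, 105.0],
        "Close": [102.0, 101.0, 105.0, 107.0],
    },
    index=pd.date_range("2021-01-01", periods=4, freq="D"),
)

# shift(-1) moves every value one row up, so each row's "Prediction"
# holds the closing price of the following day.
df["Prediction"] = df["Close"].shift(-1)

# The last row has no "next day" yet, so its target is NaN.
print(df)
```

Note that the last row ends up without a target, which is exactly the row we would later feed to the model to get tomorrow’s price.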
Data Visualization
With everything set it was time to plot some values against the prediction price and see if they have some kind of a linear relationship (visually). I’ve used a for loop to iterate over all independent variables and plot them against the Prediction value (our dependent variable in this case).
Cleaning and Normalizing features
After visualizing the data I decided to take the Volume column out of the equation because it doesn’t seem to have a clear linear relationship with the prediction price.
I also removed empty values using the DataFrame isin() function and normalized the independent variables using the MinMaxScaler from scikit-learn’s preprocessing module.
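A rough sketch of this cleaning step could look like the code below. It is not the code from the repository: the data is made up, the column names are assumptions, and it drops missing rows with dropna() rather than isin(), simply because that is the shortest way to show the idea.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Made-up rows standing in for the real dataset; one row has a missing value.
df = pd.DataFrame(
    {
        "Open": [100.0, 102.0, np.nan, 105.0, 107.0],
        "Close": [102.0, 101.0, 105.0, 107.0, 110.0],
        "Volume": [10.0, 12.0, 9.0, 11.0, 13.0],
    }
)
df["Prediction"] = df["Close"].shift(-1)

# Drop the Volume column, then drop any row with a missing value
# (this also removes the last row, whose Prediction is NaN).
clean = df.drop(columns=["Volume"]).dropna()

feature_cols = ["Open", "Close"]  # assumed feature names for this sketch
X = clean[feature_cols].to_numpy()
y = clean["Prediction"].to_numpy()

# MinMaxScaler rescales each column to [0, 1]: (x - min) / (max - min).
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```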
Training and Testing
With our x’s and y’s set, it is time to train and test our model against the data.
I’m using the train_test_split() function from scikit-learn’s model_selection module to split the dataset into training (70%) and testing (30%) sets. Then I created a LinearRegression model from scikit-learn’s linear_model module and trained it on the training set using the fit() method.
After that, I tested the model against the test data and used the r2_score() and mean_absolute_error() functions from scikit-learn’s metrics module to evaluate the model’s performance.
Mean absolute error: $ 632.79
Mean absolute percentage error: 2.53%
Coefficient of determination: 1.00
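For reference, the whole train / test / evaluate flow can be sketched as below. The data here is synthetic (a known linear target plus a little noise) and only stands in for the real features, so the printed numbers have nothing to do with the Bitcoin results above. The percentage error is computed by hand with NumPy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 rows, 3 features, a known linear target plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = X @ np.array([3.0, 2.0, 5.0]) + 10.0 + rng.normal(0.0, 0.01, size=200)

# 70% of the rows for training, 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit on the training set only, then predict the held-out test set.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mape = np.mean(np.abs((y_test - y_pred) / y_test)) * 100
r2 = r2_score(y_test, y_pred)

print(f"Mean absolute error: {mae:.4f}")
print(f"Mean absolute percentage error: {mape:.2f}%")
print(f"Coefficient of determination: {r2:.2f}")
```

By default train_test_split() shuffles the rows before splitting. For time-ordered data like prices, passing shuffle=False keeps the test set strictly after the training set, which may be the more honest setup.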
Results
As shown above, the model misses tomorrow’s price with a mean absolute error of $ 632.79 (2.53%), which is well above the 100 bucks I set as the success target. I really don’t know why the coefficient of determination is 1 in this case, and I will leave that to the statistics people to help me explain.
I also ran the model against the last data row to predict tomorrow’s price and here is what we got:
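A plausible explanation, not verified against this exact run: the coefficient of determination compares the model’s squared error with the total variance of the target. When prices move across tens of thousands of dollars over the period, a few hundred dollars of daily error is tiny next to that spread, so the score rounds to 1.00 even if the forecast is not very useful. The toy example below uses an invented series and a naive “tomorrow equals today” forecast, with both metrics written out by hand, to show the effect.

```python
import math

# Deterministic toy "price" series: a steady upward trend plus a daily wobble.
# These numbers are invented for illustration; they are not Bitcoin prices.
prices = [10_000 + 100 * t + 500 * math.sin(t) for t in range(365)]

# Naive forecast: tomorrow's price equals today's price.
y_pred = prices[:-1]
y_true = prices[1:]


def mae_by_hand(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_by_hand(actual, predicted):
    """1 - (sum of squared residuals) / (total sum of squares around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


mae = mae_by_hand(y_true, y_pred)
r2 = r2_by_hand(y_true, y_pred)

print(f"Mean absolute error: {mae:.2f}")
print(f"Coefficient of determination: {r2:.4f}")
```

Even this do-nothing forecast scores very close to 1 while still being off by a few hundred units per day, so a high coefficient of determination on a trending price series says little on its own.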
The expected price for BTC-USD at 2021-09-11 is $ 45074.05
Let’s take a look at the Yahoo Finance website...
Pretty close, huh? Would you bet your money on this algorithm? I wouldn’t, haha. Better keep studying. See you!