How I Created a Bitcoin Trading Algorithm Using Sentiment Analysis With a 29% Return
TL;DR: I’ve created a formula that predicts whether you should buy or sell Bitcoin based on daily exchange price data and Google Trends keyword sentiment. The model produced a 29% return over 90 days for a $28,839 profit.
To what degree can Bitcoin (BTC) price be predicted? What if publicly available data from Google Trends can help forecast price fluctuations?
In other words, can we reliably build a formula that outperforms the market? These are the questions I set out to answer. My goal was to try to make sense of a highly volatile, scary, and seemingly unpredictable cryptocurrency market.
Many traders swear by technical analysis, while others go the fundamental analysis route. The truth is that there is no magic trading strategy that always beats the market. There are far too many variables for even the best AI-based trading algorithms to profit consistently.
The formula today is very basic, and my intention is to present it in its raw form and solicit feedback on how to make it better. This is a work in progress and by no means foolproof, so please use it at your own risk.
I have been testing a formula that I believe to be a relatively consistent indicator of BTC price performance. Specifically, I was able to model a 29% profit over a 90-day period using $100,000 as the initial investment. Note that this does not take into account exchange trading fees, which I hope solutions such as decentralized exchanges will one day eliminate.
Here is the process that I used:
1. I searched Google Trends for “BTC USD” and “Buy Bitcoin” over the most recent 90-day period.
2. I noticed that when the “BTC USD” to “Buy Bitcoin” ratio is less than ~3:1 (specifically, when “Buy Bitcoin” divided by “BTC USD” is above 35%) at the BTC price “close” for the day, the following day’s close price tends to increase. When the ratio is more than ~3:1 (i.e. 4:1 or 5:1, so the percentage falls below 35%), it’s a signal to sell because the subsequent day’s price tends to decrease.
3. Next, I tested an additional condition: requiring the BTC price to close more than $80 above the prior day’s close makes the pattern more consistent. $80 is an arbitrary value that performs well in this dataset. Here’s a screenshot of what this looks like:
BTC USD: Daily indicator directly from Google Trends.
Buy Bitcoin: Daily indicator directly from Google Trends.
Price: Current day’s close price from Coin Market Cap.
Column E: “Buy Bitcoin” / “BTC USD” ratio.
Column F: The Buy/Sell decision formula. For example, here is the formula for cell F20:
=IF(AND(E20>35%,G20>80),"BUY","SELL"). Note that a Buy requires both conditions: the ratio in column E must exceed the 35% threshold and the price difference in column G must be greater than $80.
Column G: Bitcoin price difference from prior day’s close.
Column H: Running total based on an initial $100,000 investment on 7/7/2018 (the first Buy).
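For readers who prefer code to spreadsheets, here is a minimal Python sketch of columns E through H. It is not my actual spreadsheet: the `Day` class, the function names, and the sample numbers are made up for illustration. The running-total rule is also an assumption on my part for this sketch (a BUY at today’s close is held until the next day’s close, otherwise the balance sits in cash), and fees are ignored.

```python
from dataclasses import dataclass


@dataclass
class Day:
    btc_usd: float      # Google Trends score for "BTC USD"
    buy_bitcoin: float  # Google Trends score for "Buy Bitcoin"
    close: float        # daily close price in USD


def signals(days, ratio_threshold=0.35, price_threshold=80.0):
    """Return one 'BUY'/'SELL' label per day, mirroring column F."""
    labels = []
    for i, day in enumerate(days):
        # Column E: "Buy Bitcoin" / "BTC USD"
        ratio = day.buy_bitcoin / day.btc_usd if day.btc_usd else 0.0
        # Column G: price difference from the prior day's close (0 on the first day)
        diff = day.close - days[i - 1].close if i > 0 else 0.0
        is_buy = ratio > ratio_threshold and diff > price_threshold
        labels.append("BUY" if is_buy else "SELL")
    return labels


def running_total(days, labels, initial=100_000.0):
    """Column H under an ASSUMED rule: hold BTC from a BUY day's close
    to the next day's close, otherwise stay in cash. Fees are ignored."""
    total = initial
    totals = [total]
    for i in range(len(days) - 1):
        if labels[i] == "BUY":
            total *= days[i + 1].close / days[i].close
        totals.append(total)
    return totals


# Made-up sample data, NOT taken from the real spreadsheet.
sample = [
    Day(100, 30, 6500),
    Day(100, 40, 6600),
    Day(100, 40, 6700),
    Day(100, 20, 6650),
]
sample_labels = signals(sample)
sample_totals = running_total(sample, sample_labels)
```

The thresholds are keyword arguments so that other values can be tried without touching the logic.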
Results of the Model and Next Steps
Over the 90-day period, a $100K investment becomes $128,839 in my model, almost a 29% return. But this is far from an optimized model, and there are several things that I’d like to improve.
The “>35%” and “>$80” are rather arbitrary based on what seems to work in this limited 90-day dataset. Is there a better formula that will yield a better Buy/Sell signal?
These variables seem to work at the given $6k–8k BTC price level. I would like to test more historical data over the past year or two. The model would compare the total earnings from Buy/Sell signals across an array of ratios (~3:1 to ~5:1), and the “$80” would instead be a fixed percentage of the daily BTC price so that it could account for major price spikes. For example, perhaps the optimal model ends up being a 3.23:1 ratio with a price threshold of 0.014543229 (about 1.45%) of the daily price.
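To make the percentage idea concrete: at prices between $6,000 and $8,000, a fixed $80 move is roughly 1.0% to 1.3% of the price. The sketch below swaps the dollar threshold for a fractional one. The function name `relative_signal` and the default `pct_threshold` of 0.0125 are placeholders I picked for illustration, not tested values.

```python
def relative_signal(ratio, close, prev_close,
                    ratio_threshold=0.35, pct_threshold=0.0125):
    """Return 'BUY' when the search ratio and the fractional daily
    price change both exceed their thresholds, otherwise 'SELL'.

    pct_threshold is a HYPOTHETICAL default standing in for the fixed $80.
    """
    pct_change = (close - prev_close) / prev_close
    is_buy = ratio > ratio_threshold and pct_change > pct_threshold
    return "BUY" if is_buy else "SELL"


# What a fixed $80 move means as a fraction of price across the range above.
eighty_as_fraction = {price: 80 / price for price in (6000, 7000, 8000)}
```

With a fractional threshold, the same rule asks for a larger dollar move when the price is high and a smaller one when it is low.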
The variable input matrix would look something like this:
If You Are a Data Guru Let’s Talk
In other words, I want to set up a test to find the optimal variables to plug in to maximize profit for the given dataset. This would involve testing against past price and sentiment data (backtesting). My hypothesis is that there are optimal variables at different price levels.
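As a starting point for discussion, such a test could be sketched as a brute-force grid search in Python. Everything here is an assumption for illustration: the function names, the row layout, the candidate grids, and the trading rule (a BUY at today’s close is held until the next close, fees ignored). The sample rows are made up and are not my real data.

```python
from itertools import product


def backtest(rows, ratio_threshold, pct_threshold, initial=100_000.0):
    """Return the final balance for one pair of thresholds.

    rows: list of (btc_usd_score, buy_bitcoin_score, close) per day.
    ASSUMED rule: a BUY on day t holds BTC from close t to close t+1.
    """
    total = initial
    for t in range(1, len(rows) - 1):
        btc_usd, buy_bitcoin, close = rows[t]
        prev_close = rows[t - 1][2]
        next_close = rows[t + 1][2]
        ratio = buy_bitcoin / btc_usd if btc_usd else 0.0
        pct_change = (close - prev_close) / prev_close
        if ratio > ratio_threshold and pct_change > pct_threshold:
            total *= next_close / close
    return total


def grid_search(rows, ratio_grid, pct_grid):
    """Try every threshold combination and return the best as
    (final_balance, ratio_threshold, pct_threshold)."""
    return max(
        (backtest(rows, r, p), r, p)
        for r, p in product(ratio_grid, pct_grid)
    )


# Made-up rows for demonstration only.
rows = [
    (100, 30, 7000),
    (100, 40, 7100),
    (100, 40, 7200),
    (100, 20, 7150),
    (100, 30, 7300),
    (100, 30, 7400),
]
# 0.20 to 0.35 corresponds to "BTC USD":"Buy Bitcoin" ratios of about 5:1 to 3:1.
best = grid_search(rows, [0.20, 0.25, 0.30, 0.35], [0.005, 0.010, 0.015])
```

A grid search like this only reports what fit best on the data it was given, so the winning pair would still need to be checked against a period it was not tuned on.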
I’m currently testing a “v2” of this algorithm and would love to collaborate with any data gurus who have the R or Python chops to run a full regression and goal-seeking script to optimize the algorithm. Feel free to drop me a comment or a private note and I will be in touch.
To the moon! 🌕
UPDATE: Due to some excellent community feedback and some interesting new patterns I found, I’ll be doing a follow-up series with my v2 edition. If you want to be the first to see my new formula, follow me here on Medium.