Cryptocurrency Data Analysis Part IV: Connecting to Exchange Live Feed
In the previous article we explored a very popular algorithmic strategy. However, analysing strategies with sampled data hides the rigour involved in real-time event processing. My specific area of interest is the market microstructure of cryptocurrencies. To study this discipline meaningfully, I need data of the finest granularity, i.e. message-by-message data for every event in the order book. This type of data is essential for high-frequency trading (HFT) in any asset class, and crypto is no exception.
In this article, we will connect to the WebSocket API of the Gemini exchange. Namely, we will implement a WebSocket client that receives one-way market data from the exchange. Without further ado, let’s get into it by writing the script that will connect to a live message feed (the code is taken from the Gemini API docs):
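A minimal sketch of such a client, along the lines of the example in the Gemini docs but not a verbatim copy of it. It assumes the third-party websocket-client package (pip install websocket-client) and Gemini's public v1 market data endpoint; the helper names feed_url and stream are illustrative and do not come from the docs.

```python
# Assumed endpoint: Gemini's public v1 market data feed (no API key needed).
BASE_URL = "wss://api.gemini.com/v1/marketdata/"


def feed_url(pair):
    """Build the feed URL for a ticker such as "BTCUSD"."""
    return BASE_URL + pair


def on_message(ws, message):
    # Each message arrives as a raw JSON string; for now we only print it.
    print(message)


def stream(pair="BTCUSD"):
    # Imported here so the helpers above work without the package installed.
    import websocket  # third-party: pip install websocket-client

    ws = websocket.WebSocketApp(feed_url(pair), on_message=on_message)
    ws.run_forever()  # blocks until the connection is closed
```

Calling stream("BTCUSD") starts printing messages and blocks until the connection closes.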
If you run the script, the interpreter will print the messages it receives from the WebSocket. At this point we are simply printing the message strings and not doing anything with them afterwards, as you can see in the definition of the on_message function. A sensible next step is to collect the messages in their raw form in a text file, which we can do by slightly modifying our script:
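One way this modification might look, again as a sketch rather than the original listing. It keeps the pair variable and the on_message callback and appends each raw message as one line to a file named after the ticker; the stream helper, the endpoint URL, and the one-message-per-line layout are assumptions.

```python
pair = "BTCUSD"  # ticker to subscribe to; also used as the output file name


def on_message(ws, message):
    # Append the raw message string, one message per line.
    # Append mode keeps earlier data if the script is restarted.
    with open(pair + ".txt", "a", encoding="utf-8") as f:
        f.write(message + "\n")


def stream():
    import websocket  # third-party: pip install websocket-client

    # Assumed endpoint: Gemini's public v1 market data feed.
    url = "wss://api.gemini.com/v1/marketdata/" + pair
    ws = websocket.WebSocketApp(url, on_message=on_message)
    ws.run_forever()  # blocks until the connection is closed
```

Calling stream() starts the collection. Reopening the file for every message is simple and robust against crashes, at the cost of some overhead on a busy feed.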
If you run the script now, you will see a file named BTCUSD.txt in your current directory that is filling up with raw messages. You may want to experiment with this script by substituting other tickers into the pair variable. Ideally you want around 24 hours' worth of data in order to conduct a meaningful analysis. To guard against internet outages on your own machine, you can use a free-tier AWS EC2 instance and leave the script running for as long as needed. Be aware that exchange maintenance events will most likely interrupt the connection to your client, so it is sensible to check the state of your program every 24 hours.
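Because dropped connections are to be expected, one option is to wrap the blocking client call in a small supervisor loop that reconnects after a growing delay instead of exiting. The sketch below is hypothetical: run_with_reconnect, its parameters, and the backoff values are illustrative choices, and connect stands for any function that blocks until the feed disconnects (for example, one that calls run_forever on a WebSocketApp).

```python
import time


def run_with_reconnect(connect, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call connect() repeatedly, waiting longer after each disconnect.

    connect blocks while the feed is up and returns (or raises) when it drops.
    Returns the number of connection attempts made.
    """
    attempts = 0
    delay = base_delay
    while attempts < max_attempts:
        attempts += 1
        try:
            connect()
        except Exception as exc:  # report and retry rather than crash
            print(f"connection error: {exc!r}")
        if attempts < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
    return attempts
```

A long-running collector would probably pass a large max_attempts. This does not replace the periodic check, since the loop gives up once the attempts are exhausted.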
As always, leave your questions and comments below and happy coding!