Building some Node.js tools to set up machine learning trading, pt. 2

This is part 2 of a series on setting up some tools to get started with machine learning for crypto trading. If you haven’t seen it, you’ll want to read the first entry before continuing.

Store some data

Last time we walked through setting up our project and getting a basic logger going. If you were following along, we have the bare bones of our project and left off with three lines that output some info to the console just to test our Logger. First things first: go ahead and remove those lines.

Next, you can’t do any sort of machine learning without data, so let’s get some! We’re going to start by grabbing trade data from Binance via their public API. Something to consider here is that we’ll want to make it easy to add other exchanges in the future, or at least not make it any harder than it needs to be. With this in mind I’m going to start by using an API wrapper that has support for a bunch of exchanges. It’s open source, and I’ve poked around the code without finding anything that raises any red flags.

It’s called ccxt and you can find the source on GitHub. With that out of the way, let’s go ahead and install it.

yarn add ccxt
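If you want to sanity check the install, a quick throwaway script might look like the following. The exchange id and pair are arbitrary choices here, and the limit of 1000 is just Binance’s per-request maximum for trades.

```javascript
// quick-check.js — a throwaway script to confirm ccxt is installed and working.
const ccxt = require('ccxt');

(async () => {
  // enableRateLimit tells ccxt to throttle requests for us
  const exchange = new ccxt.binance({ enableRateLimit: true });
  // fetchTrades(symbol, since, limit) — up to 1000 recent public trades
  const trades = await exchange.fetchTrades('BTC/USDT', undefined, 1000);
  console.log(`fetched ${trades.length} trades`);
})();
```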

Now we’re also going to need a place to put this data. We’re going to start with an SQLite database. SQLite doesn’t require anything to be installed on our machine to run it, so it’s great for getting something up and going, but it’s not terribly efficient; it’s little more than glorified flat-file storage. However, we get the benefit of writing SQL queries that will more than likely be compatible with whatever SQL database we choose to migrate to in the future, such as PostgreSQL, without too much trouble. So let’s install the npm package for it.

yarn add sqlite3

Now we can create a new exchange importer. Make a new folder called exchangeImporter and inside it create a new file, exchangeImporter.js. In exchangeImporter.js, let’s start by importing the ccxt library and our logger. After that I’m going to hard code a limit that we’ll be able to move to a configuration variable of some sort in the future. I’m setting it to 1000 because that’s the maximum number of trades we can get from Binance at a time.

Before I dive into some code here: for this project I’ve decided I want to adopt the OLOO (objects linked to other objects) pattern. You can read about it in Kyle Simpson’s fantastic You Don’t Know JS series. This is the first time I’m really going to try to use this pattern, so we’ll learn it together. I encourage you to read up on it, but if you prefer, a class-based approach is fine for this application as well. Okay, with the disclaimer out of the way, let’s go ahead and write our init function.
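Here’s a sketch of what that might look like. The Logger path and method names are assumptions carried over from part 1; adjust them to match your own logger.

```javascript
// exchangeImporter/exchangeImporter.js
// OLOO style: no class, just a plain object that new objects
// will be linked to via Object.create().
const ccxt = require('ccxt');
const Logger = require('../logger/logger');

// Hard coded for now; 1000 is Binance's per-request trade limit.
const LIMIT = 1000;

const exchangeImporter = {
  /**
   * Wires up the importer for a given exchange.
   * @param {string} exchange - a ccxt exchange id, e.g. 'binance'
   */
  init(exchange) {
    // enableRateLimit tells ccxt to throttle requests for us
    this.exchange = new ccxt[exchange]({ enableRateLimit: true });
    Logger.info('Exchange importer initialized');
  },
};

module.exports = exchangeImporter;
```

Using it then looks like `const importer = Object.create(exchangeImporter); importer.init('binance');` rather than `new ExchangeImporter()`.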

Not a lot going on here, just wiring some things up. I try to use JSDoc blocks when I remember to, and if you’re using VS Code you can install the npm IntelliSense plugin to get helpful autocompletion while you’re writing your code, along with completions from your own JSDoc comments.

The other thing to note is that the init function requires an exchange to be passed as a string. We instantiate a new ccxt[exchange] and pass in a config object that enables the built-in rate limiter. I think it’s set to 2 requests per second by default, but I can say I haven’t had any issues with hitting rate limits while it’s on. Last, we use our Logger to give us a message so we know something is going on.

Simple enough, let’s get it wired into our main.js.
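A minimal version of that wiring could look like this. The file layout and the hard-coded 'binance' id are assumptions for the sketch; the exchange could just as easily come from another CLI argument later.

```javascript
// main.js — minimal CLI dispatch for our tools.
const exchangeImporter = require('./exchangeImporter/exchangeImporter');
const Logger = require('./logger/logger');

// argv[0] is the node binary and argv[1] is this script,
// so our own arguments start at index 2.
const [tool, pair] = process.argv.slice(2);

switch (tool) {
  case 'import': {
    if (!pair) {
      Logger.error('No pair specified, e.g. npm start -- import BTC/USDT');
      process.exit(1);
    }
    // OLOO: link a new object to the exchangeImporter prototype
    const importer = Object.create(exchangeImporter);
    importer.init('binance');
    break;
  }
  default:
    Logger.error(`Unknown tool: ${tool}`);
    process.exit(1);
}
```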

Again, nothing complicated going on yet. We require our new exchange importer. Maybe we’ll build out a UI in the future, but for now the CLI will suffice, so we set up some argument processing. We ignore the first two entries of process.argv, which will be the node binary and the path to our script. Since we’ll have multiple tools, our first real argument relates to which tool we want to run. If an invalid tool is input from the CLI we give an error.

If we run the import tool we’re going to make sure a pair was specified to be imported; if not, we’ll again give an error and the process will end. If so, we go ahead and create a new object linked to our exchange importer prototype and initialize it with our exchange. If we run it now with npm start -- import BTC/USDT (the -- tells npm to forward the remaining arguments to our script) we’ll get an output from our exchange importer of Exchange importer initialized. Great, we’re wired up!

EDIT: I mentioned using npm start here vs the node main command from the previous entry, but forgot to share the necessary change to our package.json for that to work. We need to make sure we have the start script defined in our package.json as follows.

"scripts": {
  "start": "node main"
}

Getting the data

Now we need a place to store our data. Let’s create a new folder, ./data, and new environment variables in our .env (don’t forget to place them in .env.example as well!): DATA_DIR=./data/ and DB_EXT=db. Next let’s create a new data manager object to handle I/O for our database. Create a new folder called dataManager and a new file called dataManager.js. Inside this file we’re going to require our sqlite3 package and our logger, then write a basic store trades function.
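Here’s a sketch of the whole file. The table and column layout is an assumption based on the unified trade fields ccxt returns (id, timestamp, price, amount, symbol), and the Logger is the one from part 1.

```javascript
// dataManager/dataManager.js
const sqlite3 = require('sqlite3');
const Logger = require('../logger/logger');

const dataManager = {
  /** @param {string} exchange - used for the db file name, e.g. binance */
  init(exchange) {
    if (!exchange || !process.env.DATA_DIR || !process.env.DB_EXT) {
      Logger.error('Exchange missing or DATA_DIR/DB_EXT not set in .env');
      return;
    }
    // One db file per exchange, e.g. ./data/binance.db
    this.dbFile = `${process.env.DATA_DIR}${exchange}.${process.env.DB_EXT}`;
  },

  /** Returns an active connection. Remember to close it when done! */
  getDb() {
    return new sqlite3.Database(this.dbFile);
  },

  async storeTrades(data) {
    try {
      if (!this.dbFile) throw new Error('dataManager not initialized');
      if (!data || !data.length) throw new Error('No trade data to store');

      const db = this.getDb();
      // Derive a table name from the data, e.g. BTC/USDT -> BTC_USDT
      const table = data[0].symbol.replace('/', '_');
      const create = `CREATE TABLE IF NOT EXISTS ${table} (
        id TEXT PRIMARY KEY, timestamp INTEGER, price REAL, amount REAL)`;

      // serialize() ensures the nested queries run in order
      db.serialize(() => {
        db.run(create);
        // One transaction instead of one write per trade
        db.run('BEGIN TRANSACTION');
        const stmt = db.prepare(
          `INSERT OR IGNORE INTO ${table} VALUES (?, ?, ?, ?)`
        );
        data.forEach((t) => stmt.run(t.id, t.timestamp, t.price, t.amount));
        stmt.finalize();
        db.run('COMMIT');
      });
      db.close();
      Logger.debug(`Stored ${data.length} trades in ${table}`);
    } catch (e) {
      Logger.error(`Failed to store trades: ${e.message}`);
    }
  },
};

module.exports = dataManager;
```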

Let’s talk this out a little. The dataManager is making assumptions about the environment variables existing. We’ll probably change this later and pass them into init to keep a separation of concerns, but for now this is fine.

When a new dataManager is created and initialized, we store the exchange name to be used for the file name of the database. If it isn’t provided or the environment variables aren’t set in our .env, we’ll get an error message.

Next we have the getDb function. It just returns an instance of an active connection to the SQLite db. We must remember to close it when we’re done.

Then comes where the magic happens: the storeTrades function. A quick note: we’re using the newer async/await syntax here. It’s basically just a syntactic convenience, and since the sqlite3 package doesn’t return promises it won’t buy us much yet, but it’ll help us avoid callback hell as this grows. We wrap the whole thing in a try/catch so we can fail gracefully when things go wrong and provide a user-friendly error message.

First we make sure there is a dbFile specified by our init function, then make sure the data being sent is defined and is an array. Since we’re really writing this for ourselves, our array check isn’t the greatest, but it’s good enough for this purpose even though a string would also have a length.

We next grab an instance of a connection to the db and derive a table name from the data being passed, then set up a query to create a new table to store the trade data unless it already exists. We begin a serialize function to ensure the nested SQL queries get performed serially.
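The table name derivation is simple: a pair symbol like BTC/USDT isn’t a valid SQLite table name as-is, so we just swap the slash for an underscore. A tiny helper (hypothetical name) shows the idea:

```javascript
// Derive an SQLite-friendly table name from a ccxt pair symbol.
const tableName = (symbol) => symbol.replace('/', '_');

console.log(tableName('BTC/USDT')); // BTC_USDT
```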

It runs the query to create the table if necessary and begins a transaction. If we didn’t begin the transaction and our data was 1000 trades, that would mean 1000 separate writes to the database. By beginning the transaction we’re telling SQLite to hold on to the queries that follow until we commit them.

Then we create a prepared statement and execute it with the data from each row. We finalize the prepared statement when we’re done and then commit the queries.

Finally we close the db connection and log a debug message.

I think that’s enough for this entry. Next time we’ll finish actually importing the data and think about how to process it into candles. Thanks for reading.

Part 3 is up.