Building some node js tools to setup machine learning trading pt 3

Fo0
5 min read · Nov 12, 2018


This is part 3 of a series on setting up some tools to get started with machine learning for crypto trading. If you haven’t been following up to this point, the first entry can be found at https://medium.com/@kid.bytes/building-some-node-js-tools-to-setup-machine-learning-trading-pt-1-fdbb40439384

Continuing the exchange importer

If you’ve been following along up to this point we’ve:

  • Created a basic logger
  • Began an exchange importer object
  • Began a data manager object

Next we need to make use of the exchange importer and the data manager to import and store the data.

First things first. I mentioned in the last article that we should probably not have the dataManager read from process.env directly, since that doesn’t follow the separation of concerns principle. I began writing some tests on my local copy to keep myself in check and give me confidence in my code, and it’s also just easier to have the values passed in as parameters. So let’s make that change. Below are the main.js, the exchangeImporter.js and the dataManager.js as they are currently.

Main.js

exchangeImporter.js

dataManager.js
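As a rough illustration of the parameter-passing change described above, here is a minimal sketch rather than the actual files. The environment variable names (DATA_DIR, DB_EXT), the defaults, and the readConfig, createDataManager and getDbPath helpers are all assumptions made for the example; only dataDir and dbExt come from the series.

```javascript
const path = require('path');

// Hypothetical helper: the only place that touches the environment.
// DATA_DIR and DB_EXT are assumed variable names, not taken from the series.
function readConfig(env) {
  return {
    dataDir: env.DATA_DIR || './data',
    dbExt: env.DB_EXT || '.db',
  };
}

// Hypothetical data manager factory: it only sees plain values, so a test
// can pass any directory and extension without touching process.env.
function createDataManager(dataDir, dbExt) {
  return {
    // Build the database file path for one exchange from the injected values.
    getDbPath(exchange) {
      return path.join(dataDir, exchange + dbExt);
    },
  };
}

// main.js would read the environment once and pass the values down.
const config = readConfig(process.env);
const dataManager = createDataManager(config.dataDir, config.dbExt);
console.log(dataManager.getDbPath('binance'));
```

Because the data manager no longer knows where its settings come from, a test can build one with a temporary directory and never set an environment variable.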

Now that we’re all on the same page, let’s first add a line in the main.js. After the exchangeImporter.init(exchange, dataDir, dbExt) line we’ll add exchangeImporter.getPair(args[1]). This function doesn’t exist yet so let’s write it. In our exchangeImporter we add:

The comments in the code should help explain what’s going on here but let’s step through it.

First we make sure the pair is specified before trying to get data. Next we set a variable to the id of the newest trade in our DB. Again you’ll notice we’re calling a function that we haven’t written yet, so we’ll write that in our dataManager shortly.

Just a short quip on the above. I notice a lot of new developers have a hard time with this sort of thing where they’re calling a function that hasn’t been written yet. Don’t get stuck on the fact that it doesn’t exist yet, we can continue to write our function and it’ll help us flesh out what we’re going to need to write to support it. Sometimes a thing like the getNewestTrade is something we think we need until we get most of the way through our function and we realize we don’t. In that case we would save the time of not having to write it in the first place. It also helps us figure out exactly what we want returned from the function.

Next we set up a do...while loop and call a fetchTrades function that we’ll need to write next. Our exit condition is going to be when we receive fewer than the requested number of trades, so our fetchTrades function will need to return a count. Getting back fewer than the max number of trades should only happen if there have been fewer trades than that since the fromId we’re passing in, which should mean that we have the most current data and there is no more to import.

Last we drop a message in our Logger and handle the error cases. The function is declared as async just so that we can await the call to fetchTrades.
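The steps above can be sketched roughly as follows. This is a sketch under assumptions, not the exact code from the gist: the batch size of 1000, the way the loop re-reads the newest id on every pass, and the injected deps object are all choices made so the example runs on its own. createFakeDeps is a purely hypothetical in-memory stand-in for the exchange and the database.

```javascript
const BATCH_SIZE = 1000; // assumed maximum number of trades per request

// Import loop: keep fetching until a batch comes back smaller than requested.
// getNewestTrade and fetchTrades are injected so the sketch is self-contained.
async function getPair(pair, deps) {
  const { getNewestTrade, fetchTrades, log = console.log } = deps;
  if (!pair) {
    log('No pair specified, nothing to import');
    return 0;
  }
  let imported = 0;
  let received;
  try {
    do {
      // Resume from the newest trade id we already have (0 when the DB is empty).
      const fromId = await getNewestTrade(pair);
      received = await fetchTrades(pair, fromId, BATCH_SIZE);
      imported += received;
    } while (received === BATCH_SIZE); // a short batch means we are caught up
    log(`Imported ${imported} trades for ${pair}`);
  } catch (err) {
    log(`Import of ${pair} failed: ${err.message}`);
  }
  return imported;
}

// Hypothetical in-memory stand-in: trades are just ids 1..totalTrades.
function createFakeDeps(totalTrades) {
  const stored = [];
  return {
    stored,
    getNewestTrade: async () => (stored.length ? stored[stored.length - 1] : 0),
    fetchTrades: async (pair, fromId, limit) => {
      const batch = [];
      for (let id = fromId + 1; id <= totalTrades && batch.length < limit; id++) {
        batch.push(id);
      }
      stored.push(...batch);
      return batch.length; // the count drives the loop's exit condition
    },
    log: () => {},
  };
}

getPair('PAX/USDT', createFakeDeps(2500)).then((count) => console.log(count));
```

With the fake set to 2500 trades, the loop receives two full batches and then a short one, which is what ends it.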

Okay we’re getting there. Let’s tackle the fetchTrades function first.

So this is fairly straightforward. The first thing we do is call fetchTrades on our ccxt exchange that we initialized in the init function and assign the result to a batch variable. Next we store the trades with our dataManager.storeTrades function that we wrote last time. Remember we need this function to return the number of trades received so that getPair will be able to exit the while loop.

The other thing to note here is that through testing I would occasionally get a timeout error from the exchange and the program would stop. So in our catch block we check for that and, if it’s a timeout error, we just call fetchTrades again with the same parameters.
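Here is a minimal sketch of that fetch, store and retry flow, with the exchange and storage passed in so it runs without a network connection. The argument order, the isTimeout predicate and passing fromId through the params object are assumptions for illustration. With ccxt the predicate could plausibly be an `err instanceof ccxt.RequestTimeout` check, but that is not confirmed by the article.

```javascript
// exchange:    any object with a ccxt-style fetchTrades(symbol, since, limit, params)
// storeTrades: async function that persists a batch (dataManager.storeTrades in the series)
// isTimeout:   hypothetical predicate deciding whether an error is a timeout
async function fetchTrades(exchange, storeTrades, pair, fromId, limit, isTimeout) {
  try {
    // Assumption: the exchange accepts a fromId parameter to page by trade id.
    const batch = await exchange.fetchTrades(pair, undefined, limit, { fromId });
    await storeTrades(batch);
    return batch.length; // getPair uses this count to decide when to stop
  } catch (err) {
    if (isTimeout(err)) {
      // Timeouts are treated as transient: try again with the same parameters.
      return fetchTrades(exchange, storeTrades, pair, fromId, limit, isTimeout);
    }
    throw err; // anything else is left for the caller to handle
  }
}

// Hypothetical fake exchange that times out once before answering.
function createFlakyExchange(trades) {
  let calls = 0;
  return {
    get calls() { return calls; },
    async fetchTrades() {
      calls += 1;
      if (calls === 1) {
        const err = new Error('request timed out');
        err.name = 'RequestTimeout';
        throw err;
      }
      return trades;
    },
  };
}
```

This mirrors the retry-forever behaviour described above. A persistent outage would make it loop indefinitely, so a retry cap or a short delay between attempts may be worth adding.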

Alright now let’s add what we need in our dataManager

Here’s our getNewestTrade function that we called in our exchangeImporter. It’s pretty simple. We return a promise since the call to the db takes an unknown amount of time and we need to continue only once it’s complete. Unfortunately the sqlite3 library doesn’t use promises and instead uses callbacks, so we can’t just await it. We grab an instance of a db connection and run a query to grab our largest tradeId from the database, then resolve with that, or with 0 for the case that there is no data yet.
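Wrapping a callback API in a promise, as described above, might look something like this. It is a sketch under assumptions: the table name trades is made up, and db is taken to be any object exposing a callback-style get(sql, params, callback), which is the shape of sqlite3’s Database#get. createFakeDb is a hypothetical stand-in so the example runs without a database file.

```javascript
// Resolve with the largest tradeId in the table, or 0 when there is no data yet.
function getNewestTrade(db) {
  return new Promise((resolve, reject) => {
    // "trades" is an assumed table name; tradeId is the column named in the article.
    db.get('SELECT MAX(tradeId) AS newest FROM trades', [], (err, row) => {
      if (err) {
        reject(err);
        return;
      }
      // MAX() over an empty table yields NULL, so fall back to 0.
      resolve(row && row.newest ? row.newest : 0);
    });
  });
}

// Hypothetical stand-in that answers asynchronously with a canned row.
function createFakeDb(row, error = null) {
  return {
    get(sql, params, callback) {
      setImmediate(() => callback(error, row));
    },
  };
}

getNewestTrade(createFakeDb({ newest: 42 })).then((id) => console.log(id));
```

Once the function returns a promise, the caller can simply `await getNewestTrade(db)` inside an async function, which is what the import loop relies on.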

That’s it! We’re ready to import trade data for any pair we want from Binance’s public API. Currently a decent pair to test with is PAX/USDT. It’s a newer market so you’ll be able to import all of the data in a few minutes, whereas trying something like BTC/USDT will take hours.

We also haven’t written anything to start from a certain point, so it will get all of the data for the pair, which in some cases will take a long time. At the time of writing, the BTC/USDT pair has around 70 million trades on Binance and the max we’ll be importing is 2k trades per second, which works out to just shy of 10 hours. In practice it took me 11.5 hours.
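The back-of-the-envelope estimate above is easy to reproduce. The function name below is only for illustration, and the inputs are the rough figures quoted in the text.

```javascript
// Best-case import time in hours for a given trade count and import rate.
function estimateImportHours(totalTrades, tradesPerSecond) {
  const seconds = totalTrades / tradesPerSecond;
  return seconds / 3600;
}

// 70 million trades at 2,000 trades per second.
console.log(estimateImportHours(70e6, 2000).toFixed(2)); // prints "9.72"
```

This is the best case at a constant maximum rate, so it’s no surprise the real run took longer.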

If we run npm start import PAX/USDT and our log level is set to debug in our .env we’ll start to see some output.

That’s good for this entry. In the next entry we’ll look at converting our trade data into candle data, which is commonly used for financial charting.

I’m probably going to start a GitHub repo for the series with folders for each blog entry. The code base is going to grow and that’ll be easier to share with y’all. I’ll still create gists for sharing as we build this thing, but you’ll be able to find a complete project at each stage rather than me sharing every file at the start of each blog entry.

Thanks for reading! If you have any suggestions I’d love some feedback, drop me a response and clap if you like the content. If you don’t like it, let me know why so I can improve ;).

Edit: New article is up at https://medium.com/@kid.bytes/building-some-node-js-tools-to-setup-machine-learning-trading-pt-4-9f1f58053a5f
