Cryptocurrency Datasets on Kaggle

Megan Risdal
3 min read · Jun 15, 2017


From Bitcoin to Dogecoin, some of the world’s most popular cryptocurrencies experienced a healthy shock a few days ago. Check out the screenshot from coinmarketcap.com taken on June 12th.

Credit to Mikel Bober-Irizar (AKA Anokas on Kaggle) for sharing this screenshot with me from 12 June 2017.

In this post, I’m sharing a couple of cryptocurrency datasets that other Kagglers have published and recently refreshed with the latest data.

Why cryptocurrencies?

Interest in cryptocurrency markets appears to be growing on Kaggle Datasets. Not being particularly tuned into what’s going on in this world, I half-jokingly speculated that the WannaCry ransomware attack last month (which yielded very little in the way of crypto-spoils for its architects) was driving this interest.

Okay, probably unrelated… And in fact, Mikel tells me the main problem looming over Bitcoin at the moment is a scaling issue which will be resolved by a chain split activating on August 1st of this year. Cryptocurrency investors and fans will be watching closely to see whether two versions of Bitcoin (one with the new rules allowing miners to make faster blocks and one without) result in chaos or reward. This blockchain of forking paths could have ripple effects across all cryptocurrencies, not just Bitcoin.

Anyway, as all interested eyes are watching (and mining), Kagglers are sharing cryptocurrency datasets with the data science world on Kaggle. One is an Ethereum Historical Dataset which contains all data from its conception to today. The other is a Bitcoin Historical Dataset containing data at one-minute intervals back to 2012.
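One-minute data adds up to a lot of rows, so a common first step is to collapse it to one closing price per day. Here is a minimal sketch of that idea using only the Python standard library. The row layout (a Unix timestamp paired with a price) and the sample values are assumptions made up for illustration, not the actual schema or contents of either dataset, so check the column names in the files you download.

```python
from datetime import datetime, timezone


def daily_close(rows):
    """Collapse (unix_timestamp, price) rows into one closing price per UTC day.

    Rows are assumed to be sorted by timestamp, oldest first.
    """
    closes = {}
    for timestamp, price in rows:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        closes[day] = price  # later rows on the same day overwrite earlier ones
    return closes


# Made-up sample rows (not real prices): three minutes spanning two UTC days.
sample = [
    (1497225540, 2900.0),  # 2017-06-11 23:59 UTC
    (1497225600, 2950.0),  # 2017-06-12 00:00 UTC
    (1497225660, 2940.0),  # 2017-06-12 00:01 UTC
]

for day, price in daily_close(sample).items():
    print(day, price)
```

In a kernel you would more likely reach for a dataframe library to do this, but the logic is the same: group rows by calendar day and keep the last price seen.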

Kagglers have started a couple of analyses using Kernels to explore the data — specifically using Python notebook kernels (powered by Jupyter). As you can see, both Ethereum and Bitcoin have skyrocketed in value:

[Figure: Ethereum price over time. From Liam Larson’s kernel “Analyzing the Ethereum Block Chain”]

[Figure: Bitcoin price over time. From Dylan’s kernel “Bitcoin Exploring”]

Get the datasets

Intrigued and want to start analyzing? You can download the datasets from Kaggle. Check out the links we shared when we first featured them:

Some things you can do with these datasets:

  • Experiment with trading strategies
  • Compare Bitcoin and Ethereum in one analysis by adding both data sources to one kernel (I show you how to do this in a tutorial here)
  • Publish historical datasets from other cryptocurrencies and invite the community to collaborate
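To make the first idea in the list a bit more concrete, here is a rough sketch of one very simple trading rule, a moving-average crossover: buy when a short-term average price rises above a long-term one, and sell when it falls back below. The function names, window sizes, and price list are all hypothetical and chosen only to keep the example small. This is a toy for exploring the data, not a recommended strategy.

```python
def moving_average(prices, window):
    """Trailing simple moving average; None until a full window is available."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages


def crossover_signals(prices, short_window=2, long_window=3):
    """Return 'buy', 'sell', or 'hold' for each price in the series."""
    short_ma = moving_average(prices, short_window)
    long_ma = moving_average(prices, long_window)
    signals = []
    holding = False  # whether the toy strategy currently owns the asset
    for short_avg, long_avg in zip(short_ma, long_ma):
        if short_avg is None or long_avg is None:
            signals.append("hold")  # not enough history yet
        elif short_avg > long_avg and not holding:
            signals.append("buy")
            holding = True
        elif short_avg < long_avg and holding:
            signals.append("sell")
            holding = False
        else:
            signals.append("hold")
    return signals


# Made-up prices (not real market data): flat, then a rise, then a fall.
prices = [10, 10, 10, 12, 14, 13, 11, 9]
print(crossover_signals(prices))
```

With real daily prices you would use much longer windows, and you would want to account for trading fees before reading anything into the results.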

Thanks to Mikel for teaching me a thing or two about the dramatic world of Bitcoin and cryptocurrencies! The story is a lot more complicated and nuanced than I’ve laid out, of course, but I hope you’re at least inspired to learn more, as I know I am.

Interested in starting a data project on Kaggle? Reach out to me at @MeganRisdal on Twitter and I’d be happy to help you or your organization get started.


Megan Risdal

Kaggle / Google Product Manager. Former Stack Overflow Product Manager. Passionate about open communities and open knowledge.