Favorite Places to Find Datasets

Spice up your machine learning project with novel data

Barrett Studdard
Analytics Vidhya


Photo by fabio on Unsplash

Interesting datasets can make personal machine learning projects more fun and exciting. Here are some of my favorite places to go looking for datasets to hone my data science and ML skills.

Data Is Plural

Data Is Plural is my favorite place to find novel datasets on interesting topics. The site is managed by Jeremy Singer-Vine.

Editions are published weekly (250+ and counting!), and each one describes a handful of datasets, usually five or so, along with what makes them interesting. One nice aspect is that some datasets are fairly raw (depending on the source), so you can practice working with the data before modeling.
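As a rough sketch of what that practice can look like, here is one way to take a first pass over a raw CSV and count the missing values in each column before any modeling. The sample data, the column names, and the list of "missing" markers are all made up for illustration; real sources will have their own quirks.

```python
import csv
import io
from collections import Counter

# Hypothetical raw data standing in for a downloaded CSV; the values are made up.
RAW_CSV = """city,year,visitors
Springfield,2019,1200
Shelbyville,2019,
Springfield,2020,n/a
,2020,950
"""

# Assumed conventions for "missing"; check how each source encodes gaps.
MISSING_MARKERS = {"", "n/a", "na", "null"}


def count_missing(csv_text):
    """Return (row_count, Counter of missing cells per column)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = Counter()
    rows = 0
    for row in reader:
        rows += 1
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header are stored under the key None.
                continue
            if value is None or value.strip().lower() in MISSING_MARKERS:
                missing[column] += 1
    return rows, missing


rows, missing = count_missing(RAW_CSV)
print(f"{rows} rows")
for column, n in missing.items():
    print(f"{column}: {n} missing")
```

A quick count like this helps decide whether a column is usable as a feature or needs cleaning first.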

Ease of Use: Medium

Interesting/Novel Datasets: High

Kaggle

Kaggle is best known for machine learning competitions, but its datasets feature is just as interesting. Users can post datasets and collaborate on them through tasks and discussions.

Datasets vary in how clean they are, but users can earn medals for high-quality ones. As a result, highly rated datasets are often of good quality, with data types and complete…
