Ultimate Dataset Resource Hub
Start Playing with Data Today!
Kaggle hosts a platform where people can contribute datasets, and other community members can vote on them and run Kernels (scripts) against them.
Open data from the World Bank. The platform provides several tools, such as the Open Data Catalog, world development indices, and education indices.
Here is a link to the datasets used by FiveThirtyEight in its stories. Each dataset includes the data, a dictionary explaining the data, and a link to the story published by FiveThirtyEight.
Amazon provides a few big datasets, which can be used on its platform or on your local computer.
Google provides a few datasets as part of its BigQuery tool. These include baby names, data from GitHub public repositories, and all stories and comments from Hacker News.
A few months back, Google Research released a labeled YouTube dataset, which consists of 8 million YouTube video IDs and associated labels drawn from 4,800 visual entities. It comes with pre-computed, state-of-the-art vision features from billions of frames.
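As a rough illustration of working with the baby names data, here is a minimal sketch. It assumes the `google-cloud-bigquery` Python package is installed, that credentials are already configured, and that the public table is named `bigquery-public-data.usa_names.usa_1910_current` with `name` and `number` columns. The table and column names are assumptions, so check them in the BigQuery console before relying on them. The helper names `build_top_names_query` and `run_query` are made up for this example.

```python
def build_top_names_query(limit=10):
    """Build a SQL string that ranks baby names by total count."""
    limit = int(limit)  # reject non-numeric input before it reaches the query
    return (
        "SELECT name, SUM(number) AS total\n"
        # Assumed table name; verify it in the BigQuery console.
        "FROM `bigquery-public-data.usa_names.usa_1910_current`\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql):
    """Run the query. Needs google-cloud-bigquery and configured credentials."""
    # Imported lazily so the sketch still loads without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Printing the query works offline; run_query() needs a Google Cloud project.
    print(build_top_names_query(5))
```

Building the query separately from running it lets you inspect the SQL without a Google Cloud account.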
Quandl provides financial, economic, and alternative data from various sources through its website, its API, or direct integration with a few tools. Its datasets are classified as Open or Premium.
DrivenData finds real-world challenges where data science can be used to create a positive social impact.
Important, commonly used datasets in a high-quality, easy-to-use, open form as data packages.
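If these data packages follow the Frictionless Data Package layout (an assumption about this resource), each one carries a `datapackage.json` descriptor that lists its files under `resources`. The sketch below reads such a descriptor and prints each resource's name and path. The descriptor content and the `list_resources` helper are hypothetical, written inline so the example runs offline.

```python
import json

# Hypothetical descriptor for illustration only.
# A real package would ship this as a file named datapackage.json.
DESCRIPTOR = """
{
  "name": "example-package",
  "resources": [
    {"name": "countries", "path": "data/countries.csv", "format": "csv"},
    {"name": "cities", "path": "data/cities.csv", "format": "csv"}
  ]
}
"""


def list_resources(descriptor_text):
    """Return (name, path) pairs for each resource in a descriptor."""
    descriptor = json.loads(descriptor_text)
    return [
        (resource.get("name"), resource.get("path"))
        for resource in descriptor.get("resources", [])
    ]


if __name__ == "__main__":
    for name, path in list_resources(DESCRIPTOR):
        print(f"{name}: {path}")
```

From there, each path can be opened with the standard `csv` module or any data-frame library you prefer.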
Old archives of websites that no longer exist. Includes data on the affinities of 60,000+ Reddit users.
Datasets and requests for datasets.
A comprehensive list of open data portals.
A GitHub repo of clean datasets, with lots of variety.
Datasets to practice data mining on.
A great dataset resource.
The world’s broadest collection of public data.
The Global Open Data Index (GODI) is the annual global benchmark for publication of open government data.
Access data and statistics instantly with smart search.
A comprehensive list of 2,600+ open data portals around the world.
The 1000 Genomes Project ran between 2008 and 2015, creating the largest public catalogue of human variation and genotype data.
Country Datasets coming up
This is the home of the U.S. Government’s open data. The site contained more than 190,000 datasets at the time of publishing. These datasets cover climate, education, energy, finance, and many more areas.
This is the home of the Indian Government’s open data. Find data on various industries, climate, health care, and more.
Find data published by the United Kingdom central government, local authorities, and public bodies to help you build products and services.
The European Union Open Data Portal (EU ODP) gives you access to open data published by EU institutions and bodies.