Data, Data, Everywhere
The datasets of the novel coronavirus and COVID-19 pandemic.
I think it is probably fair to say that we have all become a little obsessed with the novel coronavirus. As a data scientist, I have been amazed to see how quickly organisations and researchers from around the world have pulled together a vast treasure trove of relevant information, from scientific publications that might otherwise have remained out of reach behind publisher paywalls to a wealth of deep and detailed datasets.
In an effort to learn what I can about the pandemic, I plan to produce a series of data science articles as I work through some of the more accessible and interesting research questions that can be asked of these datasets. The code and data that I will be using should be available from GitHub.
The main objective of this short article is to provide links to a growing list of relevant data and repositories. I’ll try to keep it up to date and as organised as I can over the coming weeks. If you have any suggestions about what’s missing, please let me know by replying to this article.
- European Centre for Disease Prevention and Control
- Johns Hopkins Coronavirus Resource Center
- Kaggle Novel Coronavirus 2019 Dataset
- The Coronavirus Tech Handbook
- Dataset of COVID-19 related tweet-id
- COVID-19 Image Dataset (X-Rays/CT)
- WHO COVID-19 Situation Reports
- Worldometers Coronavirus Data
- Tableau COVID-19 Datahub
- ACAPS COVID-19: Government Measures Dataset
- COVID-19 Global Travel Restrictions and Airline Information
- COVID-19 Testing by Country
- Global School Closures COVID-19
- Novel Coronavirus Information Center
- Google’s BigQuery Public Dataset Programme
- Global Health Security Index
- Containment Measures
What am I missing? Please do let me know by replying to or commenting on this blog post with a URL.