Datasets Digest | Spring 2017

Sanjana Jhaveri
Published in data.world
Jun 1, 2017

Spring was an exciting time here at data.world, with featured datasets ranging from SXSW Twitter Traffic and March Madness Predictions to World Happiness Reports for the United Nations. We hope you’ll get a chance to dive in and draw your own conclusions from the data that makes us tick.

Sports

NCAA Men’s March Madness: Men’s March Madness historical results, 1985–2015

Tableau Viz by data.world member Rody Zakovich

2017 March Madness Predictions: Forecast data for the 2017 men’s and women’s March Madness tournaments, plus team rankings

NCAA Tournament Results: Every NCAA tournament game result since 1985 (when the tournament was expanded to the 64-team bracket)

Major Sports Venues Usage: Teams or events associated with 12 major sports leagues

Toughest Sport by Skill: 60 sports ranked across 10 skill categories by an ESPN panel to determine the most difficult sport

NBA Salaries: Salaries of NBA players from 1990 to 2016

Science & The Environment

HD6D LIDAR High Speed Descent: NASA’s HD6D LIDAR for High Speed Descent Mapping Project

Global Footprint Network National Footprint Accounts: National Footprint Accounts (NFAs) measure the ecological resource use and resource capacity of nations from 1961 to 2013

Marches for Science, Domestic Crowd Sizes: Estimated crowd sizes for marches in approximately 200 cities in the U.S.

US National Parks Visitation 1904–2016: All United States National Parks from 1904 to 2016, with geographical boundary lines and visitation numbers by year

Chlamydia by State: Rates of chlamydia by state, 2000–2015

Society

Open Sourcing Mental Illness: Data on the prevalence of mental health conditions, and attitudes towards mental health, among tech workers

Social influence on shopping: A survey of 2,676 millennials asking which social platform has influenced their online shopping the most

Teen fake news poll on After School: BuzzFeed partnered with After School to ask 39,000 high school students about their opinions on fake news. Here’s what they said.

Cat vs. dog popularity in the U.S.: Dog and cat population and household ownership, broken down by state, via the American Veterinary Medical Association

Viz by data.world member Andrew Duff (@datanerd)

2017 SXSW Twitter Traffic: A collection of all tweets that mention #sxsw or SXSW

Missing children in the US: The National Center for Missing and Exploited Children (NCMEC) list of missing children across the US

J.K. Rowling tweets and retweets: 10,159,892 identifiers for tweets and retweets sent by or to J.K. Rowling (@jk_rowling)

Every Donald Trump tweet: Whether you’re politically on the right or on the left, dig into the data for this challenge and tell us what you think!

Stand-up on Comedy Central: All episodes in 15 seasons of Comedy Central Presents, a stand-up comedy series that featured 260 comedians

Barks for Beers: Chronicling visits to 30 Austin breweries for Divine Canines’ “Barks for Beers” fundraiser

Mock presidential election poll for teens: A mock 2016 presidential election poll taken mid-October by over 100,000 teens in the United States on the After School App

Washington Post police shootings: The Washington Post is compiling a database of every fatal shooting in the United States by a police officer

Houston email metadata: Email address metadata from the City of Houston, obtained through FOIA by @chaps on the Sketch City Slack channel

Data for Democracy

Datasets built out by the Data for Democracy community, a diverse group on a mission to democratize data.

Election Transparency: This project analyzes elections to identify trends, outliers, and anomalies, bringing insight and transparency to the democratic voting process

Drug spending: This group is finding ways to make Medicare drug spending data more consumable

Internal displacement: This project aims to classify, tag, analyze, and visualize news articles about internal displacement, and is based on a challenge from the IDMC

Propublica — campaign spending: Analyzing campaign spending data to support the non-profit investigative journalism publication, ProPublica

Propublica — foreign travel: A web scraping and data engineering project around foreign travel expenditures

Propublica — house expenditures: A dataset on House Office expenditures

Economy

United Airlines Data: The data has been selected and analyzed to present a view of the industry and its important trends, as well as to identify fundamental drivers of success and, in some cases, the early signs of potential failure

Beer data: US brewery production of beers & cans, kegs & barrels, and taxes determined

Growth rates of industries through history: Comparison of growth rates of industries, startups and public stocks during times of industry disruption

Special economic zones by country: Creating the world’s first database of all special economic zones, covering their location, value, and size

International

Population, growth rates and population density São Paulo: Population parameters including total population, density, and growth rates, broken out by district in the city of São Paulo, Brazil

The CNS North Korea Missile Test Database: North Korean missile tests since 1984

Viz by data.world member Edoardo Piccari (@edopic)

Indian retail prices: Retail prices of key commodities in India from 1997 to 2015

World Happiness Report: The first World Happiness Report was published in April 2012, in support of the UN High Level Meeting on happiness and well-being

Mines in Africa: Number of mineral mines (total and by commodity) for 5,835 African ADM2 units

Stay tuned for our next Datasets Digest compilation! If you liked this Digest summary, we encourage you to subscribe to our weekly Datasets Digest email and share your favorite datasets with friends, family, and data enthusiasts alike.

Data work is much easier when everyone can contribute to it. Learn how to use data.world to collaborate with your teammates on your data projects.
