Data Digest | Summer’s end 2017

Sanjana Jhaveri
Published in data.world
5 min read · Sep 13, 2017

As the seasons change, try practicing your exploratory analysis skills on the featured datasets and Data Projects that kept us cool through the summer! Gather insights on the Trump administration, compare literacy rates in India’s Telangana, and collaborate with others on a Bigfoot sightings Data Project. The volleyball is in your court.

Sports

Liverpool English League Matches — Liverpool Football Club’s English League results from 1893 to 2016

Viz by data.world community member Ryan Estrellado (@ryanes)

Indian Premier League Matches — IPL cricket data exported to CSV files from SQL server (577 matches up to season 9)

NASCAR Champion History (1949 — Present) — 67 years of NASCAR Champion Season History via Wikipedia

Does Pro-athlete Fame Correlate to Results? — Data Project to calculate win/loss ratios for players and compare them to the various stats in the ESPN Fame 100

NFL Data — Data Project to identify trends for fantasy football purposes

Media

Podcasts Dataset — Podcast episodes published between 2007 and 2016

Michael Phelps vs. a Shark — Data Project to compare shark speed times with Phelps’s Olympic times, visualize and synthesize the data, and predict a final victor for this year’s Shark Week race

Do the Best Movies on Netflix Pass the Bechdel Test? — A movie passes the Bechdel test if it has two female characters who talk to each other about something other than men. This project evaluates the likelihood of movies passing the Bechdel test across various factors (budget, ratings, genre, etc.)

Fox News Facebook Shares vs. Likes — This study analyzes the impact a single Facebook share has on a Facebook post

Viz by data.world community member Chase Willden (@chasewillden)

Rolling Stone’s 100 Greatest Metal Albums of All Time — Rolling Stone magazine’s all-encompassing list of the greatest metal albums of all time

International

Literacy Rates in Telangana — The number of literates among males and females and their literacy rates in each of the districts

Viz by data.world community member Sumendar Karupakala (@sumendar)

Suicides in India — Number of suicides that happened in India by state from 2000 to 2012. Includes detail on social status, education status, and professional profile of those who died

US Immigration Enforcement — Numbers of immigrants apprehended, removed, or returned by US DHS (CBP, ICE) yearly from 1925–2015

Economy

The Essential Landscape of Enterprise A.I. Companies — Companies that use a wide range of AI and machine learning technologies, from computer vision to NLP / NLU

Stock Facts — Stock Market Facts Combined with Board Members

Viz by community member Rahul Singh (@rahul0404)

Fortune 500 Diversity — Every Fortune 500 company’s 2017 diversity data, or lack thereof

Occupations by State and Likelihood of Automation — 702 SOC (Standard Occupational Classification) jobs, their likelihood of automation, and the number of jobs per State

Post-school Earnings Summary — Over 7,700 rows that detail college name, race percentage, median income, etc.

US County Economic Data Compiler — Data Project to organize, reformat, and create intuitive geographic displays of U.S. county economic data

Politics

White House Salaries — CSV scraped from the 16-page PDF detailing the salaries of Trump administration employees

Party Representation — Data Project investigating “How well does our government represent the people based on their party affiliation?”

Healthcare

Medical Discharge Rates by State — Selected medical discharge rates by state from 1992 to 2015 via The Dartmouth Atlas

Fentanyl Dispensations in New Jersey — Fentanyl dispensations made by New Jersey pharmacies from 2011 through early 2017

NJ statewide overdose deaths 1999 to 2016 — Includes total deaths, heroin deaths, and fentanyl deaths

Other

Bigfoot Sightings — Full text and geocoded sighting reports from the Bigfoot Field Researchers Organization (BFRO)

Viz by data.world’s Noah Rippner (@nrippner)

ANSUR II — @datamil’s ANSUR II database contains 3D whole body, foot, and head scans of soldier participants. The data from this survey are used for a wide range of equipment design, sizing, and tariffing applications within the military

Homelessness Point-in-Time Estimates — National Point-in-Time (PIT) estimates of homelessness, national estimates of homelessness by state, and estimates of chronic homelessness from 2007–2016

NICS Firearm Background Checks — Monthly data from the FBI’s National Instant Criminal Background Check System, converted from PDF to CSV

Are dog size and intelligence linked? — Data Project getting to the bottom of a very urgent question

Vega Viz by data.world’s Sharon Brener (@sharon)

LARA Hotel Reviews — A LARA (latent aspect rating analysis) of Datafini’s open hotel review data

Federal Food Desert Programs — Data Project combining USAspending data with other datasets to identify communities that need support

2017 Total Solar Eclipse Map and Shapefiles — Shows the path of the Moon’s umbral shadow during the total solar eclipse on August 21, 2017

Sunsquatch Challenge — “There are no more eclipse maps to make”… the internet accepted the challenge

Viz by Joshua Stevens

Future Asteroids — All known future asteroids poised to pass near Earth, some being potentially hazardous objects

Tutorials

Python Data Wrangling Tutorial — 5 useful data wrangling techniques using Python Pandas and data.world

SPARQL Tutorial — Learn SPARQL by practicing with data about twelve important people from George R. R. Martin’s Game of Thrones

Titanic Disaster Dataset — Data for exploratory analysis and building binary classification models to predict survival among Titanic passengers

What to put in data.world — A Data Project with examples of the four types of data and context you can put in data.world

Introduction to SQL functions and GROUP BY — Introduces SQL functions and then performs aggregations via the GROUP BY clause
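For a quick taste of what the SQL tutorial covers, here is a minimal sketch of aggregate functions combined with GROUP BY. It is not taken from the tutorial itself: the `downloads` table, its columns, and the sample rows are all hypothetical, and it runs against an in-memory SQLite database from Python's standard library rather than against data.world.

```python
import sqlite3

# Hypothetical sample rows (category, download count); not from any real dataset.
ROWS = [
    ("Sports", 120),
    ("Sports", 80),
    ("Media", 50),
    ("Politics", 30),
    ("Media", 70),
]


def summarize(rows):
    """Return (category, row_count, total, average) for each category."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE downloads (category TEXT, n INTEGER)")
        conn.executemany("INSERT INTO downloads VALUES (?, ?)", rows)
        # COUNT, SUM, and AVG are aggregate functions; GROUP BY makes them
        # produce one result row per distinct category.
        query = """
            SELECT category, COUNT(*), SUM(n), AVG(n)
            FROM downloads
            GROUP BY category
            ORDER BY category
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


summary = summarize(ROWS)
for category, row_count, total, average in summary:
    print(category, row_count, total, average)
```

The same SELECT ... GROUP BY pattern should carry over to other SQL engines, though function names and type handling can differ between them.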

Stay tuned for our next Data Digest compilation! If you liked this Digest summary, we encourage you to subscribe to our weekly Data Digest email and share your favorite datasets with friends, family, and data enthusiasts alike.

Data work is much easier when everyone can contribute to it. Learn how to use data.world to collaborate with your professional teammates on your data projects here.
