Data Clinic
Published in Data Clinic

The Hidden Stories of New York as Told Through Open Data

Data Clinic reflects on some of our favorite NYC Open Data projects to celebrate NYC Open Data Week 2022

Two Sigma was founded in downtown Manhattan in 2001, and the company and much of its community are proud to call New York City home. Our history is intertwined with the city’s, and we recognize how much we owe to the place where we laid our foundation. Our Data Clinic team has worked hand-in-hand with partners throughout the city for a number of years to give back to our community, and more recently has begun to build tools to uncover the hidden stories and insights within the city’s open data. As we join our fellow open data enthusiasts at NYC Open Data Week, we reflect on some of these efforts and the opportunities they offer to empower the NYC community:

Subway Accessibility

Inspired by the International Day of Persons with Disabilities, Data Clinic hosted a hackathon on the NYC Subway system’s accessibility. Data Clinic volunteers built subway station maps, cleaned turnstile data, and developed a vital naming convention crosswalk to better understand NYC subway elevator accessibility. Read more.

Reimagining NYC Neighborhoods

NYC’s neighborhoods are a driving force in the lives of New Yorkers: residents’ identities are closely intertwined with their neighborhoods, which are a source of pride. However, the history and evolution of NYC’s neighborhoods don’t follow the rigid, clinical lines of the statistical and administrative boundaries that are often used to identify them, so grouping data at these boundary levels may not be ideal for analysis. The Data Clinic team therefore developed NewerHoods to demonstrate the conceptual and statistical benefits of a data-driven approach to identifying neighborhoods. Read more.

Bullying in NYC Schools

As part of Open Data Week 2018, the Data Clinic team hosted a “State of Open Data on School Bullying and Harassment” event, which featured a comparative analysis of federal and local datasets, followed by a panel discussion on what open data can reveal, and conceal, about this important school safety issue. Notably, the team showed that fewer than a quarter of the city’s 1,700+ public and charter schools reported even a single incident of bullying or harassment to the US Department of Education’s Office for Civil Rights in a given year. Read more.

Subway Ridership Trends

NYC’s subway system is the heartbeat of the city, moving millions of people between their homes, schools, and workplaces each day. Although the pandemic altered much of city life, the subway system continued to provide transportation to those who still needed to go in to work every day. The Data Clinic team used our open source tool SubwayCrowds to estimate subway crowdedness and to visualize average weekday ridership at different stops over the course of a day during the pandemic. Read more.

New York City’s Loudest Holiday

From spectacular fireworks on the 4th of July to the crowds of Times Square on New Year’s Eve, New York City knows how to throw a party, especially when people have a few days off! The Data Clinic team dove into the open data to determine which holiday New York City celebrates the most. Read more.

Get Involved

We look forward to continuing to build tools and share analyses that give our beloved city opportunities to grow and to lift up members of our community. Visit the Data Clinic GitHub to learn more about these and other open source projects and to contribute to them.

To join our NYC Open Data Week workshop this Thursday, March 10th, from 11am to 12pm ET, in which we’ll unveil the latest features of our open source open data discovery tool, scout, RSVP here:




As the data- and tech-for-good arm of the financial services company Two Sigma, Data Clinic provides pro bono data science and engineering support to nonprofits and engages in open source tooling and research that contribute to the broader Data and Tech for Good movement.
