Signal 3: West Coast vs. East Coast, Open Data Edition

Martha
Civic Analytics 2018
Sep 29, 2018

Building on last week’s thoughts on the limitations of New York’s Open Data system, I thought it would be interesting to explore the potential of a true data catalog for New York City. New York City agency data is siloed across multiple systems with no central catalog. While the Open Data catalog enables discovery of specific public datasets, it’s nearly impossible to search for a field name across multiple datasets, or to search within metadata or data dictionaries to find datasets that could be linked together through a common field. And outside of Open Data, agencies have limited ways to share this information with one another.
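To make the missing capability concrete, here is a minimal sketch of a cross-dataset field search. Everything in it is an assumption for illustration: the dataset names, the column lists, and the helper functions are hypothetical and do not reflect real NYC Open Data schemas or any catalog product's API. It simply builds an index from field names to the datasets that contain them, then reports which fields could serve as join keys.

```python
from collections import defaultdict

# Hypothetical catalog metadata: dataset name -> column names.
# Illustrative only; these are not real NYC Open Data schemas.
CATALOG = {
    "building_permits": ["BBL", "Permit Type", "Issue Date"],
    "housing_violations": ["bbl", "Violation Class", "Inspection Date"],
    "street_trees": ["tree_id", "Species", "Borough"],
}


def normalize(field):
    """Lowercase and turn spaces/hyphens into underscores so 'Issue Date' matches 'issue_date'."""
    return field.strip().lower().replace(" ", "_").replace("-", "_")


def build_field_index(catalog):
    """Map each normalized field name to the set of datasets that contain it."""
    index = defaultdict(set)
    for dataset, fields in catalog.items():
        for field in fields:
            index[normalize(field)].add(dataset)
    return index


def datasets_with_field(index, field):
    """Return the sorted names of datasets that contain the given field."""
    return sorted(index.get(normalize(field), set()))


def linkable_fields(index):
    """Return fields that appear in more than one dataset (candidate join keys)."""
    return {f: sorted(ds) for f, ds in index.items() if len(ds) > 1}


index = build_field_index(CATALOG)
print(datasets_with_field(index, "BBL"))
print(linkable_fields(index))
```

A real catalog would also need to handle fields that mean the same thing under different names, which is where human annotation comes in.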

It’s such a beautiful silo though.

San Diego, however, is working with an interesting tool for its data catalog. The tool sits on top of the city’s databases and allows for automated data cataloging, with human annotation and information sharing for queries and metadata (Alation: Product). The city has used it to aid in data cleaning “(because friends don’t let friends clean data over and over manually)” and to document its datasets (Pecherskiy, 2016).

Things that San Diego has that we don’t: a functional data catalog, palm trees.

I think a tool like the one San Diego is using would go a long way toward increasing the utility and discoverability of our city’s datasets, both open and administrative. While there would certainly be hurdles in terms of data privacy and sandbox issues, linking data across agencies could lead to better use of existing data assets and more effective delivery of services.

References

Alation: Product. Retrieved from https://alation.com/product/

Pecherskiy, M. (2016, November 1). StreetsSD. Retrieved from https://medium.com/datasd/behiind-the-scenes-of-streetssd-f0c7a8c59708
