Google launches Dataset Search

Arjun G
2 min read | Sep 6, 2018

Google has launched Dataset Search, providing access to millions of datasets from thousands of data repositories on the web, including data published by local and national governments around the world.

Dataset Search works much like Google Scholar. It is available in multiple languages, with support for additional languages coming soon.

Google launches Dataset Search | Photo by Samuel Zeller on Unsplash

“To create Dataset search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way,” wrote Natasha Noy, Research Scientist, Google AI.
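As a rough illustration of what such a description might look like, here is a minimal Python sketch that builds a schema.org `Dataset` record as JSON-LD and wraps it in a script tag for embedding in a page. The property names (`creator`, `datePublished`, `measurementTechnique`, `license`) come from the schema.org vocabulary and map onto the items named in the quote; the helper functions, the dataset, the organization, and the example.org URLs are all hypothetical, and this is not presented as Google's own tooling.

```python
import json


def build_dataset_jsonld(name, description, url, creator,
                         date_published, measurement_technique, license_url):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper for illustration; only the property names are
    taken from the schema.org vocabulary.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # How the data was collected
        "measurementTechnique": measurement_technique,
        # What the terms are for using the data
        "license": license_url,
    }


def to_script_tag(jsonld):
    """Wrap a JSON-LD dictionary in a script tag for embedding in HTML."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# All values below are made up for the example.
example = build_dataset_jsonld(
    name="Example daily rainfall observations",
    description="Daily rainfall totals from a hypothetical station network.",
    url="https://example.org/datasets/rainfall",
    creator="Example Weather Agency",
    date_published="2018-09-01",
    measurement_technique="Tipping-bucket rain gauge",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

script_tag = to_script_tag(example)
print(script_tag)
```

A publisher would place the resulting script tag in the HTML of the dataset's landing page, where a crawler that understands schema.org markup can read it.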

Dataset Search includes data from NASA and NOAA, as well as from academic repositories such as Harvard’s Dataverse and the Inter-university Consortium for Political and Social Research (ICPSR).

“This type of search has long been the dream for many researchers in the open data and science communities,” said Ed Kearns, Chief Data Officer at NOAA. “And for NOAA, whose mission includes the sharing of our data with others, this tool is key to making our data more accessible to an even wider community of users.”

Kearns helped NOAA make many of its datasets searchable in Dataset Search.

“This launch is one of a series of initiatives to bring datasets more prominently into our products. We recently made it easier to discover tabular data in Search, which uses this same metadata along with the linked tabular data to provide answers to queries directly in search results. While that initiative focused more on news organizations and data journalists, Dataset search can be useful to a much broader audience, whether you’re looking for scientific data, government data, or data provided by news organizations,” wrote Natasha Noy.
