netCDF is commonly used in environmental applications from climatology to oceanography, and more recently in the geosciences. If you’ve ever used it, you’ll know that there are lots of tools available. A feature of the user community is that many have adopted conventions to encode certain information, e.g. the Climate & Forecast (CF) Metadata Conventions and the Attribute Convention for Data Discovery (ACDD). Data encoded using conventions like CF and ACDD contain a lot of useful metadata that could be leveraged to provide enhanced discovery, integration and understanding using Semantic Web technologies, but right now that information is locked away in individual files because nothing in it links to anything outside the file.
Conventions like CF and ACDD are not likely to be sufficient to cover all possible vocabularies and schemas for data stored in netCDF files for all domains, nor should they try to be. Also, other relevant vocabularies and schemas exist, are maintained by various institutions and communities, and have their own lifecycle and usage. Without a mechanism that allows different vocabularies and schemas to play nicely together, data authors are often forced to choose between conventions or end up presenting users with a confusing and possibly conflicting mix of namings and interpretations.
netCDF-LD is an approach for constructing Linked Data descriptions using the metadata and structures found in netCDF files. Linked Data is a method of publishing structured data on the web so that it can be interlinked and become more useful through semantic queries. It uses the W3C Resource Description Framework (RDF) standard to express the information and relationships. netCDF-LD enhances netCDF metadata, enabling information found in netCDF files to be linked with published conventions and controlled vocabularies used to express the content.
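To make the idea concrete, here is a minimal sketch of turning netCDF header attributes into RDF statements, written out as N-Triples. It is not the netCDF-LD tooling: the `to_ntriples` helper, the `example.org` URIs and the sample attributes are all assumptions made up for illustration, and a plain Python dict stands in for the attributes you would read from a real file.

```python
def to_ntriples(subject_uri, attributes, predicate_base):
    """Render each attribute as one N-Triples line with a plain string literal."""
    lines = []
    for name, value in sorted(attributes.items()):
        # Escape backslashes, quotes and newlines so the literal stays valid.
        literal = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        lines.append(f'<{subject_uri}> <{predicate_base}{name}> "{literal}" .')
    return "\n".join(lines)


# Hypothetical global attributes, as they might appear in a netCDF header.
global_attrs = {
    "Conventions": "CF-1.6, ACDD-1.3",
    "title": "Sea surface temperature",
}

# Hypothetical URIs: one identifying the file, one base for attribute names.
print(to_ntriples(
    "http://example.org/data/sst.nc",
    global_attrs,
    "http://example.org/def/",
))
```

Each line of output is one subject-predicate-object statement about the file. Once attribute names are URIs rather than bare strings, statements from different files and vocabularies can be loaded into the same graph and queried together.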
netCDF-LD is being developed by a working group made up of members from CSIRO (myself included), the UK Met Office, NOAA, NCI, Geoscience Australia and other organisations. The working group is contributing to a stream of activity under the Advancing netCDF-CF for the Geoscience Community project. Participation is open to the public, and we’ve been having regular telecons to work out the pieces that will form netCDF-LD.
A set of GitHub repositories serves as the main collaboration site for this work. The working group’s current focus is to define a specification for prefixing and aliasing values in netCDF metadata headers, so that Linked Data descriptions can be derived from them and joined up with the broader Linked Data cloud. The group is also developing tools in parallel to demonstrate that this approach can co-exist with current practices while adding value by giving users enhanced views of, and access to, netCDF data within a given file and across data repositories. In this way, it provides users with views of netCDF metadata that can be linked with other netCDF metadata and related information on the Web.
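As a rough illustration of what prefixing and aliasing could look like, the sketch below resolves attribute names to full URIs. The specification is still being worked out, so everything here is an assumption: the `expand` helper, the double-underscore separator, the `example.org` URIs and the sample attributes are hypothetical and not taken from the working group's drafts.

```python
def expand(name, prefixes, aliases, sep="__"):
    """Return a full URI for an attribute name, or None if it cannot be resolved."""
    # An alias maps a bare, familiar name straight to a URI.
    if name in aliases:
        return aliases[name]
    # Otherwise look for "<prefix><sep><local name>" with a declared prefix.
    prefix, found, local = name.partition(sep)
    if found and prefix in prefixes:
        return prefixes[prefix] + local
    return None


# Hypothetical prefix and alias declarations (the separator is an assumption).
prefixes = {"cf": "http://example.org/cf/"}
aliases = {"standard_name": "http://example.org/cf/standard_name"}

# Hypothetical variable attributes from a netCDF header.
variable_attrs = {
    "cf__units": "K",
    "standard_name": "sea_surface_temperature",
    "comment": "no mapping declared",
}

for attr_name in variable_attrs:
    print(attr_name, "->", expand(attr_name, prefixes, aliases))
```

Names without a declared prefix or alias resolve to nothing and can simply be left as they are, which is one way this kind of approach can sit alongside existing files and tools without breaking them.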
Jim Biard at NOAA on behalf of the working group is presenting at the AGU Fall Meeting 2016 next week on Linking netCDF Data with the Semantic Web — Enhancing Data Discovery Across Domains. Look him up if you’re going to be there and check out our poster.
Drop me a line if you’re interested to learn more. Else, catch us online at http://tinyurl.com/netcdf-ld.