Data catalogs. Part 2. Data and metadata standards

Ivan Begtin
3 min read · Jun 6, 2022


Photo by Markus Spiske on Unsplash

There are a lot of data standards covering data science, open data, scientific data, digital assets as data, and so on. It’s nearly impossible to cover all of them at once, so I’ve collected a list of the most important ones.

Data catalogs and metadata publishing standards

Data Cataloging Standards

Geodata publishing standards

API Publishing Standards

Statistical publication standards

  • SDMX https://sdmx.org — international standard for publishing official statistics
  • Data Documentation Initiative https://ddialliance.org/ — international standard for publishing surveys and statistics

Object Description Standards

  • Schema.org https://schema.org/ — a set of standards for describing objects on web pages for indexing by search engines
  • Ontology registry http://vocab.linkeddata.es/ — numerous ontological descriptions of subject areas
  • Semantic data types registry https://registry.apicrafter.io — a registry of semantic data types used to detect PII, other kinds of identifiers, and dictionary-based types
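To make the Schema.org entry more concrete, here is a minimal sketch of how a dataset page might describe itself for search engines. It builds a JSON-LD document using the Schema.org `Dataset` and `DataDownload` types and wraps it in the script tag that is embedded in a web page. The helper names and all `example.org` URLs are hypothetical and only serve the illustration.

```python
import json


def dataset_jsonld(name, description, url, download_url, media_type):
    """Build a Schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": media_type,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(doc):
    """Wrap a JSON-LD dictionary in the script tag used inside an HTML page."""
    body = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical dataset; every value below is a placeholder.
example = dataset_jsonld(
    name="Sample dataset",
    description="A placeholder dataset used to illustrate the markup.",
    url="https://example.org/datasets/sample",
    download_url="https://example.org/datasets/sample.csv",
    media_type="text/csv",
)
tag = to_script_tag(example)
print(tag)
```

A data catalog would typically generate a block like this for every dataset page, so that crawlers can pick up the title, description, and download links without parsing the page layout.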

Universal Standards

  • Data Package (Frictionless Data) https://frictionlessdata.io/ is an actively developed and widely implemented standard for describing and publishing data as standardized data packages. It comes with a large number of data preparation tools
  • Network Common Data Form (NetCDF) https://www.unidata.ucar.edu/software/netcdf/ — scientific data publishing standard used since the late 80s
  • BagIt https://tools.ietf.org/html/rfc8493 — RFC 8493 standard for packaging digital objects. Actively used by government and academic archives, such as DataONE and the US Library of Congress
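As an illustration of the Data Package idea from the list above, here is a rough sketch that builds a `datapackage.json`-style descriptor for a small CSV file using only the Python standard library. The descriptor layout (`name`, `resources`, `path`, `schema.fields`) follows the Data Package specification, but the helper functions, the sample CSV, and the very naive type guessing are assumptions made for this example. The Frictionless tooling does this far more thoroughly.

```python
import csv
import io
import json

# Made-up sample data, standing in for a real CSV file on disk.
CSV_TEXT = "id,name\n1,alpha\n2,beta\n"


def infer_type(values):
    """Naive guess: 'integer' if every value parses as int, else 'string'."""
    if not values:
        return "string"
    try:
        for value in values:
            int(value)
    except ValueError:
        return "string"
    return "integer"


def describe_csv(name, path, text):
    """Describe one CSV resource: its path plus a schema with typed fields."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)
    fields = []
    for index, column in enumerate(header):
        values = [row[index] for row in rows]
        fields.append({"name": column, "type": infer_type(values)})
    return {
        "name": name,
        "path": path,
        "format": "csv",
        "schema": {"fields": fields},
    }


def build_package(name, resources):
    """Assemble the top-level descriptor that would be saved as datapackage.json."""
    return {"name": name, "resources": resources}


package = build_package(
    "sample-package",
    [describe_csv("items", "data/items.csv", CSV_TEXT)],
)
print(json.dumps(package, indent=2))
```

The point of the descriptor is that the data file stays a plain CSV, while the column names and types travel alongside it in a machine-readable form.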

Industry Data Standards*

*There are quite a few industry standards, so only the most notable ones are listed here.

Data Standards Groups in Government


Ivan Begtin

I am the founder of APICrafter. I write about data engineering, open data, data in general, the modern data stack, and open government. Join my Telegram channel https://t.me/begtin