The Many Meanings of ‘Open’
Open Data, Open Source, and Open Standards
Over the past decade, much has been said about open data, open source software, and open standards. So much, in fact, that many people have begun to use the terms interchangeably. But open data, open source, and open standards are not synonymous and should not be conflated.
The confusion poses a challenge for many organizations, particularly those that work on global issues and seek out “open” digital solutions but lack technological expertise. In this article, we define the parameters of open data, open source, and open standards, and identify the key differences between them.
The Open Knowledge Foundation defines open data as “data that can be freely used, shared and built-on by anyone, anywhere, for any purpose.” Such data are shared under an open license, such as a Creative Commons public copyright license.
A key principle of open data is that the data must be available and accessible in an easily digestible format. Unfortunately, in the case of open data, “open” does not always mean “accessible.” Often it simply means that the data are available for anyone to use, provided you know whom to ask. Making data accessible instead of merely available requires actively identifying and removing the barriers that prevent people from acquiring it. In March 2016, a consortium of scientists and organizations published the “FAIR Guiding Principles for Scientific Data Management and Stewardship.” The FAIR Data Principles are a set of guiding principles for making data findable, accessible, interoperable and reusable (Wilkinson et al., 2016).
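As a loose illustration of the four FAIR facets, the following sketch checks a dataset's metadata record for gaps. The `DatasetRecord` type, its field names, the one-field-per-facet mapping, and the sample identifier are all assumptions made for this example; the published FAIR principles are considerably more detailed than this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: Optional[str] = None   # e.g. a persistent ID such as a DOI (findable)
    access_url: Optional[str] = None   # where the data can be retrieved (accessible)
    data_format: Optional[str] = None  # e.g. "CSV" or "GeoTIFF" (interoperable)
    license: Optional[str] = None      # e.g. "ODbL-1.0" (reusable)


def fair_gaps(record: DatasetRecord) -> List[str]:
    """Return the FAIR facets for which the record carries no metadata."""
    checks = {
        "findable": record.identifier,
        "accessible": record.access_url,
        "interoperable": record.data_format,
        "reusable": record.license,
    }
    return [facet for facet, value in checks.items() if not value]


# A record with an identifier and a license, but no access URL or format.
# The identifier below is a made-up placeholder, not a real DOI.
record = DatasetRecord(identifier="doi:10.0000/example", license="CC-BY-4.0")
print(fair_gaps(record))
```

A dataset like this one is “available” in the sense that it is openly licensed, yet the missing access URL is exactly the kind of barrier described above.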
Another term that’s often conflated with “open data” is “data democratization.” The two are not necessarily equivalent; it depends on the context. When commercial data suppliers use the term, they typically mean that data are persistent, easily available and easily consumable, but not free. When governments use the term, however, they typically imply that the data should be free.
One example of open data is OpenStreetMap (OSM), whose data are licensed under the Open Data Commons Open Database License. In OSM, data are contributed by a global community of mappers. While each individual contributor holds the copyright to the data they contribute, they license it to the OSM Foundation, which publishes it under its open license policy.
Other examples of open data are the joint NASA/USGS Landsat program and the European Space Agency’s (ESA) Copernicus program, both of which provide open imagery of Earth’s land surface as well as atmospheric data. The data are made available under agency policies, such as NASA’s Data and Information Policy and ESA’s Data Policy for Earth Explorer missions.
Published and accessible open data can be consumed by any interested party. For example, OSM data are used by private companies, such as Mapbox, that provide geodata software and services to their customers.
In the case of open Earth observation data furnished by government agencies like NASA, one consumer is Global Forest Watch, an online platform that provides data and tools for monitoring forests worldwide. Commercial companies like Google also use open government data in their products.
Open source software is software whose source code can be publicly viewed, shared and modified. It is typically distributed under a license that gives users the right to modify it. The global nonprofit Open Source Initiative sets out 10 criteria that a license must meet for software to be classified as open source, including free redistribution of the software and technology neutrality. The license must also allow modifications and derived works, and must allow them to be distributed under the same terms as the original software’s license.
Open source software is frequently developed using a collaborative model. For example, the web browser Google Chrome is built on Chromium, an open source project developed by a community of developers, including those working for Google. Chromium is the base version that anyone can build on. Google did so, adding features of its own and releasing Chrome as a proprietary product. Microsoft’s Edge browser is also based on Chromium.
Another piece of open source software is QGIS, a free-to-use application that allows anyone to visualize and analyze geospatial data on all major operating systems. QGIS is released under the GNU General Public License (GPL), which allows users to share and adapt its source code for any purpose, including commercial ones, or to use the application in its baseline version.
It is important to note that open source software is not always “free” software. The difference is in the licensing and the level of effort required to customize the code for your use case. According to GNU founder and software freedom advocate Richard Stallman, “free” does not mean free of charge but rather that “users have the freedom to run, copy, distribute, study, change and improve the software” for any purpose. (“This is a matter of freedom, not price, so think of ‘free speech,’ not ‘free beer,’” Stallman says.) One also has the freedom to sell the software after modifying it. Nor is open source software free in terms of cost of ownership: implementing it inside a business enterprise frequently requires customization for the organization’s workflow. Whether this customization is done using internal resources or with the help of external consultants, it typically is not free, nor is the subsequent maintenance of the software.
Successful open source software is designed and built through a collaborative, community-driven development process that releases frequent updates to improve functionality and reliability. The key is community adoption and development.
Open standards are specifications designed to enhance interoperability and maximize utility between digital systems. Like open data, open standards should be publicly available to any person or organization, free of cost. Furthermore, the data formats they define should be independent and free from legal clauses that may limit their adoption.
“A standard is like a blueprint that guides people who build things,” experts from the Open Geospatial Consortium and the Open Source Geospatial Foundation write in a joint white paper. That means that open standards should include detailed plans and instructions that anyone can follow to implement them.
Developing standards requires a global community, spanning all sectors, that works together to create specifications and collectively agrees upon requirements.
A good example of an open standard is the HTML Living Standard, which is developed by experts working for major browser companies like Apple, Google, Microsoft, and Mozilla. HTML works alongside other open standards such as the Hypertext Transfer Protocol (HTTP), which is maintained separately by the Internet Engineering Task Force and allows a web user to request content on the web, for example by following a hyperlink.
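To illustrate how an open specification acts as a blueprint that anyone can follow, the sketch below builds a minimal HTTP/1.1 GET request by hand. The function name and the `example.org` host are assumptions made for this example, and the sketch only constructs the request text without sending it.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line ends the header section
        "",
    ]
    # The specification requires CRLF line endings between these elements.
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("example.org", "/index.html").decode("ascii"))
```

Because the message format is published openly, a browser, a command-line tool and a few lines of code like these can all produce requests that the same server understands.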
An open standard of note that will positively impact the geospatial community is the SpatioTemporal Asset Catalogs (STAC) specification, a more mature and stable version of which was recently released. Essentially, STAC enables geospatial datasets to be searched and discovered across various archives.
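As a rough sketch of what such a search can look like, the code below assembles the JSON body of a STAC API item search using the `bbox`, `datetime`, `collections` and `limit` parameters. The helper name, the collection ID and the coordinates are placeholders chosen for this example, the sketch handles only two-dimensional bounding boxes, and no request is actually sent.

```python
import json
from typing import Any, Dict, Sequence


def build_stac_search(
    bbox: Sequence[float],
    datetime_range: str,
    collections: Sequence[str],
    limit: int = 10,
) -> Dict[str, Any]:
    """Build the JSON body for a STAC API item search (2D bounding box only)."""
    if len(bbox) != 4:
        raise ValueError("bbox must be [west, south, east, north]")
    west, south, east, north = bbox
    if south > north:
        raise ValueError("south latitude must not exceed north latitude")
    return {
        "bbox": list(bbox),                # area of interest in longitude/latitude
        "datetime": datetime_range,        # RFC 3339 instant or "start/end" interval
        "collections": list(collections),  # which collections to search
        "limit": limit,                    # maximum number of items per page
    }


# Placeholder values: the collection ID is hypothetical, not from a real catalog.
body = build_stac_search(
    bbox=[-10.0, 35.0, 5.0, 45.0],
    datetime_range="2021-01-01T00:00:00Z/2021-12-31T23:59:59Z",
    collections=["example-collection"],
    limit=5,
)
print(json.dumps(body, indent=2))
```

A body like this would typically be sent as a POST request to a catalog's `/search` endpoint; because every conforming catalog accepts the same parameters, the same query can in principle be reused across archives.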
Clearly, the future of “open” technology is bright. To make that future as powerful and as prosperous as it can be, however, organizations both within and beyond the geospatial community must understand the differences between open data, open source, and open standards, and use them strategically.
- A Key Difference Between Open Data and Open Source
- NASA’s Data and Information Policy
- Open Source vs. Open Standards
- Open Source vs. Open Standards: Know the Difference
- Open Standards vs. Open Source: A Basic Explanation
- Open Standards vs. Open Source: Why So Confused While Choosing a Tech Stack?
- The FAIR Guiding Principles for Scientific Data Management and Stewardship
- The Real Meaning and Process of Data Democratization
- Use Open Standards, Open Data, Open Source and Open Innovation
- What is Open Source?
- Open Source or Open Standards? (Yes!) The Future has Arrived