Open Data 101: What it is and why you should care about it
In this post we will take a brief look at what Open Data is, the trends driving its growth, why it is important, and some of the main implementation challenges.
This is the first in a series of short articles that will summarize some of the promises, challenges, tools, and outstanding questions related to the rapidly developing field of Government Open Data (Open Data). In future articles, we will take a deeper dive into more specific areas within the Open Data field.
Objective of these articles
These articles are geared towards those who are new to the field and looking for an overview of the trends, issues and challenges of Open Data. They will look at the emerging field of Open Data through a strategy and policy lens rather than a technical one, and they will view the field as a whole rather than diving deep into specific data domains.
During my career, I have gained insight into the common challenges that Government agencies face when rolling out a new system or set of business processes. Over 15 years I have worked on various Government IT Transformation projects in Australia, the UK and the USA. Over these years I have had roles including Development, Consulting and Project Management, on IT projects which have ranged in size from tens of thousands of dollars up to over one hundred million dollars. I have experience in all stages from initial Strategy and goal setting, through to Implementation, and Post Implementation.
Over the past two years I have had an increasing fascination with the world of Open Data and its related fields. I have read hundreds of articles and reports, participated in hackathons, and helped peer review publications in the field. I have spent a lot of time exploring this field and trying to grasp how it all fits together. In writing these articles, I am hoping to encapsulate what I have learned and then to contribute new thoughts and ideas to the field.
What is Open Data?
Depending on who you ask, you may get varying definitions of what Open Data is, as different organizations may have different motivations for opening up data in the first place. For example, whereas one organization might be focused on innovation, another might be focused on improving transparency or reducing Freedom of Information (FOI) costs. However, there are some generally agreed-upon principles common to most definitions.
In “The Global Impact of Open Data”*, the authors analyzed some of the more prominent definitions of Open Data used around the world. From this analysis they crafted a definition that synthesizes the key aspects of the definitions used across the field:
“Open data is publicly available data that can be universally and readily accessed, used, and redistributed free of charge. It is structured for usability and computability.”
To break down this definition:
- “… publicly available data …”
In many areas, data may previously have been available to the public, but it was costly or hard to access and use.
- “… can be universally and readily accessed, used, and redistributed free of charge…”
The data can be used and redistributed without any licensing costs. There are no lengthy retrieval processes and no administrative FOI costs.
- “… it is structured for usability and computability.”
Open Data is data that is presented in a user-friendly format, using common standards. Anyone who has any experience with data science will know how much time can be spent on cleaning or structuring data so it can be analyzed.
Another useful definition is provided in the Open Data Charter (www.opendatacharter.net), which builds upon a set of principles developed by a G8 working group and sets out the following six principles of Open Data.
- Open by default
- Timely and comprehensive
- Accessible and usable
- Comparable and interoperable
- For improved governance and citizen engagement
- For inclusive development and innovation
What has been driving the growth of Open Data?
Government sharing data is not new. Government has been collecting public information and making it available for a long time**. What is new is the scale and scope of the data being opened up, as well as how it is being released and made much more accessible. The recent and seemingly exponential growth in the field has been driven by technology changes which allow easier storage, manipulation and presentation of data. Some of the most significant technological changes are listed below:
- Exponential growth in computer processing power and reduction in data storage costs. This allows massive amounts of data to be digitized, copied, and manipulated.
- Sharp growth in the uptake of smartphones, and parallel growth in application marketplaces. Mobile access to data and information has increased. There are now more phones than people on the planet, and millions of applications available for download.
- Growth of the Internet of Things (IoT). Data captured in real time and in a machine-friendly format by IoT devices has created Open Data opportunities. We are only seeing the start of this trend. IoT will have a big impact in the future.
- Software that makes extracting value from data easier. Two burgeoning areas of new software are Data Science and Data Visualization. Parallel to this is an explosion of training available in these fields and more jobs opening up.
Parallel non-technological trends which are driving the advancement of the Open Data field include:
- Proactive funding by large philanthropic organizations. Funding from organizations such as Bloomberg Philanthropies, The Knight Foundation, Omidyar Network and many others has led to research, a proliferation of publicly available resources, and the development of international Open Data organizations. These organizations have also given grants directly to government entities to open up their data.
- Improved coordination and cooperation between jurisdictions. This leads to improved data standards and knowledge sharing between entities.
- Non-profit, community and civic-based technology groups. Barriers to learning new technologies are falling, software developers have new and effective ways to work productively together, and there is a growing movement of civic-minded individuals wanting to improve the communities where they live.
Open Data is here to stay, and will likely continue to grow in prominence in the future as more people become aware of what it is, and why it is important.
Why should we care about Open Data?
Put simply and bluntly, we should care about Open Data because we benefit from it considerably, and as a public good it can benefit almost every individual in one way or another. Check out this fun summary from a blog article by The Open Data Institute: http://theodi.org/blog/its-open-data-day-but-what-is-open-data-and-why-should-we-care
The promised benefits of Open Data, and how far they have been realized, will be covered in a future article. In summary, these benefits can be grouped into three broad categories:
- Individuals benefit in many aspects of their own lives. Many people already benefit directly from Open Data each day, from checking the weather forecast on their smartphone when they wake up to checking public transport times on the way to work. For example, someone may be looking to move into a new neighborhood and want to see what the crime levels are and what education options are nearby, or they might be wondering how their political representatives have been voting on areas of interest.
- Citizens' relationship with their Government (transparency and accountability). Transparency and accountability are cornerstones of any high-performing organization, especially any public organization.
- Government can improve its own performance. Government agencies that share information with each other in an economical and efficient manner can find improvements in their outputs. Ultimately, this means citizens are better served.
Although it is an exciting field that shows a lot of promise, opening up and releasing data does not come without its own challenges.
If Open Data is so great, then why isn’t more government data already Open?
There are two reasons for this. First, opening up data is challenging. Second, Open Data is a nascent field whose potential benefits are not yet widely known and accepted. This article will focus on the first of these two reasons.
The process of opening up agency data can be complex, and this complexity often arises from non-technical challenges. Below is a high-level snapshot of some of the most common stumbling blocks:
- Senior-level political support. As with any IT transformation initiative, the Chief Information Officer (CIO)/Chief Data Officer (CDO) needs steady and public support from top leadership. Change requires effort and resources, and workers are often reluctant to change. Without firm and explicit support, a change initiative can lose momentum and fail.
- Sufficient resources. As with any project, a critical success factor is understanding what resources will actually be required, and then ensuring they are made available. This includes hardware, software, and appropriate staffing (technical, policy, training, external communications, project management).
- Competing priorities. The CIO/CDO can be expected to run the business as usual as well as implement changes. They will need to understand how important the new initiatives are compared to existing activities.
- Data issues (pre-release). Data privacy, data quality, and data completeness issues require more than a one-off effort. They call for ongoing policies and procedures that improve data quality over time and ensure only appropriate data is released.
- Data accessibility and usage. A successful Open Data initiative, depending on its goals, often has much more work to do even after the data has been opened up to citizens. It is critical to inform, engage and work with the public to improve the data and its release cycle on an ongoing basis. There are numerous aspects to this, such as marketing, publicity, training, and establishing feedback loops.
This list is far from exhaustive, and every IT project is different. A future article will take a deeper dive into recommendations on how to prepare for and implement an Open Data initiative.
Where to next?
Open Data is an exciting, rapidly developing, and promising field. Over the coming weeks further articles will do a deeper dive into some of the pertinent questions facing open data, such as:
- Expected benefits of Open Data.
- Challenges of measuring success of Open Data initiatives.
- Implementing Open Data, the main steps and challenges.
- Current state of Open Data around the world.
- Recommended freely available resources to get started with Open Data.
- The role of traditional IT Industry associations and the emerging field of Open Data.
If there is a particular subject you would like this blog to focus on, or if you have any other feedback, please comment below.
References and footnotes
- * Verhulst, S. and Young, A. September 2016. The Global Impact of Open Data: Key Findings from Detailed Case Studies Around the World. Published by GovLab, Omidyar Network and O’Reilly Media.
- ** Two examples where Government has long been releasing data are the weather data that fuels our forecasts and, going back much further, the census, which has collected personal data used for government decision making for centuries.
- Scott, A. What is ‘open data’ and why should we care? February 2015. The Open Data Institute. Retrieved February 2017 from http://theodi.org/blog/its-open-data-day-but-what-is-open-data-and-why-should-we-care
- Manyika, J, Chui, M. et al. Open data: Unlocking innovation and performance with liquid information. October 2013, McKinsey Global Institute.
- The International Open Data Charter. Retrieved February 2017 from www.opendatacharter.net.
- What’s The Value of Open Data? October 2013. GovLab. Retrieved February 2017 from http://thegovlab.org/whats-the-value-of-open-data/
- Headd, M. Open Data Guide. Retrieved March 2017 from http://opendata.guide/
- The Open Data Institute. Guides. Retrieved January 2017 from https://theodi.org/guides