Holding tight to purpose in a changing world

by Paul Stone, Co-chair of the Implementation Working Group

This blog is one of a series of reflection pieces by open data leaders involved in the process to update the principles, designed to inform debate.

There has been a lot of discussion during the Open Data Charter Principles refresh consultation around the role of open data in the dawning age of artificial intelligence, machine learning, and complex predictive models, all loosely referred to as “algorithms”. The question is: can and should the Charter Principles try to cover this arena?

Firstly, we must remember why the Charter and its principles exist in the first place, and what they are trying to achieve. Then we must also remember that principles, in this context, are a foundation for a system of belief or behaviour. They are not a how-to guide, but rather a guiding light.

Why do we have the Charter Principles?

The Charter was developed through a broad participatory process that actively engaged governments and civil society from around the world (read more about its history…).

The Principles were developed

“to represent a globally-agreed set of aspirational norms for how to publish data”.

Through these norms it is hoped that governments around the world will be encouraged to be more open and transparent, and be supported by a community of peers striving for the same.

Why are we all striving for openness and transparency? There are a number of reasons. For many, the first that comes to mind is combating corruption through transparency and accountability. This is extremely important in itself, and for any government that is genuine about serving its people, open data and information are among the best tools for building civil society’s trust in it.

There is more to open government through open data, though: it can also enable more participation at different levels. For a start, open data can lead to a better-informed civil society that is able to contribute more effectively to government decisions.

Another way open data can lead to more participation is through data about government activity. For example, policy consultations, petitions, select committee hearings, and readings of bills in parliament are all government activities in which civil society can participate, but most people don’t, because they don’t know when something they are concerned about is being worked on. If people could register the topics they are interested in (such as early childhood education or climate change), then data released about these activities would enable tools that alert them when they can get involved with something they care about.
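To make the alerting idea concrete, here is a minimal sketch of how such a tool might match published activity data against registered interests. Everything in it is an assumption made for illustration: the `Activity` record, its fields, the topic tags, and the subscriber names are hypothetical and do not describe any existing government dataset or service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One published record of government activity (hypothetical schema)."""
    title: str
    kind: str          # e.g. "consultation", "petition", "bill reading"
    topics: frozenset  # topic tags attached by the publisher


def alerts_for(subscriptions, activities):
    """Map each subscriber to the activities sharing at least one topic with them."""
    alerts = {}
    for person, interests in subscriptions.items():
        matches = [a for a in activities if a.topics & interests]
        if matches:
            alerts[person] = matches
    return alerts


# Hypothetical registered interests.
subscriptions = {
    "aroha": {"early childhood education"},
    "sam": {"climate change", "transport"},
    "lee": {"fisheries"},
}

# Hypothetical open data about current government activity.
activities = [
    Activity("Consultation on childcare funding", "consultation",
             frozenset({"early childhood education"})),
    Activity("Emissions reduction bill, first reading", "bill reading",
             frozenset({"climate change"})),
]

for person, matches in alerts_for(subscriptions, activities).items():
    for activity in matches:
        print(f"{person}: {activity.kind} now open: {activity.title}")
```

The matching itself is trivial. What makes a tool like this possible is the data: activity records released openly, on time, and tagged consistently enough to be matched against people's interests.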

However, the real power of open government is realised when government and civil society can collaborate on solving problems together. Open government data — whether aggregated statistical data, real time sensor feeds, location of community facilities and infrastructure, registers of government services etc. — is a significant component to enable private businesses, community groups and individuals to work together with governments to analyse problems and build solutions.

The Charter Principles and AI

So, now that we have a good sense of the purpose of the Charter principles, we can explore further the scope of what they should cover. When it comes to the discussions around open data and artificial intelligence, it’s key to remember that the principles are to support “how to publish data”. They are not meant to influence how data is used in AI.

Once you get into trying to be a guide for good use of open data, where would you stop? Complex algorithms for decision making or prediction are just one use of data. Every day, data is being used in a multitude of ways through many different mechanisms such as mobile apps, web services, research, advocacy etc. Is it really the role of the Charter to influence all these domains?

There are ways to influence data use behaviour by focussing on publishing practice. The principles already address privacy concerns. Whether they do so strongly enough is currently being debated in the Charter refresh process, but it is appropriate that they address them. By protecting individuals’ right to privacy through appropriate methods to prevent re-identification before publishing, we reduce the risk that people reuse data in a way that breaches privacy.

When it comes specifically to the use of data in algorithms, there is a mechanism through publishing data that can influence the transparency of data reused in this way. The mechanism is “attribution” in open licensing, for example under Creative Commons Attribution (CC BY 4.0).

Attribution requires the user of data to acknowledge where the data (or other content) came from and to indicate if any changes were made. By publishing data under a licence requiring attribution, we are requiring (legally) that the builder of the algorithm acknowledge where the data they are using came from. So, when a set of algorithms is making decisions about people, there is at least the opportunity to examine the open data that is used and test whether the dataset is fit for this purpose.
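As an illustration of what attribution could look like in practice, the sketch below shows one way the builder of an algorithm might record the open datasets it relies on and generate a notice from that record. The `SourceDataset` structure, the dataset titles, the publishers, and the example.org URLs are all hypothetical; the only things taken from the licence are the requirements to credit the source, name the licence, and indicate changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDataset:
    """An open dataset used by a model (hypothetical record structure)."""
    title: str
    publisher: str
    url: str
    licence: str
    changes: str = ""  # an empty string means the data was used unmodified


def attribution_notice(sources):
    """Build one attribution line per dataset: title, publisher, source, licence, changes."""
    lines = []
    for s in sources:
        line = f'"{s.title}" by {s.publisher} ({s.url}), licensed under {s.licence}.'
        if s.changes:
            line += f" Changes made: {s.changes}."
        lines.append(line)
    return "\n".join(lines)


# Hypothetical datasets feeding a hypothetical decision-making model.
sources = [
    SourceDataset(
        title="School enrolments by region",
        publisher="Example Ministry of Education",
        url="https://example.org/data/enrolments",
        licence="CC BY 4.0",
        changes="aggregated to district level",
    ),
    SourceDataset(
        title="Community facility locations",
        publisher="Example City Council",
        url="https://example.org/data/facilities",
        licence="CC BY 4.0",
    ),
]

print(attribution_notice(sources))
```

A notice like this does not make an algorithm fair or accurate, but it tells anyone affected by its decisions which published datasets to examine.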

The discussion about transparent and ethical use of data is not only occurring within the Charter refresh, but also amongst a growing number of AI forums around the world, and some government agencies and forums are already collaborating to address this issue themselves. For example, Stats NZ and the New Zealand Privacy Commissioner have jointly published Principles for the safe and effective use of data and analytics, and the UK government has published its Data Ethics Framework.

In conclusion

The Charter was established to be a guiding light for governments and other organisations, and to encourage more proactive publication of safe, quality data. The Charter can also influence some behaviour of those that reuse data through the way its members publish data.

There is a danger in trying to develop principles for all facets of data in one place. The purpose of the Charter is the proactive, safe release of data and all the benefits that brings. Let’s focus on that and do it well. Let’s do our part and create the opportunity for great works to be done with data.