State Capacity and Data Control in the Age of AI
Governments lack key abilities to manage data processing, even their own.
By Open Data Charter (Natalia Carfi), Joseph Foti, and SilvanavF (Silvana Fumega)
“You must first enable the government to control the governed; and in the next place oblige it to control itself.”
— James Madison, one of the framers of the US Constitution
This quote captures the challenge of setting up any governance structure. Too little power and you get chaos. Too much and you risk abuse. While Madison certainly was not writing about data governance, his observations apply.
Controlling data is a double-edged sword. If regulators have too little power relative to whomever they are regulating, they will be ignored. Too much power, and they may themselves undermine public values. Getting those elements in balance is difficult.
The stakes are enormous. Governments are among the biggest data controllers in the world, likely responsible for collecting, managing, producing, transferring, releasing, and destroying more data than any major tech company. They need to be able to decide when to protect data and when and how to give people access. They also need to be able to control the excesses of the market and fight organized crime. Getting this balance right requires investment in state capacity.
Why it matters
- Protection of civil liberties: Governments need to be able to protect vulnerable groups and privacy. Recent events show they are quicker to shut down data access than to protect the public against corporations.
- Coordination: Agencies need to share data. They also need to make sure it is interoperable with other datasets. Sometimes they need to abstain from sharing data.
- Efficacy: Agencies need to carry out their legal mandates. Knowing what new forms of collection and processing could help them is essential. At the same time, public employees must know how to color within the lines. The right rules spur innovation.
Decision points
Governments need to be able to address three questions for every set of data that they control.
- Should it be open? When should data be open, taking into consideration privacy and other concerns? Here, we may see some signs of progress, as shown by the Global Data Barometer and the Open Data Inventory. But the same sources show us how much distance we have to travel.
- When should data be anonymized? When should agencies release anonymized data? When can synthetic data be a useful substitute? Anonymization is essential when a clear public purpose weighs against the rights of individuals or groups. However, the value of anonymized data is less clear when the goal is to identify specific wrongdoing, such as contract fraud.
- When to scrape? When does unsupervised learning on public data constitute an illegal or non-consensual search? Should users or information providers have a reasonable assumption that their data is not free to harvest? Should governments have limitations? What about contractors or research institutions?
The answers to these questions are not clear and, very likely, not uniform across the state. What is clear is that governments need to have the capacity to regulate themselves as well as the private sector.
Find out more
- Fragmentation: The EU has a growing body of laws like the Digital Services Act and the General Data Protection Regulation. How the state applies these laws to itself is becoming clearer, with standard approaches and authorities. Outside of the EU, however, oversight is likely to be piecemeal, scattered across institutions. For example, at the Global Summit on the Future of Artificial Intelligence, US Vice President Kamala Harris noted that the US, unlike other countries, lacks a single data protection law. She added that, in the absence of such a law, the US does have “existing laws and regulations that reflect our nation’s longstanding commitment to the principles of privacy, transparency, accountability, and consumer protection.” There are ways to move the needle on reform even in the face of fragmentation.
- State of play: What is the current legal landscape for data protection and data sharing? Find out who has data protection laws in Africa and how the Global Data Barometer measures data protection and data sharing.
- Data for AI: Finally, learn more about the role of government as a data provider for AI from the Global Partnership for AI.