Algorithmic transparency and public policies

A necessary conversation for governments, citizens, journalists and scientists

[this article was originally published on Valigia Blu, in Italian]

source: Diliff, Wikipedia cc-by-3.0

James Vacca, a member of the New York City Council, the legislative body of the city, has recently proposed a bill that aims to make the algorithms used for public decisions more transparent. The proposal attracted a lot of interest, the public hearings drew record attendance, and the bill is on its way to being approved by Mayor de Blasio. The discussions during the hearings are well worth listening to.

New York uses a number of software tools to make public policy decisions, or at least to suggest them. Cutting-edge technology is highly valued for its efficiency, but it is no news that, if left unchecked, it can hide insidious problems.

In his opening speech presenting the bill, Vacca refers to two highly relevant examples. The first is the algorithm used to allocate fire department services, which is also used to decide how many police officers are assigned to each district.

Although Vacca has been a high-level administrative figure for years, he explains, he has never been able to get answers to some quite simple questions: what are the efficiency criteria? How is the decision made? The only explanation given is “There’s a formula” (the RAND formula), with no way to clarify which variables were considered and no way for citizens to audit the process.

The same happens when a teenager applies to his or her preferred high school and is assigned to the sixth or seventh choice instead.

Rightly, Vacca asks: “why shouldn’t one be able to understand, and therefore to contest, the decision?”

There are obviously many other applications, from public housing to justice. Algorithms are already used in courtrooms, and some cases have been widely discussed and contested: Eric Loomis was sentenced to six years after a proprietary algorithm indicated that he had a propensity to recidivism, without any right to contest that assessment.

Algorithms are used intensively in predictive policing and in estimating the risk of flight or of reoffending. The problem, as ProPublica has explained well, is that these techniques introduce racial bias.

Clearly, this software is used to respond more efficiently to issues that are complex even in purely quantitative terms. But it is precisely where injustices seem to lie that citizens should be able to understand the rationale.

The importance of algorithms in our lives has grown steeply and will keep growing, yet the topic risks remaining a niche one, a discussion that never takes place. Criticism of opaque algorithms, and of who is responsible for them, is rightly mounting against Silicon Valley, where algorithms sit between consumers and corporations. But what questions should we ask ourselves when algorithms handle the relationship between citizens and public administrations? How should this relationship be regulated? What are the responsibilities of policymakers in terms of transparency and accountability?

I know this might sound far removed from the current state of affairs in Italy, but I believe it is worth reconsidering. To give an example: a few weeks ago the Bank of Italy published a study that used machine learning techniques to redefine the targets of Renzi’s 80-euro tax rebate. It is a very interesting study that aims to increase the efficiency of the measure by setting the target in a data-driven manner, redefining who is considered “in need”. It estimates that 29% of the cost of the policy (around 2 billion euros) went to non-ideal households.

But what is “ideal”? For a policy that aims to stimulate consumption, the key characteristic is a household’s propensity to consume. The technique used is decision trees, a choice that already shows the authors’ concern for making the process communicable and transparent.
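To see why decision trees lend themselves to this kind of communication, here is a minimal sketch of a hand-written tree. It is not the Bank of Italy's model: the `Household` fields, the thresholds and the labels are all hypothetical and chosen only for illustration. The point is that every decision comes with the list of rules that produced it, so it can be read aloud and contested.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # All fields are hypothetical; the study's real variables may differ.
    income: float          # yearly net income, in euros
    liquid_savings: float  # readily available savings, in euros
    members: int           # number of people in the household


def classify(h: Household) -> tuple[str, list[str]]:
    """Return a label plus the sequence of rules that led to it."""
    path = []
    if h.income <= 15_000:  # hypothetical threshold
        path.append("income <= 15000")
        if h.liquid_savings <= 2_000:  # hypothetical threshold
            path.append("liquid_savings <= 2000")
            return "high propensity to consume", path
        path.append("liquid_savings > 2000")
        return "medium propensity to consume", path
    path.append("income > 15000")
    if h.members >= 4:
        path.append("members >= 4")
        return "medium propensity to consume", path
    path.append("members < 4")
    return "low propensity to consume", path


label, path = classify(Household(income=12_000, liquid_savings=500, members=3))
print(label)
print(" -> ".join(path))
```

A household that disagrees with its label can point at the exact rule it wants to challenge, which is much harder to do with a model that only returns a score.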

Going back to New York and Vacca’s proposal, we have to say that the problem is not only relevant but also complex from many points of view: technological first of all, but also legislative, economic, ethical and security-related.

Algorithms are often sold by third parties that would never want to share their code. The relationship between suppliers and public administrations is obviously commercial, and so it is based on the sale of proprietary software. This arrangement, though, excludes citizens from access to knowledge: from the input data to the methods used, and therefore from the possibility of appeal.

And it is fundamental that citizens gain this access because, as has long been clear, neutral algorithms do not exist.

Vacca sums up this idea in a sentence that I find very effective: “algorithms are a way to encode assumptions”.

The issue is even more subtle, because opacity is not only a product of private interests: for most of us, algorithms are complex objects by nature. But if we believe that access to knowledge about public policy is something to protect, it is necessary to act.

But how? It is not a trivial issue: transparency does not mean publishing a few thousand lines of code. That would certainly be a starting point, since it would enable hackers and journalists to work on it. But real transparency is not in the code: it lies, more generally, in the accessibility of the decision process and of the assumptions made.

First of all, the code must be documented. Open source without documentation is not necessarily open, as one of the speakers in the discussion put it.

That is just the beginning. Among the proposals, for example, there are simulation systems in which citizens can see how the outputs change when the inputs change. Talking about data, another fascinating aspect of the issue comes up: if transparency is the ideal, we might say there is a trade-off with privacy. How could we (as citizens and as policymakers) take care of this?
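The simulation idea mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: `estimated_response_minutes` is a made-up stand-in for an opaque formula (it is not the RAND formula, whose variables are not known here), and `what_if` is a hypothetical helper that re-runs the model while changing one input at a time.

```python
def estimated_response_minutes(distance_km: float, calls_per_day: float) -> float:
    """Hypothetical stand-in for a formula used by a public administration."""
    return 2.0 + 1.5 * distance_km + 0.1 * calls_per_day


def what_if(model, baseline: dict, name: str, values: list) -> list:
    """Re-run the model varying one input and keeping the others at baseline."""
    results = []
    for value in values:
        inputs = {**baseline, name: value}  # copy the baseline, override one input
        results.append((value, model(**inputs)))
    return results


baseline = {"distance_km": 2.0, "calls_per_day": 10.0}
for value, minutes in what_if(estimated_response_minutes, baseline,
                              "distance_km", [1.0, 2.0, 4.0]):
    print(f"distance_km={value}: estimated response {minutes:.1f} min")
```

A tool of this kind exposes how a decision reacts to each input without necessarily publishing the individual records the model was built on, which is one possible way to approach the privacy trade-off.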

During the discussion, a representative of the open-source world explains how open technologies can improve not only accountability but also security. In a post-Equifax world, it is easy to see why relationships between third parties and public administrations need to be clearly established when the sensitive data (and hence the lives) of millions of citizens depend on opaque or vulnerable IT structures.

I think we can draw a parallel with the discussions about the Freedom of Information Act, which responds to citizens’ requests for public data: the conversation must involve not only tech experts, jurists and policymakers, but also journalists, activists and citizens.

There is no simple answer, but if we want to build an open and inclusive democracy, thinking about what ‘government’ might mean in the age of information, I believe it is a conversation worth having.