Assembling Accountability, from the Ground Up

Algorithmic impact assessments should leverage diverse expertise & complex histories

Emanuel Moss
Data & Society: Points
Jun 29, 2021


By Emanuel Moss, Ranjit Singh, Elizabeth Anne Watkins, and Jacob Metcalf

By now, it is a tired trope: Sen. Orrin Hatch asking Mark Zuckerberg how Facebook makes money. Zuckerberg replying with a wry “Senator, we run ads.” Another congressperson grilling Sundar Pichai, CEO of Google, about the iPhone, a product made by Apple, not Google.

Lawmakers, it is said, don’t understand technology well enough to regulate it. They are too old. They are out of touch. They have disinvested from the staff and other experts who could help them understand it. And while all those criticisms may be true, why should we expect our lawmakers to become individual experts on every challenge facing society? There are advocacy groups, community activists, forensic technologists, thoughtful developers, and critical scholars who have devoted their careers to building expertise on these issues. Given the burgeoning influence of algorithmic systems over social affairs, and an increasing awareness of the harmful impacts of these powerful systems, we are at a moment in which complex sociotechnical systems require robust, adaptable regulation — and legislatures and regulatory bodies are drafting new rules.

Our new report Assembling Accountability demonstrates a pressing need to establish algorithmic impact assessment practices from the ground up, which requires cultivating and synthesizing a broad consensus of expertise from industry, scholars, and public interest advocates, including people from affected communities.

It is critically important that the methods and measurements of auditing and algorithmic impact assessment are not crafted by industry alone. Luckily, the history of impact assessment offers lessons for how complex knowledge and expert contributions can be integrated into policy through robust regulatory frameworks.

These policy frameworks oblige stakeholders involved in this type of development to construct methods and measurement practices that make the societal impacts of such projects clear, and provide a venue for public contestation over competing claims about the tradeoffs of development. That way, regulators — acting in the public interest — can weigh the tradeoffs between the benefits and harms of such undertakings, make well-reasoned choices about those tradeoffs, and require changes to projects that would minimize harms.

History: Environmental Impact Assessments

Lawmakers did not need to become individual experts on every dire and complex environmental threat, such as water-borne pollutants, co-morbidities due to airborne coal ash, or lead in soil contaminated by diesel spills. They did, however, need to create a regulatory structure that gave voice and incentive to communities directly impacted by these issues, as well as to toxicologists, geomorphologists, biologists, and other environmental scientists who could provide environmental expertise to regulators.

When Congress passed the National Environmental Policy Act (NEPA), which took effect in 1970, it put in place a structure that gave environmental scientists’ expertise a role to play in understanding the environmental risks of big development projects such as power plants, pipelines, bridges, and highways. This regulatory structure enabled those risks to be balanced against the economic, strategic, and social benefits of such projects. By mandating a thorough, expert assessment of the likely environmental impacts of a project before permitting it to move forward, NEPA made environmental expertise part of the decision-making process inside government, in a manner that allowed regulatory rules to evolve over time in response to new science and public contestation.

When it comes to algorithmic systems, we currently face a challenge similar to the one confronted by those who established environmental regulatory structures: how do we protect the public’s interest in the face of complex systems that no individual can fully understand, while synthesizing potentially dozens of types of expertise with potentially conflicting perspectives? How do we address the risks automated decision systems (ADSs) pose, determine which of their uses are permissible, and repair the damage these systems perpetuate? Can society’s collective expertise on these issues be harnessed for this purpose? Governments around the world are crafting legislation that would require the impacts of algorithmic systems to be assessed, in the hope that these systems can be used more safely and their harms minimized or mitigated.

But how will these assessments happen? Who will be tasked with conducting such assessments? What kinds of algorithmic harms will be assessed? Most importantly, how will these assessments turn into concrete changes in the operation of algorithmic systems to avoid the harms they might produce?

Current Approaches

Currently, algorithmic systems are assessed through a patchwork of methods and regulatory frameworks. Individuals and communities negatively impacted by such systems have raised awareness of the harms these systems can cause. Investigative journalists and critical scholars have made these harms more visible, contributing their own assessments of how these systems impact society. Developers concerned about particular forms of impact conduct their own internal audits.

Recently, developers have begun contracting with a second party to audit their systems; there is a newly emerging role for auditors who can inspect systems for bias across race or gender categories, or for privacy and security vulnerabilities, on behalf of developers. Advocates and public-interest-minded technologists conduct critical audits of systems, too, but often must develop their own methods for assessing systems to which they lack full access.

Regulations already on the books in Canada task developers with conducting their own impact assessments by attesting to the attributes of their algorithmic systems that might cause impacts. These risk assessments are then used to inform government software procurement. Recently released draft regulation in the European Union (EU) requires developers of “high-risk” algorithmic systems to conform with a set of performance standards. It may also mandate a “conformity assessment,” conducted as a self-assessment or as an independent third-party assessment, to ensure compliance with such standards. In the United States, draft legislation would require “algorithmic impact assessments,” although what such assessments would consist of remains unresolved.

…The harms of these systems are unevenly distributed, emerge only after they are integrated into society, or are often only visible in the aggregate.

What becomes clear in all these efforts is that no matter how these regulatory frameworks around algorithmic impact assessment are structured, impact assessments will only be able to assess the algorithmic harms that we know how to look for. Algorithmic systems present a special challenge to assessors, because the harms of these systems are unevenly distributed, emerge only after they are integrated into society, or are often only visible in the aggregate. Our report explains both the stakes and constitutive components of assembling accountability through the algorithmic impact assessment process.

Assembling Expert Perspectives

In order to look for the widest possible range of algorithmic harms, we need as many sets of lenses as possible trained on this problem. Algorithm auditors have developed finely honed technical lenses to perceive bias in algorithmic decision-making, and can quantify disparate error rates across a diverse population. Social scientists have deployed qualitative lenses to reveal the ways algorithmic harms subtly (and not-so-subtly) arise from the integration of technical systems into social worlds. Community organizers are experts in their own lived experience and can thus analyze algorithmic harms specific to their communities. Developers of algorithmic systems have also devised elaborate instrumentation and metrics that provide a window onto the minutiae of how their products perform.
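To make the auditors’ quantitative lens concrete, the sketch below shows one kind of disparate-error-rate check an auditor might run: it computes the false positive rate for each demographic group in a set of predictions and reports the largest gap between groups. The group labels and records are hypothetical placeholders, not drawn from any real system or audit.

```python
# Minimal sketch of a disparate-error-rate check across demographic groups.
# Records are hypothetical (group, true_label, predicted_label) triples.
from collections import defaultdict

records = [
    ("group_a", 0, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 1, 1), ("group_b", 0, 0),
]

def false_positive_rates(rows):
    """Return, per group, the share of actual negatives that were predicted positive."""
    fp = defaultdict(int)         # false positives per group
    negatives = defaultdict(int)  # actual negatives per group
    for group, truth, pred in rows:
        if truth == 0:
            negatives[group] += 1
            if pred == 1:
                fp[group] += 1
    return {g: fp[g] / negatives[g] for g in negatives}

rates = false_positive_rates(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: false positive rate = {rate:.2f}")
print(f"largest gap between groups: {max(rates.values()) - min(rates.values()):.2f}")
```

A gap like this is only a starting point: which error rates matter, and how large a disparity is tolerable in a given context, are exactly the questions the other lenses described here help answer.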

Each of these lenses is necessary, but none is sufficient on its own. Each perceives only a part of the whole. As norms around core aspects of algorithmic impact assessments take shape through regulation and emerging practices, adequately assessing the full range of algorithmic harms will require assembling insights from this entire community into a coherent perspective.

…The worst-case scenario is not that tech companies measure the impacts of their own systems, but that they get to choose the metrics.

Toward Impact Assessments

In the push and pull over regulation of the tech industry, the debate is often framed as a matter of “self-regulation.” Is the government fundamentally too slow to regulate an innovative tech industry? Can the tech industry be trusted to self-regulate? Our research indicates that the risk of self-regulation lies not so much in a corrupted reporting and assessment process as in the capacity of industry to define the methods and metrics used to measure the impact of proposed systems. Inevitably, there will be some degree of self-assessment and/or self-regulation in the development of algorithmic services, especially if impact assessments are required to be conducted ex ante and legitimate intellectual property concerns are at play. However, the worst-case scenario is not that tech companies measure the impacts of their own systems, but that they get to choose the metrics.

Therefore, the task for lawmakers is to create an obligation to assess and to incubate the creation of robust measurement practices that are not wholly determined by industry. In other domains, successful impact assessment practices have evolved organically in response to legal challenges, public contestation, and changing scientific and scholarly perspectives. However, both the proposed EU AI regulation and the proposed US Algorithmic Accountability Act grant significant power to industry to attest to its own compliance, yet fail to specify how the standards for robust measurement should be established.

No legislative body can preemptively define the measurements that would be needed to conduct an algorithmic assessment in every case simply because the expertise needed would be too expansive. But what we can expect — or even demand — is a framework for algorithmic accountability focused on the production of algorithmic impact statements: thorough documentation of the entire impact assessment process, from the methods employed, to the extent of impacts anticipated. Impact statements become the means for stakeholders to enter into contestation over these systems and for holding developers accountable for the consequences of their products. As NEPA shows us, regulatory bodies already have the tools to make room for such expertise. Assembling these various forms of expertise to produce algorithmic impact statements will be a continuous challenge, but it is not impossible. The process can begin by calling all experts in, calling them to lend their expertise to enumerate algorithmic harms and to render those harms assessable as impacts. It can also develop methods for collaboration, consensus, and even, at times, dissensus. In doing so, bureaucracies will not only begin to think with the expertise we collectively already possess, but also create the grounds on which the future of expertise in algorithmic impact assessment will be built.

Emanuel Moss, Ranjit Singh, Elizabeth Anne Watkins, and Jacob Metcalf are members of the AI on the Ground Initiative at Data & Society, which uses social science research to develop robust analyses of AI systems that can effectively inform their design, use, and governance.

Emanuel Moss is a PhD Candidate in Anthropology at the CUNY Graduate Center and a researcher for the AI on the Ground Initiative at Data & Society.