Practical data ethics: summary of what we’re learning

Alanna Williams
The State of CalData
Jun 2, 2022

Coauthored by Sebastián F Gómez Pérez

Most discussions on the ethical use of data in the public sector have centered on AI and algorithmic decision-making. CalData agrees that these are paramount (and our team has done a bunch of work in this space already). But our experience has shown us that there are ethical decision-points across the entire data lifecycle — not just in the use of AI and algorithms.

To inform a comprehensive approach to data ethics, we’ve been reviewing frameworks and toolkits that focus on broader ethical data management (and chatting with the folks behind them). Huge thanks to UC Berkeley MPA student Sebastián F Gómez Pérez who helped via his Capstone Analytic Project; he is a coauthor on this blog!

This blog summarizes our first-phase discovery: initial findings, current takeaways, and a call for input. We hope to use these lessons and your input to build a data ethics toolkit for the state community.

Lots to borrow from existing resources

CalData loves to borrow good ideas. Lucky for us, there are some awesome frameworks and toolkits out there to learn and steal from (with due credit, of course!). We tried to source from a range of efforts, from multinational organizations to statistical agencies and from general frameworks to specific toolkits.

Check out our Ethics Frameworks At-a-Glance, which highlights these materials, their principles, and distinguishing features. Some observations:

  • Almost all are from the US and Western Europe; we would love to diversify our scan.
  • Half (8) are from government organizations, but only 2 are at the state/local level; 3 are from multinationals and 5 are from non-governmental organizations.
  • Every resource we reviewed had something we really liked or found notable.

We also appreciated existing framework reviews from New Zealand’s Government, the London Office of Technology and Innovation and the Open Data Institute.

Our current takeaways

Many different approaches. The documents we reviewed have a variety of styles, ethical principles, and organizing structures tailored to their jurisdictional needs. Some, like the ODI’s Data Ethics Canvas, are a series of questions to prompt discussion, while others, like the US Federal Data Ethics Framework, are best-practices documents targeted to their audience.

Generally, existing approaches focus on principles, overarching best practices, and fostering discussion. Most resources provide ways of thinking, recommendations to consider, and tools and methods to generate discussion. One exception is from the UK Statistics Authority’s Centre for Applied Data Ethics. They have a prescribed approach to review all research and statistical projects and collections that uses a self-assessment workbook, discrete guidelines, and supplementary resources. When interviewed, the Centre for Applied Data Ethics emphasized the importance of user accessibility via an efficient tool and concrete concepts that can be simply scored and mapped to action. We like this approach and think it’s a good match for state needs.

The data lifecycle is a practical structure. The equity-centered guides from AISP and the Urban Institute used the data lifecycle as their central organizing theme, and many others mentioned the need to assess ethical issues across the lifecycle. We think the data lifecycle structure ensures coverage of the ethical considerations appropriate to each phase. It also lends itself to a modular approach with user-friendly framing. We plan to use the data lifecycle as an organizing framework for state guidance.

Common themes and recommendations are concentrated in early phases of the lifecycle. Almost all frameworks include weighing of public benefits and risks, initial stakeholder engagement, and considerations of legal context, all of which come early in the data lifecycle. The London Office of Technology and Innovation also noted that they’ve seen more focus on these earlier planning stages. There is less consistency further down the data lifecycle into maintenance, use, and disposal.

Hard to ensure and monitor the implementation of data ethics. Several interviewees mentioned that adoption was a challenge. One said she saw this repeatedly in her cross-jurisdictional work and noted a need for developing monitoring mechanisms and implementation tools.

Next steps and call to action

We are still early in the project and have a lot more to explore, including ensuring we hear from Californians, aligning with academic research ethical regimes, and following advice from Dan Morgan at the US Department of Transportation to learn from other disciplines.

In the meantime, please let us know if we missed any key frameworks or resources, and share any other feedback here.

Finally, we will continue to share progress as we learn and ultimately as we develop materials to help the state responsibly steward the public’s information.
