The Ethical Engine: Integrating Ethical Design into Intro Computer Science

I am an Assistant Professor of Computer Science at Bucknell University, a liberal arts university in central Pennsylvania. This work was designed alongside Bucknell undergraduate Gabbi LaBorwit (‘20), and the code is publicly accessible in our GitHub repository. You can learn more about me on my website or on Twitter.

It seems like every day now, I’m reading new articles about developers designing algorithms that can easily stray into biased or discriminatory behavior. Those algorithms are deciding employment, assigning risk scores to people who are arrested, nudging our perception of news, and determining which enemy combatants should be killed. So while we can all agree that computer science students no longer have the luxury to ignore the consequences of their work, I have to wonder: are we really equipping students to build ethical systems? I’m not so sure.

The problem with ethics in CS education is that our students end their discussions with papers and presentations, while our graduates must end them with algorithms. Neatly contained to an optional seminar or upper-level course, ethical thinking is valued less by our curriculum than documenting code, writing clever data structures, or even giving a decent presentation. And while existing ethics courses allow us to practice debating ethical dilemmas, what we don’t get to practice is formalizing those values into code. Even worse, we tend to arrive at significant ethical discussion only after students have developed their problem-solving habits during core CS courses.

For students to develop necessary habits for ethical thinking, I have become increasingly convinced of three things:

  1. Ethics should be integrated into existing CS courses.
  2. Ethical thinking is a habit that needs to begin at the same moment our students begin developing their programming habits.
  3. Like every other topic we prioritize, students need to deliberately practice ethical design throughout their college career.

But I shouldn’t talk about this without putting my money where my mouth is, so the heart of this post is exploring what it might look like to construct an ethical design project for Intro to CS. To do that, we introduce a new programming project for CS 1 — The Ethical Engine — centered around an old ethical problem…

The Problem Space: MIT’s Moral Machine

Conceptualized by Iyad Rahwan, Jean-Francois Bonnefon, and Azim Shariff, the Moral Machine is an interactive website reimagining the classic trolley problem in the context of autonomous vehicles. If the trolley problem isn’t familiar to you, here’s the gist: a driverless car finds itself in a situation in which it must choose between “the lesser of two evils” — saving its own passengers in the car or the pedestrians in the crosswalk (see image).

A screenshot of one of the moral dilemmas posed on the Moral Machine website.

Like any good dilemma, the answer is rarely straightforward. There are a number of variables at play which influence how we feel about the decision: the number of pedestrians or passengers, the lane the car is in, whether the pedestrians are crossing the street legally, and perhaps most interesting — personal characteristics that our omniscient car can somehow infer (weight, profession, gender, age, etc.).

While the trolley problem has its issues, I’ve found that the Moral Machine is a compelling example for my Intro to CS students — sparking excellent discussion about what algorithms should or should not do.

But we can also make these ideas less hypothetical and less confined to the 40-minute discussion. I wanted to force our students to translate hard decisions into code, I wanted them to see how those decisions can inadvertently lead to algorithmic bias, and I wanted all of it to be accessible in a student’s first semester of computer science.

Ethical Engine Part 1: Building a Decision Engine

Key Questions: What is a good algorithm? How do I turn values into code? Is it even appropriate to build my values into code?

The goal of Part 1 is for students to use Python to code the “moral brain” of an automated vehicle — a program that automatically decides who lives and who dies in any Moral Machine dilemma. No human in the loop. No indecisive “that’s a tough decision”. All algorithm.

To facilitate this, we provide students with a random scenario generator that automatically constructs moral dilemmas mirroring the Moral Machine. A scenario consists of 0–4 passengers in an automated car and 1–4 pedestrians in the crosswalk. Each passenger and pedestrian has a set of attributes that students’ algorithms can use to decide which group should (or shouldn’t) be saved.

Below are snapshots of Person and Scenario classes:
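The snapshots themselves are images, but a minimal sketch of what such classes might look like follows. The attribute names and probabilities here are illustrative assumptions, not the project’s actual field names:

```python
import random

# Illustrative attribute pools; the real project's attributes may differ.
GENDERS = ["male", "female"]
AGES = ["baby", "child", "adult", "elderly"]
PROFESSIONS = ["CEO", "doctor", "athlete", "unknown"]

class Person:
    """One passenger or pedestrian with randomly assigned traits."""
    def __init__(self):
        self.gender = random.choice(GENDERS)
        self.age = random.choice(AGES)
        self.profession = random.choice(PROFESSIONS)
        # Only women can be pregnant; 10% chance is an arbitrary placeholder.
        self.pregnant = (self.gender == "female" and random.random() < 0.1)

class Scenario:
    """A random moral dilemma: 0-4 passengers vs. 1-4 pedestrians."""
    def __init__(self):
        self.passengers = [Person() for _ in range(random.randint(0, 4))]
        self.pedestrians = [Person() for _ in range(random.randint(1, 4))]
        # Whether the pedestrians are crossing legally, as in the Moral Machine.
        self.legal_crossing = random.choice([True, False])
```

The key design choice is that all of the randomness lives in the constructors, so a student can generate a fresh dilemma with a single call.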

This allows students to rapidly generate random scenarios like the following with a simple line of code:

scenario = Scenario()
Three moral dilemmas randomly generated by our Python code. If you’re interested in building a more compelling graphical front-end, please let me know!

The students’ task is to develop a decide(scenario) function that outputs either "pedestrians" or "passengers". The code must choose who to save for any scenario.

While this function can be written with simple structures that are accessible to first year CS students (lists, conditionals), it forces that critical translation from “what I believe” to “what the code does”— something we don’t have nearly enough of in introductory classes.

The template we give students for their decide function. They must write code that decides who to save for ANY given scenario.
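The template itself is shown as an image; here is a hedged sketch of what a completed decide function might look like. The policy shown is just one possible value judgment, not the “right” answer, and the stand-in scenario exists only so the sketch runs on its own:

```python
from types import SimpleNamespace

def decide(scenario):
    """Return "passengers" or "pedestrians" for ANY scenario.

    Example policy (one possible value judgment): save the larger group,
    and break ties in favor of pedestrians who are crossing legally.
    """
    if len(scenario.pedestrians) > len(scenario.passengers):
        return "pedestrians"
    if len(scenario.pedestrians) < len(scenario.passengers):
        return "passengers"
    return "pedestrians" if scenario.legal_crossing else "passengers"

# A stand-in scenario for demonstration (the real project supplies Scenario()).
demo = SimpleNamespace(passengers=["a"], pedestrians=["b", "c"],
                       legal_crossing=True)
```

Even this tiny policy encodes values: it treats every life as interchangeable and rewards legal behavior, and students must own both choices.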

At the conclusion of part 1, students’ values should be explicitly expressed in their code. We can compare and contrast the decisions the programs make, and reflect on the decisions they were forced to make.

Ethical Engine Part 2: Auditing the Engine

Key Questions: How can we learn to see algorithmic bias? What are the unintended consequences of the code we write?

Deciding to program ethically is one thing, but there is a second, (more?) important outcome: demonstrating how bias can creep into algorithms by accident. These are the pivotal habits of mind that CS students must start developing from day 1.

In part 2 of the project, students are given several different python files which represent the decision engines of their classmates. The code is obfuscated so that it is not readable, and students must write a program to infer the built-in biases by simulating thousands of scenarios. This emerging practice of algorithm auditing allows us “to understand the algorithms that increasingly shape our life”. You’ll likely be hearing the term a lot more in the coming years.

Even if we can’t see the code, we can rapidly run thousands of scenarios to infer the ways in which an algorithm can be biased
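An audit like this can be sketched in a few lines: run many random scenarios through a black-box engine and tally the survival rate for each attribute. The toy scenario generator and the deliberately trivial engine below are stand-ins of my own, not the project’s obfuscated code:

```python
import random
from collections import Counter

def make_scenario():
    """Toy generator: each person is a set of attribute strings."""
    def person():
        return {random.choice(["male", "female"]),
                random.choice(["adult", "baby", "elderly"])}
    return {"passengers": [person() for _ in range(random.randint(0, 4))],
            "pedestrians": [person() for _ in range(random.randint(1, 4))]}

def always_pedestrians(scenario):
    """A deliberately trivial black-box engine for demonstration."""
    return "pedestrians"

def audit(decide, make_scenario, trials=50_000):
    """Estimate the % of people with each attribute who are saved."""
    saved, total = Counter(), Counter()
    for _ in range(trials):
        s = make_scenario()
        winner = decide(s)
        for group in ("passengers", "pedestrians"):
            for person in s[group]:
                for attr in person:
                    total[attr] += 1
                    if group == winner:
                        saved[attr] += 1
    return {attr: 100 * saved[attr] / total[attr] for attr in total}
```

Because the audit only observes inputs and outputs, it works identically on obfuscated classmate code: no source access is needed to surface the biases.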

As an example, I decided to create a decision engine that arbitrarily scored personal attributes — assigning a value for every person in the scenario (even hypothetically, I felt dirty doing it). I’ll show you those in a minute. But without looking at the code, let’s first see what my program inferred after running 50,000 random scenarios:

Most likely to be biased, based on % saved after 50,000 scenarios:
→ CEO: 72%
→ pregnant: 71%
→ doctor: 68%
→ female: 68%
→ athletic: 68%
→ you: 59%
→ baby: 59%
→ adult: 57%
→ male: 41%
→ cat: 28%
→ dog: 28%

Okay, now the values that I actually programmed (in a Python dictionary):

How I ‘scored’ different attributes. I feel dirty coding this even in a hypothetical context.
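The scored dictionary is shown as an image. As a hedged reconstruction, only two values are confirmed by the discussion below (pregnant = 0 and female = 5); every other entry here is a placeholder of my own:

```python
# Hypothetical attribute scores. Only "pregnant" and "female" are confirmed
# by the surrounding text; the other values are illustrative placeholders.
scores = {
    "pregnant": 0,   # confirmed: pregnancy itself scores nothing
    "female": 5,     # confirmed: women are highly valued
    "male": 1,       # placeholder
    "doctor": 4,     # placeholder
    "baby": 2,       # placeholder
    "cat": -2,       # placeholder
}

def score_person(attributes):
    """Sum the scores for a person's attributes (unknown attributes score 0)."""
    return sum(scores.get(attr, 0) for attr in attributes)
```

Notice how a score-and-sum design invites exactly the correlation effect discussed next: an attribute can score zero yet still ride along with a highly valued one.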

Carefully compare the percentages above with the scores in the image.

The percentages don’t quite align with the scores that I gave different attributes. Take a moment to hypothesize why we see those differences.

Did you catch it? Pregnant people were given a value of 0 in my code, but they were still heavily biased toward survival. Why? It might seem strange at first glance, but when you consider it for a moment, this makes perfect sense. Women are highly valued (5), and they are the only people who can become pregnant. So even though we don’t explicitly value pregnancy, pregnant women are likely to be saved.

It seems simple, but this is a crucial observation.

No developer sits down and tries to program discriminatory algorithms. When we discover cases of algorithmic bias, it’s often because no one carefully considered these secondary effects. This is why parole algorithms or housing algorithms can appear to exhibit racist behavior. The program is often valuing (or devaluing) some characteristic that correlates with race or gender or age… and no one was there to catch it.

Can we nudge students to similarly reflect on their own programming practices? By the conclusion of part 2, we hope that students are able to identify how their code — both explicitly and implicitly — will impact the lives of people. If we can encourage our future developers to pause for just a moment and consider the implications of their work, we will have succeeded.

Towards An Integrated Ethical CS Education

Discriminatory algorithms are often built by well-meaning people who are not trained to see the implications of their work. As computer science educators, we cannot afford to shove ethics to the side — relegating it to secondary status, or claiming that it “doesn’t fit in the curriculum”. We can do better. If we want students to positively impact the world, ethical thinking must become a priority.

Of course this project is just a prototype, and by itself, it isn’t enough. Learning takes deliberate practice. Just as we prioritize good code by enforcing style and design across multiple courses, there is no reason why we cannot prioritize ethical design thinking across the core curriculum — Data Structures, Algorithms, Software Engineering. We aren’t pausing the course or throwing away valuable content; we are situating the existing material into an applied problem-space in which students must make ethical decisions. I believe that building these habits of mind is critical to preparing our students for the emerging tech landscape.

This is a work in progress and we need your help developing the curriculum around it! If you are interested, feel free to email me personally, submit issues to our GitHub repository, or modify the code yourself!
