Lady Justice holding a very simple algorithm — if x < y then smite_with_righteous_justice

On the Wrong Side of Algorithms

We shape our tools and thereafter our tools shape us — Marshall McLuhan

Sebastien Dery

--

#tldr Any algorithm can — and often will — reproduce the biases inherent in the data it’s using. One major problem is that explicitly removing features from a dataset does not eliminate an algorithm’s ability to learn them implicitly. One must be very rigorous to claim that an algorithm does not use something in its computations. The more we blindly rely on machines to do the learning for us, the more we’ll learn to predict but fail to understand.

You've heard the stories in the news. The algorithms we idolized, no longer the unwavering bastion of objectivity, have started reflecting a nasty vision of humanity: subtle and not-so-subtle biases and forms of discrimination that emerge from trying to simplify a complex world with data.

Algorithmic Fairness suggests an interesting exercise which I hereby recommend: try performing an image search for 'person' and look critically at the results. If we consider algorithms our brainchildren, then when it comes to discrimination the apple doesn't fall far from the tree.

At the end of the day, any algorithm can — and often will — reproduce the biases inherent in the data it’s using. Hell, even the White House has issued a warning call. But in what specific ways can an algorithm be biased, and how can we take small steps towards fairer machine learning?

The Proof is in the Pudding

We often derail conversations on fairness in algorithms by debating the language and semantics of what we mean to discuss: equality, equity, justice, freedom, fairness… each of us has some notion of what they mean and what they should mean to others, which can make it difficult to achieve a consensus that successfully appeals to everyone’s moral and political preferences.

While we may have difficulty coming up with a clear definition, we feel the effects of inequality so distinctly when wronged by others or society at large that it begs the question: What is fairness?

When it comes to algorithms, some formalism is required. For this deep dive, let's borrow the definition of fairness from Dwork et al., 2011 as the principle that "any two individuals who are similar with respect to a particular task should be classified similarly" and see if we can shed some light on the causes and consequences of blindly trusting our tools.
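To make that definition concrete, here is a minimal sketch (in Python) of the Lipschitz-style check that Dwork et al. formalize: a scoring function treats similar people similarly if the gap between any two individuals' scores never exceeds the task-specific distance between them. The distance metric, Lipschitz constant, and toy numbers below are illustrative placeholders, not the paper's actual construction.

```python
import numpy as np

def is_individually_fair(scores, features, distance, lipschitz=1.0):
    """Check the Dwork-style Lipschitz condition over every pair of individuals."""
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            # Individuals who are close in task-relevant features must get close scores.
            if abs(scores[i] - scores[j]) > lipschitz * distance(features[i], features[j]):
                return False
    return True

# Toy usage: individuals 0 and 1 are nearly identical but are scored very differently.
features = np.array([[0.20, 1.0], [0.21, 1.0], [0.90, 0.1]])
scores = np.array([0.30, 0.82, 0.85])
print(is_individually_fair(scores, features, lambda a, b: np.linalg.norm(a - b)))  # False
```

The hard part, of course, is choosing the similarity metric; that choice encodes exactly the value judgments this post is about.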

“But we don’t use that kind of information…”

As a society, we have decided that certain information about individuals should be protected; in the United States, the Civil Rights Act of 1964 prohibits discrimination on the basis of race, color, religion, sex, and national origin. In an age where the social conscience is highly sensitive to inequality, it especially doesn't look good if you're using protected categories (e.g. gender, race) as determining factors in your algorithm. Facing increasing tension, suspicion and accusations, companies will often categorically state they "don't use that kind of information" in an effort to protect themselves and their image.

Most of the time this claim is b******t.

While it's unlikely these companies are bald-faced lying to you, the core problem remains that explicitly removing these features from a dataset does not eliminate an algorithm's ability to learn them implicitly (e.g. redundant encoding, omission — see Part 2). Protected categories often correlate strongly with unprotected ones, as the sketch after this list illustrates:

  • Racial features often disproportionately correlate with crime. Without adequate sampling, your data may implicitly encode that individuals with African names are more likely to be prosecuted, and your new ads recommender will pick up on this feature.
  • Unexplained long periods of unemployment might train hiring algorithms to exclude applicants, regardless of their reason.
  • Algorithms may artificially promote certain individuals with equivalent skills based on educational pedigree (i.e. the prestige of your alma mater).
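One way to substantiate (or refute) the "we don't use that kind of information" claim is to test whether the removed attribute can be reconstructed from the features you kept. Below is a minimal sketch of such a redundant-encoding check using scikit-learn; the dataset, column names, and the binary protected attribute are illustrative assumptions, not a prescribed procedure.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def protected_attribute_leakage(df, protected_col, feature_cols):
    """How well do the remaining features predict the protected attribute? (mean ROC AUC)"""
    X = df[feature_cols]   # assumed numeric or already encoded
    y = df[protected_col]  # assumed binary for roc_auc scoring
    return cross_val_score(LogisticRegression(max_iter=1000), X, y,
                           cv=5, scoring="roc_auc").mean()

# Hypothetical usage: an AUC near 0.5 suggests little leakage; near 1.0, strong proxies remain.
# protected_attribute_leakage(applicants, "gender",
#                             ["zip_code_encoded", "job_title_encoded", "years_experience"])
```

If that score comes out well above chance, claiming the algorithm "doesn't use" the protected category is, at best, a statement about column names.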
Interested in these questions? Do you want machine learning to be fair? Accountable to more than its masters? And transparent for all to understand and interpret? Consider getting in touch with Fairness, Accountability and Transparency in Machine Learning (FAT-ML): http://www.fatml.org/

Clearly, one must be very rigorous to claim that an algorithm does not use something in its computations. Like many other statistical problems, it boils down to an issue of interpretability of results. The challenge of drawing conclusions (within the limits of your experimental procedure) is well known in scientific circles. Poor experimental design, or flaws in the reasoning process, will rapidly invalidate any results in the eyes of an independent observer. In either case, the burden of proof is assumed to be on the practitioner. But is this sufficient?

As with many other moral imperatives: if the overarching institution does not embrace high standards of ethics and morality, poor practices are bound to permeate down to the algorithm design. It's not hard to imagine what happens to an overworked employee in a high-pressure, strongly competitive corporate world (think Wells Fargo's toxic sales culture). Without proper support and collective momentum, individual developers are left to juggle yet another competing interest; one that often isn't recognized or valued by those paying their salary.

Even when algorithms are not designed with the intent of discriminating against certain groups, if they reproduce social preferences in a completely rational way (e.g. incorrect sampling, biased dataset — see Part 2), they will also reproduce those forms of discrimination.
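One common way to surface this kind of reproduced discrimination after the fact is to compare a model's positive-outcome rate across groups, a demographic-parity or "disparate impact" audit. The sketch below is a toy illustration; the group labels, numbers, and the 0.8 threshold (borrowed from the US EEOC's four-fifths rule) are illustrative conventions, not a sufficient definition of fairness.

```python
import numpy as np

def disparate_impact_ratio(predictions, groups, disadvantaged, advantaged):
    """Ratio of positive-prediction rates between two groups (1.0 means parity)."""
    rate_disadvantaged = predictions[groups == disadvantaged].mean()
    rate_advantaged = predictions[groups == advantaged].mean()
    return rate_disadvantaged / rate_advantaged

# Toy decisions from a model trained on a skewed dataset.
preds = np.array([1, 0, 1, 1, 0, 0, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
print(disparate_impact_ratio(preds, groups, "b", "a"))  # ~0.33, far below the 0.8 threshold
```

Passing such a check doesn't make a system fair, but failing it is a strong hint that the training data is doing the discriminating for you.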

While many reading these words will feel disheartened, some insulted, and others will seek justice in the streets (my respects to you), my experience suggests that a large portion will briskly object: "The algorithm is just doing what it's told!" And while some may despair at this apparent nonchalance, I find the intuition is spot-on.

Algorithms learn from data gathered by someone, using a logic designed by someone, optimizing a function chosen by someone: every stop on this supply chain invites human error.

Moreover, the fast pace of technology enabled by sets of easy-to-use tools (e.g. scikit-learn, tensorflow) allows us to pivot and prototype at unprecedented velocity and with high sophistication. Very few people in ML bother nowadays with computing basic cross-correlation between their variables to gain even a low-level understanding of a dataset. The machine is supposed to do the learning, not us — but the increasing speed of computers also allows us to get to the wrong answers very fast.
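For what it's worth, that basic sanity check takes only a few lines. Here is a minimal sketch using pandas; the dataset and column names are hypothetical.

```python
import pandas as pd

def correlation_report(df: pd.DataFrame, protected_col: str) -> pd.Series:
    """Rank the remaining features by absolute correlation with a protected attribute."""
    corr = df.corr(numeric_only=True)[protected_col].drop(protected_col)
    return corr.abs().sort_values(ascending=False)

# Hypothetical usage on a loans dataset that still contains a numeric 'applicant_age' column:
# print(correlation_report(loans, "applicant_age"))
```

It won't catch non-linear proxies or interactions, but it is a cheap first look at what your "neutral" features are actually carrying.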

We learn to predict but we fail to understand.

This problem is exacerbated by practitioners who often exhibit a limited understanding of statistics. Managers don't understand algorithms. Discrimination-law experts aren't trained to audit algorithms. Engineers aren't trained in social science. Even the world's best computer scientists don't know how to interpret the mechanisms of many popular learning algorithms (read: neural networks).

Advances in machine learning generally come from discovering innovative ways to constrain the space of possibilities the machine has to search, or from finding faster methods for searching it. They often take the form of heuristics and rules of thumb that we struggle to explain with any rigor, yet that perform surprisingly well. As heuristics and assumptions are stacked ever higher, the rules become even more tangled and harder to interpret.

Note this isn't just a problem of laymen not listening to smart people; it's that everyone is playing naive when it comes to realistically considering the ramifications of machine learning.

Does it concern me?

You just made Aristotle facepalm himself.

If you're working in marketing, employment, education, search, policy, criminal justice, banking, the housing market, health, sales, or advertising: Yes.

For all others — Yes.

At one point or another we'll be forced to ask ourselves what level of moral standards we deem acceptable from the machines that assist us in our daily tasks. Determining which kinds of biases we don't want to tolerate, and how to enforce those limits, is not a simple question and is likely to fall within the realm of policy-making. It will require a lot of care and thinking, from everyone, about the many ways the technical systems we use operate in and impact civil life. The fact that these conversations seldom occur between tech companies and governmental bodies is alarming.

Algorithm and data-driven products will always reflect the design choices of the humans who built them.

Assuming otherwise is irresponsible.

In Summary

  • Algorithms will reproduce the biases of their data.
  • Algorithms are gatekeepers to opportunity.
  • Statistics != Ethics.

If we’re to pursue and gain from Machine Learning as a society, crucial aspects of responsibility, transparency, auditability, incorruptibility, and predictability are destined to be intricately tied to how we do things in the future. “Criteria applied to humans performing social functions should be considered just as applicable in algorithms intended to replace human judgment” — Bostrom. Anything short of this risks facilitating and perpetuating a considerable number of our own vices into tools we so dearly want to believe will make everyone’s lives better.

Have your own definition of fairness? A personal story to share? By all means contribute to this conversation in the comments!

I would like to thank KellyAnn Kelso for her help bouncing ideas and providing multiple iterations of editing.

Stay tuned for Part 2…

--


Sebastien Dery

Canadian in Silicon Valley, ML @ Apple #Philosophy #StoneSculptor #Divemaster