Collaborative Data Mixtape: Algorithmic Injustice [Part 1]

Abdo Hassan
digitalsocietyschool
11 min read · May 2, 2019

[This work is part of a literature-based mini-series on algorithmic injustice. In this part, the concern is to diagnose the subjectivities inherent in algorithms, and to trace back the theoretical framework of bias in data-driven systems]

Intro: Defining the Mixtape

In his book ‘I Mix What I Like: A Mixtape Manifesto’, Jared Ball re-emphasizes the use of the mixtape as a decolonial practice. He looks at mixtapes as parallel to the movement of Emancipatory Journalism: a bottom-up approach to journalism which bypasses elite reporting and institutionalized access in favor of a more grassroots approach to disseminating news. Likewise, he traces how mixtapes began in the 70s as DJs’ way of distributing their music “without sanction from a mainstream corporate industry, allow[ing] for the kinds of communication ultimately threatening to power”.

In the same spirit, I would like to choose the format of the mixtape as a way of disseminating the literature on algorithmic injustice. Mixtapes are known to disturb the coercive rhythm of a centralized narrative. Not only do mixtapes embrace hybridity of sources and bypass traditional modes of distribution, but they also allow for new collaborative modes of control and contribution. The current tracks of this theoretical mixtape are only a start, meant to be enriched by collaborative contributions and never frozen in time.

This mixtape, in particular, seeks to question why algorithms are producing injustice rather than survey the forms of algorithmically produced injustice.

♪♫ Track A: Algorithms are Everywhere

This mixtape of algorithmic injustice involves multidisciplinary literature from across the spectrum: Law, Anthropology, Sociology, New Media Studies, and Science and Technology Studies. The ubiquity of algorithms as traffic officers of data means that their influence is far-reaching. In order to create this mixtape, I seek to survey algorithms from a variety of angles: their nature, the types of bias surrounding them, their mathematical prowess, their implications for governance, and their relation to free will. In his paper Algorithmic Culture, Ted Striphas tries to characterize this ubiquity as he traces the computerization of cultural work and the reliance on algorithms over the past few decades as a cultural phenomenon. The omnipresence of algorithms and the data-driven paradigm has brought with it multiple transformations: in health, security, finance, commerce, governance, and supply chains. However, the ubiquity of algorithmic culture has also contributed to the phenomenon of the ‘black box’, as illustrated in Frank Pasquale’s book, The Black Box Society. In the book, Pasquale surveys a history of legal disputes, case studies, leaks, and the inner workings of Silicon Valley software to illustrate the bias inherent in automated decision-making systems.

♪♫ Track B: Bias. On types of bias within data-driven systems.

Further investigation of algorithmic inequality leads us to distinguish between different notions of bias. As a term, ‘bias’ is widely used as a signifier for different phenomena across disciplines. The task here is to build a diversified understanding of computerized bias. In computer and data science, statistical, machine learning, measurement, and sampling bias are all a natural part of the problem-solving process. Accounts such as Dietterich and Kong’s paper, Machine Learning Bias, Statistical Bias, and Statistical Variance of Decision Tree Algorithms, treat bias as essential “in order to generalize beyond the training data”. Such accounts recognize the natural role of bias in constituting inductive reasoning: in order for a machine to make useful generalizations about speech or object recognition, for example, it must be biased towards particular object properties and particular sounds. The same kind of bias, argues Tom Griffiths, underlies our susceptibility to optical illusions and allows for human mastery of ambiguous spoken languages.

It is not only that data science is aware of the usefulness of bias; it also views bias as one side of an optimization problem. The machine has two routes to take. The first is to make rigid generalizations with high bias: if a thing has x, then it is y. If an object has four legs, then it is a chair. The second is to account for all the corner cases and avoid rigid generalizations: only if a thing has x1, x2, and x3 with high confidence is it y. Only if an object has multiple legs, a backrest, is made out of wood, and is of human proportions is it a chair. On the first route, many more objects will be classified as chairs, but the classification will be underfitted: a lot of objects which aren’t chairs will be classified as chairs simply because they satisfy the bias of having four legs. On the second route, many more features are taken into account, but the classification is overfitted: fewer non-chairs will be mistaken for chairs, yet real chairs will now be misclassified as non-chairs because, perhaps, they lack a backrest or are made out of plastic rather than wood. The tradeoff is that, in the attempt to fight off classification bias, the system loses its ability to make useful generalizations. The best approach is to find a balance where the system is neither overfitted nor underfitted. This dilemma is known as the bias-variance tradeoff. You can find an interactive, in-depth explanation of it here.
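
To make the tradeoff concrete, here is a minimal sketch in Python; the data, the polynomial models, and the chosen degrees are illustrative assumptions of mine, not drawn from any of the cited papers. A degree-1 polynomial plays the role of the rigid, high-bias route; a degree-15 polynomial plays the flexible, high-variance route; a middle degree tends to generalize best.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # The "ground truth" the models try to recover (an illustrative choice).
    return np.sin(2 * np.pi * x)

# A small noisy training set and a larger held-out test set.
x_train = rng.uniform(0, 1, 30)
y_train = true_fn(x_train) + rng.normal(0, 0.2, 30)
x_test = rng.uniform(0, 1, 200)
y_test = true_fn(x_test) + rng.normal(0, 0.2, 200)

for degree in (1, 4, 15):  # rigid (high bias), balanced, flexible (high variance)
    coeffs = np.polyfit(x_train, y_train, degree)  # fit a degree-d polynomial
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

Running this typically shows the degree-1 model doing poorly on both sets (underfitting), while the degree-15 model drives its training error down yet sees its test error balloon (overfitting), which is exactly the dilemma described above.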

However, there are often other types of bias-based dilemmas which aren’t directly addressed by machine learning systems. This is because these systems are embedded in a technosocial superstructure, a triad of societies, coders, and systems. In their paper Bias in Computer Systems, Friedman and Nissenbaum detailed the social biases inherent in computing as early as 1996. Using online flight reservation systems as benchmarks, three types of bias were detailed: pre-existing bias, technical bias, and emergent bias. Pre-existing bias relates to how already existing social bias manifests itself in the attitudes of the coders, engineers, and technical staff creating, deploying, and maintaining algorithmic systems. Meanwhile, technical bias covers the different biases inherent in computer systems themselves, such as the absence of truly random mechanisms, hardware restrictions, and normalization issues. Lastly, emergent bias is a bias which emerges with use. This includes phenomena such as the renewal of human knowledge and the mismatch between the population using the system and the population for which the system was designed.

Track B leads us to ask questions such as: How can we account for different types of bias in our designs? How do we avoid the reproduction of pre-existing bias? How do we use algorithms in a manner that accounts for emergent bias?

♪♫ Track C: Critical Alert, Algorithms are Hard to Study

A discussion of bias and inequality in relation to algorithms must stem from both a critical and an informed look at the notion of bias itself. In his paper Thinking Critically About and Researching Algorithms, Rob Kitchin discusses the different epistemological and methodological challenges of studying algorithms. Kitchin sees not only the black-boxing of algorithms as a catalyst for injustice, but also their heterogeneous, distributed, ontogenetic, performative, and contingent nature as obstacles for those critically engaged in algorithmic design. Algorithms are fluid, constantly being patched and updated; algorithmic work often proves to be emergent and reflexive. For example, Facebook’s famous EdgeRank algorithm adapts itself to each user, molding itself based on the interactions surrounding a single node.

Track C leads us to ask questions such as: What should we be mindful of when studying algorithms? What constitutes an algorithm? What are the different evolutionary phases of an algorithm?

♪♫ Track D: Dawn of Algorithmic Governance. Can algorithms govern?

[Image: President Richard Milhous Nixon, Futurama]

An algorithm’s ubiquity and concealment, paired with its ability to make decisions in decisive domains, has given rise to a new wave of critical literature. The increased space in technosocial systems for automated and algorithmic decision-making has raised multiple questions surrounding surveillance, justice, and accountability for those decisions. In his book Code 2.0, Lawrence Lessig describes a new cyber-regime where ‘code is law’. He positions code as a power in itself, a legislative object rather than a legislative subject. In this context, code which is used to prescribe the actions of a self-driving car is a legal text in its own right and must be treated as such. Meanwhile, in Protocol Politics: The Globalization of Internet Governance, Laura DeNardis examines a different piece of technosocial structure: internet protocols. DeNardis examines the technical protocols governing the flow of information online, using them as vehicles for a discussion of the role of government and politics in information flows.

However, controversies surrounding algorithmic governance can also be discussed with regard to specific technologies. Lucas Introna and David Wood, in their essay “Picturing Algorithmic Surveillance”, give an account of the biases embedded within facial recognition algorithms, as well as the political implications of the automated decision-making process. Meanwhile, in “Defining the Web: The Politics of Search Engines”, Introna and Nissenbaum define and characterize a regime of systematic inclusion/exclusion orchestrated by search engines. Since search engines index only a fraction of the web, the paper exposes these indexing mechanisms, giving an account of how the developers, designers, and maintainers of the engines shape this exclusionary regime.

Track D leads us to ask questions such as: Who do the algorithms govern? Who governs the algorithms? Who is held accountable for the governing of data? How can we regulate the algorithmic regulation of data?

♪♫ Track E: Even Math? If Algorithms are math, shouldn’t we trust them‽

Algorithms are mathematical abstractions of a process. Since mathematics is often linked to a paradigm of objectivity, the conversation around algorithms is often detached from political and social thought. However, multiple accounts have critiqued this detachment, arguing that mathematical practice is itself shaped by subjective and political choices. The political and biased nature of mathematics has been discussed from multiple angles.

One angle deals with mathematics’ implication in formalizing relationships between things, objects, people, and phenomena. The result of these formalizations is ontologies or algebras describing social phenomena. In the essay ‘Bastard Algebra’, Nick Seaver describes these data-driven relational ontologies as bastard children of both mathematics and the social sciences, carrying inherited burdens that neither field would like to claim. Seaver also draws connections with mathematical understandings of kinship, and how some algebraic imaginations are inherited rather than objectively synthesized.

Another angle of the politicization of mathematics addresses specifically the mechanics of machine learning models as they are used to reinforce inequality en masse. In her book, Weapons of Math Destruction, Cathy O’Neil argues that the opacity, pervasiveness, and lack of regulation of machine learning models make them ideal candidates for perpetrating injustice. She uses examples from insurance, advertising, education, and law enforcement to show how the poor and disenfranchised are systematically affected by machine learning systems. On the same note, Alain Desrosières’ The Politics of Large Numbers describes a history of appropriating statistical reasoning to fit national agendas, to mislead rather than inform.

Track E leads us to ask questions such as: If math is inevitably instrumentalized, how can we instrumentalize it for social good? How do we regulate the use of machine learning models? What are the possibilities of forming non-bastard algebras and ontologies of things?

♪♫ Track F: Free Will. Are Algorithms the antithesis of free will?

Psychopolitics: Neoliberalism and New Technologies of Power by Byung-Chul Han offers a critique of algorithmic injustice grounded in an analysis of neoliberalism. The book, amongst other things, gives a contemporary view of the neoliberal illusion of ‘freedom’ as mediated by data-driven algorithms. It uses the ubiquity of algorithms to further its points about neoliberal modes of self-enslavement and self-subjection. Han sees data-driven technology not as an object but as a channel for distributing hierarchies of power. The algorithm has the ability to ‘predict’ the fates of its subjects, subsequently caging them. It reduces the user’s intentions to predictions, going against basic notions of psychological ‘free will’.

Track F leads us to ask questions such as: How do algorithmic and data-driven systems affect the agency of their subjects? How do we expose the hierarchies of power implicit in the inner workings of algorithms? How do we uncage the algorithmic subjects?

Outro: Defining the Mixtape

This loose compilation of literature and questions about algorithmic injustice resembles the start of multiple threads from which multiple discussions should stem. It tries to launch an investigation into the inevitability of bias and the different lenses through which we can understand why algorithmic injustice has become a reality. The next step would be to survey not only the forms in which these injustices manifest themselves but also ways to unwork them.

Bibliography

Reigeluth, Tyler. 2014. “Why Data Is Not Enough: Digital Traces as Control of Self and Self-Control.” Surveillance & Society 12 (2): 243–254. http://library.queensu.ca/ojs/index.php/surveillance-and-society/article/view/enough

Ball, Jared A. I Mix What I like!: a Mixtape Manifesto. AK Press, 2011.

DeNardis, Laura. Protocol Politics: the Globalization of Internet Governance. The MIT Press, 2014.

Desrosières, Alain. The Politics of Large Numbers: a History of Statistical Reasoning. Harvard University Press, 2011.

Dietterich, T. G. & Kong, E. B. (1995). Machine learning bias, statistical bias, and statistical variance of decision tree algorithms. Technical Report, Department of Computer Science, Oregon State University, Corvallis, Oregon. Available from ftp://ftp.cs.orst.edu/pub/tgd/papers/tr-bias.ps.gz.

Friedman, Batya, and Helen Nissenbaum. “Bias in Computer Systems.” Computer Ethics, 2017, pp. 215–232, doi:10.4324/9781315259697-23.

Han, Byung-Chul, and Erik Butler. Psychopolitics: Neoliberalism and New Technologies of Power. Verso, 2017.

Introna, Lucas D., and David Wood. 2004. “Picturing Algorithmic Surveillance: The Politics of Facial Recognition Systems.” Surveillance & Society 2 (2/3): 177–98. http://surveillance-and-society.org/articles2(2)/algorithmic.pdf

Introna, Lucas, and Helen Nissenbaum. 2000. “Defining the Web: The Politics of Search Engines.” IEEE Computer, 54–62. http://www.nyu.edu/projects/nissenbaum/papers/Defining%20the%20Web.pdf

Kitchin, Rob. “Thinking Critically About and Researching Algorithms.” SSRN Electronic Journal, 2014, doi:10.2139/ssrn.2515786.

Lessig, Lawrence. Code: Version 2.0. SoHo Books, 2010.

O’Neil, Cathy. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Penguin Books, 2018.

Pasquale, Frank. The Black Box Society: the Secret Algorithms That Control Money and Information. Harvard University Press, 2016.

Seaver, Nick. 2015. “Bastard Algebra.” In Tom Boellstorff and Bill Maurer, eds., Data, Now Bigger and Better! Chicago: Prickly Paradigm Press. https://socialmediacollective.files.wordpress.com/2015/11/4114c-seaver-bastardalgebra.pdf

Striphas, Ted. “Algorithmic Culture.” European Journal of Cultural Studies, vol. 18, no. 4–5, 2015, pp. 395–412., doi:10.1177/1367549415577392.
