How Data Can Drive Inequality

Lauren Toulson · Published in CARRE4 · 7 min read · Feb 8, 2021
Do you know how AI sees you? Do you know if it discriminates against you?

Algorithms are being used as diagnostic tools in healthcare, as systems for identifying promising compounds for new medicines, vaccines and treatments, and to assess who is most likely to commit a crime, vote conservative, get an A in their exam, or buy a new TV. The data and solutions they provide appear to be a sophisticated intervention for driving change and making the world a safer, richer and healthier place. However, as Morozov notes in his award-winning book, this tendency towards technological solutionism neglects the imperfections built into these digital technologies by design, and the consequences that follow from them.

Many algorithms are built as tools for overcoming human bias in areas often criticised as being vulnerable to inequality, such as racial discrimination in the justice system and racial, gender and social discrimination in recruitment.

However, artificial intelligence is trained on human data, and a lot of the time this data does not include a diverse range of individuals.

“The AI issues of bias can worsen if it is not fuelled with quality data,” says Márcio Burity, diplomat and panelist at our upcoming AI Summit.

Workplace and Recruitment

In her book Invisible Women, Criado-Perez highlights the ways that recruitment algorithms in tech companies discriminate against women. Data collected on what makes a good programmer has indicated that a strong applicant should have an active online presence in communities on sites like GitHub and Stack Overflow; visiting manga sites was even found to be associated with being a good programmer. The trouble with this, as Criado-Perez identifies, is that women perform 75% of the world’s unpaid labour (cleaning, care work, grocery shopping, domestic duties) and have on average 30 minutes of free time per day to men’s 6 hours. Female candidates for programming positions are therefore far less likely to have had the same time to invest in building an online presence as male candidates, and so the algorithm deems them less employable. By doing detailed research into the potential gender differences between programmers, designers of AI recruitment systems could build systems that take more neutral factors as indicators of applicant suitability.
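As a minimal sketch of this mechanism (the scoring formula, weights and feature names below are assumptions for illustration, not taken from any real recruitment system), consider how rewarding public online activity opens a gap between two equally skilled applicants:

```python
# Hypothetical sketch: how an "online presence" feature encodes a free-time gap.
# Weights and feature names are illustrative, not from any real vendor's system.
from dataclasses import dataclass

@dataclass
class Applicant:
    test_score: float       # performance on a skills assessment, 0-100
    github_commits: int     # public activity, largely built in spare time
    stackoverflow_rep: int  # public activity, largely built in spare time

def presence_weighted_score(a: Applicant) -> float:
    # Rewards visible community activity, which tracks available free time.
    presence = 0.3 * min(a.github_commits / 500, 1) + 0.2 * min(a.stackoverflow_rep / 2000, 1)
    return 0.5 * a.test_score + 100 * presence

def assessment_only_score(a: Applicant) -> float:
    # Relies on the assessment alone, a factor less entangled with unpaid labour.
    return a.test_score

# Two equally skilled applicants; one has far less spare time for public activity.
full_time_hobbyist = Applicant(test_score=90, github_commits=600, stackoverflow_rep=2500)
time_poor_carer = Applicant(test_score=90, github_commits=40, stackoverflow_rep=100)

print(presence_weighted_score(full_time_hobbyist),  # 95.0
      presence_weighted_score(time_poor_carer))     # 48.4 - a gap created by proxy features
print(assessment_only_score(full_time_hobbyist),    # 90
      assessment_only_score(time_poor_carer))       # 90   - the gap disappears
```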

Photo by Jefferson Santos on Unsplash

Another way that women are being disadvantaged by data systems that do not see the whole picture is in academia. Women are up against bias in publishing their work: they are twice as likely to be published under double-blind review, are less likely to get their publishing done because they are loaded with more work than men (for instance receiving more requests from students for support), and are also less likely to get tenure. Data identified this gap, finding that 70% of men with tenure are married with children compared to just 44% of women, and concluded that because women were likely to take time off for maternity during the 7 years it takes to get tenure, their chances of academic success were lower than their male counterparts’. In response, a policy was created that gave academics (female AND male) an extra year to achieve tenure for every child they have. The trouble is that the time demands of pregnancy and early parenthood are not gender-neutral: men are not dealing with morning sickness or breastfeeding, for instance, so many could use the extra year to publish while women were occupied with care. The result was an increase in men receiving tenure and a decrease in women receiving it. When creating policy from data, it is important to ensure the policy is not gender-blind and reinforcing inequality. Policy makers and data analysts alike must engage critically with the question of who is marginalised by their systems before taking action.

Healthcare

As documented by Dusenbery, many healthcare treatments, medicines and much of our scientific knowledge about disease have resulted from studies and trials only ever tested on white men, including treatments for diseases specific to the female body, like ovarian cancer. This is because testing on women does not create ‘clean’ data; the influence of hormonal cycles is deemed too much trouble for scientific study because it messes up results, and this has left a massive gap in scientific knowledge about how women experience symptoms differently from men. Women have died from heart attacks after being dismissed as just having anxiety, because heart-attack symptoms often present far less severely in women than in men yet are just as devastating. Additionally, many medicines have resulted in thousands of birth abnormalities because their impact on pregnancy was never investigated. Currently, treatments do not need to have been tested on women in order to be approved.

Photo by National Cancer Institute on Unsplash

As AI tools start to be implemented in healthcare settings to assist doctors in diagnosis and offer options for treatment, it is apparent that they are working from data that will further reinforce the inequalities faced by those outside the ‘status quo’ (white males) of scientific trials. They will likely be unable to identify gender-specific symptoms and will therefore continue to leave women at risk of the wrong treatment, or no treatment at all. Until the scientific knowledge that AI systems are trained on represents everyone equally, the advances these systems predict will fall short of their aim to improve science.

Humayun Qureshi, co-founder of Digital Bucket Company, says “Working through critical problems and solutions within AI systems is at the heart of what we do here. Training our teams to think ethically about their designs, no matter their specialism, is paramount to leaving the world a better place. If something we built creates bias, we won’t release it until it’s fixed, regardless of the deadline. Ethics shouldn’t be sidelined to one department, it should be the focus for every employee.”

Online Dating

You may think that the dating algorithm that shuffles your choice of matches offers you options to swipe through at random. You may even know that the algorithm in fact weighs up each person’s ‘like-ability’ score and circulates their profile more widely if that score is higher. But you may not be aware of how good the algorithm is at reproducing racial bias. While some apps and sites, like Grindr and OKCupid, have ethnicity filters that leave users feeling discriminated against and exploited for their ‘exoticness’, others work without explicit filters but nevertheless marginalise and exploit racial minorities, even when their designers insist that their algorithm doesn’t work that way. The algorithm may not be designed to use ethnicity filters, but neither is it designed to challenge the prejudice it reflects, so it makes certain people far less visible in the system than others. For instance, black men and women are less likely to be contacted by white users than the other way round, because white users are far more visible.

This is a reproduction of discriminatory beauty ideals being trained into AI systems. In 2016, an AI was trained on images of women in order to judge a beauty contest: after over 6,000 entries from more than 100 countries, the AI judge selected winners who were almost all white. Data represents our values, so if our values say that beautiful faces are white, this is what our algorithms will learn. The same thing is happening on dating apps: algorithms that share profiles by popularity reinforce racial prejudice by learning to prefer white faces and marginalise other ethnicities.
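A toy simulation (not any real app’s code; the group labels, like rates and score increments are all assumptions) can illustrate how a popularity-weighted feed turns a small bias in who gets liked into a large gap in who gets seen:

```python
# Illustrative sketch of a popularity-weighted feed amplifying a small initial bias.
import random
random.seed(0)

profiles = [{"group": "A" if i < 50 else "B", "score": 1.0} for i in range(100)]

def show_feed(profiles, k=10):
    # Profiles with higher "like-ability" scores are shown more often.
    weights = [p["score"] for p in profiles]
    return random.choices(profiles, weights=weights, k=k)

def biased_like(profile) -> bool:
    # Assume users like group A slightly more often, mirroring existing prejudice.
    return random.random() < (0.55 if profile["group"] == "A" else 0.45)

for _ in range(10_000):  # simulate many feed impressions
    for p in show_feed(profiles):
        if biased_like(p):
            p["score"] += 0.1  # being liked raises future visibility

visibility_a = sum(p["score"] for p in profiles if p["group"] == "A")
visibility_b = sum(p["score"] for p in profiles if p["group"] == "B")
print(round(visibility_a / visibility_b, 2))  # well above 1: the gap compounds over time
```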

Photo by Alexander Sinn on Unsplash

Where should the line be drawn between preference and prejudice? Apps should try to mitigate discrimination, because if the design lets users act on prejudice, they will. No regulator or institution has yet responded to these issues; for now any response is self-led by the dating app companies, driven by feedback from their communities and by pressure from researchers. One solution researchers have suggested is to replace manual race filters with other filters for connection, steering away from racial filtering altogether; another is to stop matching people on algorithmically ‘inferred’ taste and instead put more weight on their explicitly stated preferences, which would mitigate how race-based beauty ideals are reflected by the algorithm. It is the role of designers to examine their work and ensure it is not reflecting historical patterns of discrimination. However, in the absence of both advice and governance from legal and government authorities, designers are not necessarily supported or motivated to do anything about these issues.
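As a rough sketch of that second suggestion (the 0.8/0.2 weighting and the score names are assumptions for illustration, not a documented design), a match score could lean on explicitly stated connection preferences rather than on taste inferred from swipe history:

```python
# Hypothetical weighting that favours explicit preferences over inferred taste,
# which tends to absorb racial bias from past swiping behaviour.
def match_score(explicit_overlap: float, inferred_taste: float,
                explicit_weight: float = 0.8) -> float:
    """Both inputs are similarity scores in [0, 1]."""
    return explicit_weight * explicit_overlap + (1 - explicit_weight) * inferred_taste

# A candidate who matches the user's stated interests well but scores low on
# swipe-history-inferred taste still ranks highly under this weighting.
print(match_score(explicit_overlap=0.9, inferred_taste=0.2))  # 0.76
```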

This blog has aimed to highlight the importance of a deep and critical analysis of the implications of data for inequality in society. We are still far from fool-proof solutions to the biases in data-based systems, but starting a conversation that raises awareness among policy-makers, data designers and the general public is a crucial first step towards overcoming the reproduction of inequality.

The following blogs will continue the discussion of the critical concerns in AI today, including how algorithms can erode democracy through political polarisation, behavioural conditioning and information silos, and then how we might govern AI: whether ethical standards can be applied globally, or whether we need different regulations for diverse cultures.

This blog is written by Lauren for Digital Bucket Company, who are hosting their full-day AI Summit on 30th March 2021, with leaders in Tech and Government around the world joining to discuss the key issues in AI: Bias and Discrimination, Privacy and Governance, Data Ethics, Women in Tech and the Future of AI. Stay up to date with the ticket launch on their page.

Studying Digital Culture, Lauren is an MSc student at LSE and writes about Big Data and AI for Digital Bucket Company. Tweet her @itslaurensdata