We Are the Ones Making Algorithms Discriminatory: Why Are Minorities Invisible in the Digital World? (Examples from Turkey)

Asmin Ayçe İdil Kaya
Women in Technology
8 min read · Oct 20, 2023
Image Source: Alina Constantin / Better Images of AI / Handmade A.I / CC-BY 4.0

Did you know that algorithms can be "sexist, racist, discriminatory, and biased"?

This debate has been ongoing in recent years, especially in Western countries. Research has shown, for example, that facial recognition systems on smartphones often fail to recognize Black women. These models are built without Black women in mind: the data used to train them comes predominantly from white individuals. As a result, the algorithms work poorly for Black people, and technology manufacturers end up perpetuating discrimination against them.

In Turkey, during the election process, HÜDAPAR (a political party in coalition with the current government) was criticized for its "regressive discourse on women." Tweets about the party and this discourse are claimed to have been directly censored or "shadowed" by Twitter, meaning that tweets containing such discriminatory discourse were not highlighted and received reduced visibility.

There are myriad algorithms in the digital world, and each has a different production process. Moreover, discrimination can be reproduced within these production processes. When we think about algorithmic discrimination against minorities in Turkey, it becomes evident that it is primarily about how discriminatory values in society feed these algorithms as a data source, and how these data sources are used by the creators of artificial intelligence throughout the process.

I put this question to Sinem Görücü, who works on social justice in artificial intelligence. Görücü, who carries out projects with international institutions such as the Goethe Institute and the United Nations women's unit and teaches at universities, approaches artificial intelligence and algorithms from a critical feminist perspective, emphasizing that algorithms can reflect societal discrimination like a mirror:

“Artificial intelligence processes what it takes from the world, produces something from it, and returns it to the world. By ‘the world,’ we mean the data sets these systems are trained on: reflections of our current order and our history, created by humanity. That is the primary cause of discrimination. Human biases and worldviews also play a role in how this data and these algorithms are used throughout the process. In addition, how the artificial intelligence ecosystem operates, where the money in the artificial intelligence industry comes from and where it goes, whose interests are at stake, which models are produced with which motivations, in other words what the capital driving this technology aims to do with it, can be another fundamental reason for discrimination in this field. So yes, people are biased, discriminatory, and prejudiced, but we are also the ones who create artificial intelligence. We provide the data, we write the algorithms, and we determine how decisions will be made based on which data.”

Gaps and Shortcomings in Data

According to Görücü, it is possible to categorize the discrimination resulting from data into a few headings.

  1. One of these is gaps and shortcomings in data, meaning that some groups are not represented, or are represented very poorly, in the collected data. For example, in Turkey, 80% of Syrian and Roma women cannot read or write Turkish. Since their access to technology is also lower, their representation in many data sets, and consequently in the algorithms trained on them, can be very low (see the sketch after this list).
  2. Another factor is the discrimination that arises when algorithms process user behavior and feedback as data. Developers need to be more sensitive in this regard because users continually convey their own biases and societal judgments to algorithms, which continue to learn from them.
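To make the first point concrete, here is a minimal, hypothetical sketch in Python (synthetic data, scikit-learn, invented group labels) of how a model trained on data in which one group is barely represented can end up far less accurate for that group. It is an illustration of the mechanism only, not a real system or a real dataset.

```python
# Synthetic illustration of a "data gap": a model trained mostly on a majority
# group performs much worse for an underrepresented group whose patterns differ.
# All numbers and group labels here are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_group(n, informative_feature):
    """For each group, a different feature actually predicts the outcome."""
    X = rng.normal(size=(n, 2))
    y = (X[:, informative_feature] > 0).astype(int)
    return X, y

# Majority group: 5,000 training examples. Underrepresented group: only 100.
X_major, y_major = make_group(5000, informative_feature=0)
X_minor, y_minor = make_group(100, informative_feature=1)

model = LogisticRegression().fit(
    np.vstack([X_major, X_minor]),
    np.concatenate([y_major, y_minor]),
)

# Evaluate on fresh samples from each group: the model has mostly learned the
# majority group's pattern, so accuracy drops sharply for the minority group.
for name, feature in [("majority", 0), ("minority", 1)]:
    X_test, y_test = make_group(2000, informative_feature=feature)
    print(name, "accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 2))
```

Real systems are of course far more complex, but the pattern is the same: whoever is missing from the data tends to be missing from the model's competence.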

Based on the examples provided by Görücü, we can evaluate the emergence of data-driven discrimination in Turkey as follows:

“For example, in Turkey, as hate speech targeting minority communities or LGBT individuals increases on social media platforms, the algorithms can be said to internalize this hatred, normalize it, and facilitate its spread, because many social media algorithms learn what to highlight, what counts as objectionable content, and which content will receive more engagement from different user groups based on the data users provide. What counts as offensive language varies from country to country and from language to language, and taking all of these particular situations into account is important. However, this issue is often not given the importance it deserves, especially in a technology sector led by a handful of Western men in Silicon Valley. That is why, to understand how an algorithm works, it is important to look at what it has learned and amplified locally. Twitter can detect hate speech, yes, but it cannot detect everything. Why? When hate speech is expressed indirectly, sarcastically, through references to historical events, through connections the algorithm is not familiar with, by mixing two languages, or by coining new words in English or Turkish, the algorithm may fail to detect it. Consequently, it does not filter such statements; instead of removing or flagging them as objectionable, it learns from them and normalizes this content in its own terms. We saw examples of this in various tweets related to Madımak, for instance. (Madımak refers to a massacre of Alevi people that took place in Turkey in 1993.) Much content that indirectly called for violence, or normalized it through metaphors and local references, was not deemed objectionable and was not removed by the platforms.”

Even Public Transportation Systems Can Be Affected

“Data gaps, or representation gaps, can also result from who has a smartphone in their hands, which makes this directly an issue of which social groups are represented in the data. Nowadays, location data collected from our mobile phones can be used in planning transportation systems. These data can show peak hours in certain areas, how much different groups travel within the city, and so on. Collecting and using such data is certainly a matter of data ethics, but care must also be taken with data representation. Due to the economic crisis in Turkey, some groups' connection to digital devices and the internet is weakening; certain age and income groups, especially women who participate in the labor force while also carrying household duties, face growing difficulties in access. This leads to a significant inequality in how specific groups are represented in location-based data sets. What risks can this pose? If you take care of multiple children or an elderly person, do not own a smartphone, are not constantly connected to the internet, and have a complex travel pattern, say, leaving home, dropping off one child, going to the market, picking up another child on the way back, delivering food to the elderly, and going to school, that pattern may never appear in the collected data. Because location data collected from digital devices is assumed to represent the general population, as long as it ignores these people, local transportation systems will not be designed for them. When linear transportation systems are designed, these travel networks go unnoticed, and people working in jobs like home care are not taken into account.”

Stereotyped Descriptions

“Another example of discrimination fueled by user feedback is what we call ‘generative artificial intelligence’ models. Let’s take Midjourney, a recently popular artificial intelligence system that turns text into images. You say, ‘Draw a beautiful woman,’ and it presents you with four different options for what you described. You then choose the image closest to what you imagined, or you pick the closest one and say, ‘Create something similar to this.’ Based on your preferences, it generates more and more options you are likely to like. For instance, when you say, ‘Draw a beautiful woman,’ it looks at the data set it was trained on, sees that blond women are more often associated with the word ‘beautiful,’ and generates more images of blond women resembling the examples it has seen. This is the most basic example. But when you ask it to draw a Roma woman, does the artificial intelligence know anything about Roma women beyond the stereotypes in its data set? Does it know anything at all? Does it have any contextual knowledge? Will it assume that Roma communities in Turkey are the same as those in North America?

“I conducted some experiments on this at one point. When you say, ‘Draw a traditional Turkish woman,’ it looks at the data set and generates women wearing Indian-style clothing, while the setting looks like Turkey. There may be Turkish motifs on the clothing, for example, but in the generated images the women wear a shawl more typical of Asia… This is just one example; there are many such biases.”
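As a thought experiment, the preference feedback loop described above can be simulated in a few lines of Python. The sketch below is entirely hypothetical: a made-up "model" that samples one attribute per generated image, a simulated user who always picks the stereotypical option when it is offered, and a crude update rule standing in for learning from preference data. It is not how Midjourney or any real generator works; it only shows how this kind of feedback can amplify whatever the training data already over-represents.

```python
# Toy simulation of a preference feedback loop in a text-to-image system.
# Everything here (attributes, the 60% starting share, the update rule) is an
# invented assumption used purely to illustrate amplification.
import random

random.seed(42)

# Assumed starting point: the training data links "beautiful woman" with blond
# hair 60% of the time (a hypothetical number).
attribute_weights = {"blond": 0.6, "dark-haired": 0.4}

def generate_options(weights, k=4):
    """Sample k candidate 'images' (here, just attribute labels) from the model."""
    attrs, probs = zip(*weights.items())
    return random.choices(attrs, weights=probs, k=k)

def simulated_user_choice(options):
    """A user who has internalized the stereotype picks a blond image if offered."""
    return "blond" if "blond" in options else random.choice(options)

def update(weights, chosen, lr=0.05):
    """Nudge the model toward the chosen attribute, a stand-in for learning
    from engagement and preference data."""
    weights = dict(weights)
    weights[chosen] += lr
    total = sum(weights.values())
    return {attr: w / total for attr, w in weights.items()}

for step in range(50):
    options = generate_options(attribute_weights)
    chosen = simulated_user_choice(options)
    attribute_weights = update(attribute_weights, chosen)

# The "blond" share drifts well above its initial 60% after a few dozen rounds.
print(attribute_weights)
```

The exact numbers are meaningless; what matters is the direction of the drift, which is the amplification dynamic Görücü describes.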

Filing Complaints on Social Media Is Important

Görücü says that stereotypes in the data may also result from a lack of representation among users.

The low access of Arab immigrant women in Turkey to technology may contribute to the spread of stereotypical ideas about them. Görücü emphasizes:

“The feedback provided by Arab immigrant women, one of the largest minority groups in Turkey, to algorithms on Turkish social media may be low due to a lack of access to technology and language barriers. For example, there are common perceptions about Arab women in Turkey, and they may even internalize these perceptions themselves. Therefore, when they encounter such things on the internet, they may not have a tendency to report them to the platform. However, it is important to be aware that people are training these algorithms with their behaviors and feedback, even if they are not aware of it. Therefore, when we see something wrong but do not report it, that judgment continues to be visible, and subsequently, the algorithm continues to generate similar judgments without any issues. That’s why it’s important to be active on this issue at all times.”

Görücü also pointed to a recent and important example from Twitter during the election period in Turkey, when the platform may have applied a censorship policy that could significantly affect women.

According to Görücü, Twitter’s algorithm may have reduced the visibility of tweets related to HÜDAPAR (a conservative Kurdish independentist party in coalition with the current government) during this period: these tweets received significantly less visibility than other tweets by the same users. Another detail of this censorship is that while the party programs of other political parties could be shared as links on Twitter, HÜDAPAR’s party program could not be shared as a link; in a sense, Twitter blocked this link.

Therefore, for political or economic reasons, social media platforms can take discriminatory action against minority or disadvantaged groups directly within their algorithms.

What Should Be Done?

The discriminatory aspects of algorithms can lead to societal problems, and experts believe this issue should become a subject of public debate.

Görücü also notes that fighting discriminatory artificial intelligence is a long and comprehensive process, but that progress can be made by increasing literacy in this field.

Author and translator: Asmin Ayçe İdil Kaya

Original Source: https://9koy.org/algoritmalari-ayrimci-yapan-biziz-azinliklar-neden-dijital-dunyada-yok-gibi.html

References:

  1. Joy Buolamwini and Timnit Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification."
  2. Teoman Akpınar, "Literacy Rates of Syrian Women in the Context of Social Policy in Turkey" (on the literacy rates of Syrian refugee women in Turkey).
  3. Literacy rates of Roma women: data from surveys conducted as part of the Social Inclusion Support Operation (SIROMA) Project. https://ekmekvegul.net/sectiklerimiz/gunun-rakami-roman-kadinlarda-okuma-yazma-orani-cok-dusuk
  4. "Access to Technology for Roma Women: Hilal Tok," an article on Roma women's access to technology and their experiences during the COVID-19 pandemic ("I collect mallow to feed my child").



Asmin Ayçe İdil Kaya, 27, is a journalist from Turkey. She specializes in women's and gender studies and researches gender discrimination in knowledge production.