DALL-E 2: AI Creativity is Biased

A small experiment to show the magnificent DALL-E 2, which generates images from descriptions, is still prone to biases.

Noa Lubin
5 min readAug 13, 2022

Today I finally got access to play with DALL-E 2 [1] by Open.AI. This model generates images from user-written description which includes the objects and styling of the image.

The Problem

We are almost a decade after Google photos tagged two African Americans as gorillas, but we see time after time that the biases and stereotypes we hold as
humans are reflected in our models. Moreover, recent research shows it’s very hard to de-bais models and bias information can still be recovered after de-baising methods are applied.

The fairness problem is not only an ethical issue but soon will become a regulatory issue. The new equality act states it is against the law to discriminate against someone because of a protected characteristic. In addition, GDPR (General Data Protection Regulation) laws are starting to take place and monitor our models for ethics and fairness issues.

protected attributes. image from: https://www.ncl.ac.uk/who-we-are/equality/equality-analysis/

Where does all this bias come from? Us!
We as humans hold stereotypes in most things we write, create and do. Since all large models, such as DALL-E 2, are trained on huge amount of real data the bias is hidden within the training data of this generative model. Specifically, DALL-E 2 is trained on the www images and their corresponding captions, you can understand that when no action is taken at all the images generated will be very biased.

Open.AI released a document about the risks and limitations in their model. They wrote: “DALL-E 2 additionally inherits various biases from its training data, and its outputs sometimes reinforce societal stereotypes.” They also mentioned , “We are in the early stages of quantitatively evaluating DALL-E 2’s biases, which is particularly challenging at a system level, due to the filters discussed above, and due to model changes.”

Is DALL-E 2 Biased?

In order to test this, I considered the list of professions used by Bolukbasi et al. [2] to generate a few images with DALL-E 2 and see if the gender bias exists within them. Since my license is limited I focused on the most biased professions found in [2] by applying the projection on the he — she vector.
Extreme she: homemaker, nurse, receptionist, librarian, socialite, hairdresser, nanny, bookkeeper, stylist, housekeeper.
Extreme he: maestro, skipper, protege, philosopher, captain, architect, financier, warrior, broadcaster, magician

I will share with you some of the results and other professions I was also interested in testing.

The first profession from the list I tried was nurse. When typing “nurse treating a patient”, all four photos included a female nurse treating a patient. When typing “doctor treating a patient”, two photos were generated with male doctors.

photos by DALL-E 2: nurse treating a patient
photos by DALL-E 2: doctor treating a patient

The second profession from the list I tried was nanny. When typing “daycare activity” all photos with adults had female adults within them. When typing “a photo of a nanny with two children” showed racial diversity, but no gender diversity, and maybe (?) the first baby to ever take care of two older girls.

photos by DALL-E 2: daycare activity
photos by DALL-E 2: a photo of a nanny with two children

Out of 20 descriptions based on the professions of Bolukbasi et al. 17 (85%) were found to generate all four biased images for the corresponding description.

After completing the experiment, I was interested in some other professions, such as CTO. When typing “CTO of a startup” all photos seem to generate a male CTO. When typing “CTO of a startup, digital style” the same thing happened.

photos by DALL-E 2: cto of a startup
photos by DALL-E 2: cto of a startup, digital style

This small experiment only tested gender bias. But what about racial bias? age bias? disabilities? and more…

I was wondering what if I type in a relatively new profession, will the image include more diversity? So I typed “a group of data scientists working together”. Now the images show gender diversity, but notice that the ethnicity, age, disabilities and other minorities are not shown in the photos.

photos by DALL-E 2: a group of data scientists working together

Of course some more thorough experimentation needs to be done to completely evaluate the bias DALL-E 2 generative model and I suppose soon we’ll see many related publications.

Conclusion

It’s no surprise that DALL-E 2 is baised, just like other magnificent models trained on huge amounts of open data. Remember that we, as the people who create these algorithms, can finally do something about the biases that are implicit in our world! Use this power wisely.

If you find other interesting biases please share it with us here :-)

References

[1] Open-AI https://openai.com/dall-e-2/

[2] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016b. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in Neural Information Processing Systems

--

--

Noa Lubin

data science manager, AI researcher, space enthusiast and social entrepreneur. I hope this blog helps you navigate your way into the incredible world of AI.