Does this AI make me look fat?
Why is it so hard to create “normal” bodies with generative AI?
Like a lot of you, I recently tried using AI to ditch some annoying tasks, in this case LinkedIn marketing for the company I freelance for.
Stock photography was letting me down, so I used DALL-E to request a photo of a woman jogger. Much has already been said about bias in AI, and to DALL-E’s credit, I got a mix of races to choose from in its output.
But every single jogger lady was a cake pop: a body so skinny that the head riding on top looked like a cake ball on a stick.
I tried adding “curvy” but that was worse. DALL-E added some heft but — you guessed it — only in the chest.
“Overweight woman jogging” got me an unsafe content warning, and DALL-E refused to generate anything at all. (It seems images of larger women are unsafe but skinny ones aren’t.)
Of course, DALL-E is only remixing what it’s been fed. And that would be terabytes of unrealistic imagery surrounding the human body that has filled the Internet for the last 20 years. DALL-E thinks extremely thin is actually average. Is it any wonder that a teenage girl who has processed the same content her entire life comes to the same conclusion?
Generative images of men aren’t any better. Even UX master and AI champion Jakob Nielsen got a rather slim version of himself when he tried generating a character for an image of a fireside chat. He’s a cake pop in a suit and tie, according to Leonardo.
If AI is cranking out extremely thin content, will AI then feed on that thin content and further skew our body image toward skinniness? Will images of humans shrink year after year? Will we disappear completely?
How I tried to be inclusive
How hard would it be, I wondered, to deliberately and humanly remove body size bias from my AI outputs?
Pretty hard, as it turns out.
My first stab at fixing this was to be more specific in my prompt and also to try some other generative AI tools. I specified an exact weight and I tried Stable Diffusion via DreamStudio, Leonardo, and also the new Human Generator by Generated Photos.
Asking for a 150-pound woman jogger got me this from DALL-E:
Being about 150 pounds myself, I can assure you that 50 of those pounds must be below her knees and out of the frame because they aren’t visible. Or she’s 13 feet tall.
And this from Stable Diffusion, which wasn’t bad for a typical body size. (Stable Diffusion also gave me lots of unusable images with missing fingers and weird earphones or earbuds.)
Leonardo wasn’t bad. The jogger was slim but not excessively so. Requesting a 150-pound jogger made no difference. Leonardo was the most photorealistic, although the earbuds gave it trouble. (Why are skinny women so easy to create and earbuds so hard?)
How it got much, much worse
The Human Generator looked to have potential. For one thing, you can control the body type, which ranges from very thin to overweight. But this is also where it got very specifically sad.
According to Human Generator, this is an average woman, which I think pretty much proves my thesis.
And here’s ‘very thin.’ And this is a serious problem looking us in the face. Why would any eCommerce or fashion website want this in their range of choices to replace a human model?
I requested a curvy woman in exercise attire, and got this, which was pretty darn normal looking, if I do say. (If you’re getting confused here, ‘average’ is really, really skinny and ‘curvy’ is normal.)
You can also specify a pose, the majority of which leaned toward the provocative. I could not specify that my person should be jogging. Human Generator only offers catalog-style backgrounds. I also couldn’t add earphones or earbuds, which is an element I needed to illustrate that the character was listening to an audiobook.
If I could combine Human Generator’s body size control (dropping ‘very thin’ and ‘thin’ and renaming ‘average’) with DALL-E’s flexibility with backgrounds and settings, I’d have an image that might not shame half the world population. But such a tool doesn’t exist yet.
I ended up going back to DALL-E. The image quality was the best and the color tones worked well with my client’s branding. And it’s totes free through Microsoft. I’ll poke around Leonardo more in the future.
I finally just added “in winter” to my DALL-E prompt so that the final image had a young, attractive black woman running in a down jacket, covering up, if not solving, all body image issues.
Which is kind of what happens in real life, too.