Cleaning up a baby peacock sullied by a non-information spill

Emily M. Bender
Aug 6, 2023

Last week, I wrote:

When OpenAI set up the easy interface to ChatGPT, when Meta briefly provided an interface to Galactica (misleadingly billed as a way to access scientific knowledge), when Microsoft and Google incorporated chatbots into their search interfaces, they created the equivalent of an oil spill into our information ecosystem.

And I’ve said in the past that the token efforts they make to clean up after themselves are like Exxon Mobil or BP bragging about how many poor birds they cleaned off.

Yesterday, I encountered a bird sullied by the current non-information spill. Specifically, while scrolling Facebook on my phone, I spotted this very appealing post:

The photo in the screenshot is of an adorable baby bird, with mostly brown features but some vivid blue ones that look like tiny versions of grown peacock tail feathers.
Screenshot of Facebook post from NATIONAL GEOGRAPHIC. I added the watermark (“Synthetic image Not real”) and redacted the name of a friend of mine who’d reacted to it.

“Adorable!” I thought, and clicked share. Two friends of mine very quickly pointed out that this is a synthetic (generated) image. (This is quite plain when looking at the screencap above on my laptop; it was less so when scrolling past it on my phone.) I thanked them and removed my share of the post.

I then wanted to find out what baby peacocks really look like. Here’s a screencap of the current Google Image search results:

Screencap of Google image search results for the search “baby peacock”, containing a grid of 14 images, all apparently photographs, all of baby birds, some brown and some with the vivid colors associated with peacocks.
Google Image search results captured on August 5, 2023. The first result is the same fake image. Several other images look similarly fake.

Clicking through on the Google Image search results, I see that the sites the images are from include: Twitter, Birdfact, The Hip Chick, BackYard Chickens, Dreamstime, Reddit, Wikimedia Commons, Alamy, Pinterest, Etsy, Adobe Stock, Shutterstock, Prompt Hunt, Mental Floss, Poultry Care Sunday, iStock, YouTube, AZ Animals, Flickr, and Freepik.

Some of those are obviously not going to help me find out what baby peacocks really look like: Prompt Hunt and Freepik are clearly trafficking in synthetic media (the label for the Freepik image is “Premium AI Image | Baby Peacock”). The stock photo sites are not confidence-inspiring in this case: I don’t trust them as sources of information about birds.

Birdfact is not a site that I’ve heard of, but it advertises itself as a site that helps people identify birds, stating “Our mission is to become the world’s most comprehensive resource about birds around the globe.” The Birdfact article that the Google Image result links to in fact discusses the fake image that started me down this path, which lends credibility to their presentation.

Wikimedia Commons is a site where I would expect good-faith information, but of course anyone could have posted something inaccurate there. On the other hand, the photo has metadata showing it was uploaded in 2018, well before the non-information spill started. Here is the Wikimedia Commons photo:

Photo of a baby peacock (peachick), all of whose feathers are different shades of brown.
Photo of peachick, CC BY 2.0 Rolf Dietrich Brecher, source: https://commons.wikimedia.org/wiki/File:Baby_Peacock_%2818131813108%29.jpg

Returning to the points I made in my previous blog post:

The reason I make the analogy to oil spills is that this isn’t just about the harms to the person who initially receives the information. There are systemic risks as well: the more polluted our information ecosystem becomes with synthetic text, the harder it will be to find trustworthy sources of information and the harder it will be to trust them when we’ve found them. Rich Felker makes this point well over on Mastodon in a thread on the importance of provenance, without which information is just words.

Even though this particular question (what does a peachick really look like?) is far from life-critical, in trying to track it down, I’ve experienced everything listed in that paragraph. Going through the Google Image search, I had to weed out several unreliable (and prominently returned) sources before I found ones that might be trustworthy, but then found myself asking: can I trust Wikimedia Commons in this case? Is the photo there old enough to be trustworthy?

In general, some level of skepticism of sources is valuable. Clearly, we shouldn’t believe everything we see on the Internet! But at the same time, we need to be able to build up trust in trustworthy sources over time, lest these exercises in fact-checking become too costly to sustain.

In the grand scheme of things, this particular experience (regarding peachicks) is just a minor inconvenience. But at the same time, I found it to be a vivid example of the effects of the synthetic media spill — and it’s frightening to think about similar experiences accumulating for many people across many topics.

To overcome this, we’ll need both individual-level efforts at information hygiene (I should not have shared the photo in the first place, and it’s good my friends alerted me to it) and communal efforts to bring about regulation: Synthetic media should be required to be watermarked at the source.

Postscript: It doesn’t help that the original misinformation appeared to have entered my Facebook feed via National Geographic, a source I would be inclined to trust on things to do with the natural world. Had National Geographic fallen for the hoax? In fact, no. The Facebook post that is circulating comes from a group called NATIONAL GEOGRAPHIC, with this info in its About panel:

Screencap reading: About it’s group for people who watch national geographic wild,and sharing somethings related to nat geo wild Public Anyone can see who’s in the group and what they post. Visible Anyone can find this group.
About panel from page for the group NATIONAL GEOGRAPHIC, screencap from August 5, 2023

Emily M. Bender

Professor, Linguistics, University of Washington // Faculty Director, Professional MS Program in Computational Linguistics (CLMS) // faculty.washington.edu/ebender