Beyond Access: Facebook’s Automated Image Descriptions and Disability Justice
Two weeks ago, Facebook launched Automatic Alternative Text, or AAT, a tool that provides automated image descriptions for blind users.
On the day the feature was released, the headlines were sensational, but Facebook’s accessibility team freely admits that AAT is in its infancy. The artificial intelligence algorithm it uses knows fewer than a hundred objects, and it only provides a description when it is 80 to 90% sure that it is correct. I am a blind Facebook user, and examples of image descriptions I have received so far include “Image may contain indoor,” “image may contain one person smiling,” and “image may contain hat.” As you can imagine, I am still just as unable to comment meaningfully on my friends’ photos as I was before this tool was released. Given this indisputable reality, why are most blind users so excited about this feature? And could examining the reasons behind this excitement reveal how image descriptions on Facebook are more than an access issue?
Matt King is a blind software engineer at Facebook who helped spearhead the development of AAT. In a comment on AppleVis, a popular forum for blind computer and iPhone users, he states that Facebook decided to go live with AAT now, as opposed to waiting until it became more sophisticated, based on data from blind users. Facebook’s research seemed to indicate that these users would find AAT valuable, even though the descriptions it generates are rudimentary. This finding is disconcerting, but not surprising.
In a 2011 blog post, Chris Hofstader, an assistive technology expert, bemoaned the fact that Google was being lauded by blind users for releasing a screenreader for Android, even though this screenreader did not include Web browser accessibility. Hofstader said that the release of this screenreader would be analogous to General Motors deciding to stop creating innovative cars and instead choosing to release new vehicles based on Henry Ford’s Model T. He coined the term Model T Syndrome to refer to blind people’s effusive displays of gratitude when “a multi-billion dollar company does anything that may even be of marginal value to our community.”
Many blind users’ reactions to AAT are a perfect example of Model T Syndrome. Rather than criticizing Facebook for providing descriptions that are stilted at best and unhelpful at worst, most blind people I know are grateful that Facebook thought of us at all. The fact that many of us are so grateful merely for being noticed speaks to the pervasiveness of ableism in our society, and how easy it is to internalize oppressive narratives: that we are not worthy, that we must accept what we are given, that we must not complain, that shoddy accessibility is better than no accessibility. Many of us probably internalize these narratives because we fear being accused of biting the hand that feeds us.
To understand how internalized ableism is at play here, it is helpful to unpack King’s explanation of why Facebook chose to go the AI route for generating image descriptions. According to King, Facebook considered several solutions for making images more accessible to the blind, but ultimately chose the AI approach because they didn’t want to “add a lot of friction.” King explains, “We could probably require people when they upload a photo: ‘please describe this for blind people.’ It would drive people nuts — that would never work at scale.” King’s use of the word “friction” is particularly telling. What he seems to be saying here is that while Facebook recognizes that blind people should have access to information about images, that access should not inconvenience sighted users. Rather than questioning the assumption that providing image descriptions is a burden and that blind people’s access needs are blind people’s problem, Facebook is reinforcing the ableist status quo.
Mia Mingus, a disability justice activist, says that providing accessible bathrooms and wheelchair ramps is not enough. In order to create a truly just world, we must challenge what she calls the myth of independence. We should instead view access as “collective and interdependent.” In other words, creating an accessible world is everyone’s responsibility.
As it is currently implemented, Facebook’s automated image description tool promotes independence, rather than interdependence. It sends the message, loud and clear, “Don’t bother writing a description of your new baby. Our AI has it covered.” In ten or twenty years, that might be the case, but not now. With existing technology, the only way to ensure full and meaningful access to images is to encourage sighted users to describe their photos. Perhaps, in time, Facebook’s AI will learn from these descriptions. I have reached out to Facebook several times, explaining the value of human-generated image descriptions, but have not received a substantial response.
I, along with several other like-minded blind users, urge Facebook to implement an approach to image description that is similar to Twitter’s, in that it offers a space for users to describe their photos. This approach would not only create a more inclusive Facebook, but it would also encourage us to imagine a world in which providing image descriptions is a pleasure, rather than a burden.