Late-Breaking Work highlights CMU researchers’ playful approach to human understanding of AI
Researchers at Carnegie Mellon University are engaging people with AI through an online computer vision game they created. The team is part of the Data Interaction Group in the Human-Computer Interaction Institute, led by Dr. Adam Perer.
Their research offers a new approach to showing humans how AI interprets images. Their paper, “Getting Playful with Explainable AI: Games with a Purpose to Improve Human Understanding of AI,” was selected as a Late-Breaking Work (LBW) to be presented at the ACM CHI Conference. An LBW paper is “a concise report of recent findings or other types of innovative or thought-provoking work relevant to the CHI community” (ACM CHI 2020).
Explainable AI seeks to improve human understanding of how models make decisions
Explainable AI (XAI) is increasingly important in bringing transparency to fields such as medicine and criminal justice, where AI informs high-consequence decisions. While many XAI techniques have been proposed, few have been evaluated beyond anecdotal evidence.
Complex AI systems are challenging because they cannot explain their own decision processes. Making models explainable is a prerequisite for building trust in and understanding of AI systems. XAI therefore requires human evaluation at scale, and Games with a Purpose (GWAPs) offer a way to carry it out.
GWAPs provide a playground to gather data at scale that can benefit XAI
Our research specifically looks at the application of GWAPs to explainable AI. GWAPs have been shown to be highly effective at collecting and verifying large data sets generated through online gameplay. They are especially well suited to human computation problems such as generating descriptive keywords and labeling information.
One example is the original ESP Game, developed at CMU. Built on the idea that humans are good at recognition where computers struggle, it used gameplay to get players to provide descriptive words for images.
Other examples include Foldit, a GWAP that popularized crowdsourcing for biophysics research; TagATune, which collects descriptive keywords for music labeling; and TileAttack, which gathers annotations for NLP labeling.
Scope: GWAP for describing feature visualizations
Through GWAPs for XAI, the research team is interested in learning how players interpret feature visualizations produced by AI.
Feature visualizations are a popular technique for explaining image recognition models. They expose clues to how neural networks may behave, and image recognition was identified as a relevant area where the quality of explanations could be assessed.
Excerpts from CHI Paper: Details of the Game
Learned Feature Visualizations
Feature visualizations make a neural network’s interpretation of images visible. Over the course of multiple layers, the network builds abstractions from detected edges, textures, patterns, and parts of an image. We computed feature visualizations from images of animals and objects using the Lucid library.
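Feature visualizations of this kind are typically produced by activation maximization: starting from noise and ascending the gradient of a chosen unit’s activation. The toy sketch below illustrates the idea on a single linear layer; it is a hypothetical stand-in for a deep-network neuron, not the paper’s actual Lucid pipeline, which operates on full convolutional networks.

```python
import numpy as np

# Toy activation maximization: find an input that maximally activates
# one unit of a single linear layer. This is an illustrative stand-in
# for a deep-network neuron; the paper's visualizations were produced
# with the Lucid library on real convolutional networks.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))   # 4 units, 8-dimensional inputs
unit = 2                      # the unit we want to "visualize"

x = rng.normal(size=8)        # start from random noise
x /= np.linalg.norm(x)
lr = 0.1
for _ in range(200):
    grad = W[unit]            # d(activation)/dx for a linear unit
    x = x + lr * grad         # gradient ascent on the activation
    x /= np.linalg.norm(x)    # keep the "image" bounded

# The optimized input aligns with the unit's weight vector, so its
# activation approaches the weight norm.
activation = W[unit] @ x
```

For a linear unit the optimum is simply the weight direction; in a deep network the same gradient-ascent loop surfaces the edges, textures, and patterns the text describes.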
One player, the “explainer,” is given a source image. The two other players are “guessers,” who compete against each other to guess what the feature visualizations represent. The explainer selects the top explanations that they believe will lead the guessers to the correct answer (e.g., parrot) as quickly as possible.
Points as Incentives
Guessers receive one visual explanation to start, with a new explanation revealed after a set amount of time. The quicker a guesser identifies the correct answer, the more points the guesser and explainer gain. If the guessers are unable to guess after the reveals, they receive a text hint.
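The paper excerpted here does not spell out an exact formula, but the incentive structure can be sketched as a hypothetical scoring function in which points decay with elapsed time and with each additional explanation revealed. The function name and parameter values below are illustrative assumptions, not taken from the paper.

```python
def round_points(seconds_elapsed, reveals_used,
                 max_points=100, reveal_penalty=20, time_penalty=1):
    """Hypothetical scoring sketch: a correct guess earns more points
    (for both guesser and explainer) the faster it comes and the fewer
    visual explanations have been revealed. All constants are
    illustrative, not from the paper."""
    points = (max_points
              - reveal_penalty * (reveals_used - 1)
              - time_penalty * seconds_elapsed)
    return max(points, 0)

# A fast guess on the first visualization beats a slow guess on the third.
early = round_points(seconds_elapsed=5, reveals_used=1)    # 95 points
late = round_points(seconds_elapsed=40, reveals_used=3)    # 20 points
```

Any monotonically decreasing function of time and reveals would create the same pressure: guess early, and help the explainer by guessing early.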
The game uses a transactive design: all guesses are visible to the explainer and to both guessers, letting each guesser build on their own and the other player’s guesses. Providing accurate data is the ideal way to play, because both explainers and guessers aim to reach a correct guess as quickly as possible. One hypothesis is that guessers are motivated not only by guessing quickly but by guessing first, since only one of the two guessers receives points. Early pilot testing suggested that players find reaching an agreement quickly to be satisfying and valuable.
The game generates relevant data for explainable AI in two ways. First, the explainer is given ten visual explanations to select from, of varying explanatory quality as judged by our algorithms, and the explainer is only allowed to select four of the ten. Explainers also dictate the order in which the images are shown to guessers. Of the four visualizations they select, they are instructed to select the most helpful explanation first. Second, the guessers type guesses conveying how they interpret the visualizations and give information about which image(s) help them guess correctly. Guessers provide copious and timely data: they can guess as often as they want within a time limit.
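The XAI-relevant data a single round yields can be sketched as a hypothetical record; the class and field names below are illustrative assumptions, not the paper’s actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class RoundLog:
    """Hypothetical record of the XAI-relevant data one round produces.
    Class and field names are illustrative, not from the paper."""
    answer: str                                   # the source image's label
    shown: list = field(default_factory=list)     # 4 of 10 visualization ids, in explainer's order
    guesses: list = field(default_factory=list)   # (guesser, seconds_elapsed, text) tuples
    helpful: list = field(default_factory=list)   # ids guessers said helped them guess correctly

# One round: the explainer ordered four visualizations, most helpful
# first; guessers typed guesses until one converged on the answer.
log = RoundLog(answer="parrot", shown=[7, 2, 9, 4])
log.guesses += [("g1", 6.2, "bird"), ("g2", 9.8, "parrot")]
log.helpful.append(7)
```

Aggregated over many rounds, records like this expose which visualizations explainers rank highest and which ones actually lead guessers to correct answers.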
Versions of the game capitalize on opportunities for divergence and transactivity by controlling whether players can see each other’s guesses. If the explainer can see all guesses but players can see only their own, a divergence-supporting system is tested; if players can see each other’s guesses and are incentivized to build on each other’s contributions, a transactive-supporting system is tested.
Playtesting and Next Steps
To begin answering its research questions, the team is collecting both quantitative web log data while users play and qualitative data when users finish individual games and rounds of play.
Since the LBW submission, the game has expanded to include new image categories, such as musical instruments, plants, sports equipment, and transportation, that players can select as part of the game.
Keep an eye out for our upcoming game version “Eye into AI.”
Interested in playtesting the game? We hope to launch a public version in the coming weeks, but you can get a sneak preview by filling out the form below:
Read the Full Paper
Laura Beth Fulton, Ja Young Lee, Qian Wang, Zhendong Yuan, Jessica Hammer, and Adam Perer. “Getting Playful with Explainable AI: Games with a Purpose to Improve Human Understanding of AI.” CHI ’20 Extended Abstracts, April 25–30, 2020, Honolulu, HI, USA.