By Leah Burrows, Harvard SEAS Office of Communications
When a human and an artificial intelligence system work together, who rubs off on whom? It’s long been thought that the more AI interacts with and learns from humans, the more human-like those systems become. But what if the reverse is happening? What if some AI systems are making humans more machine-like?
In a recent paper, researchers from the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) explored how predictive text systems — the programs on our phones and computers that suggest words or phrases in our text messages and email — change how we write. The researchers found that when people use these systems, their writing becomes more succinct, more predictable, and less colorful (literally).
The paper was accepted for presentation at the Association for Computing Machinery’s Intelligent User Interfaces (ACM IUI) conference.
“We’ve known for a while that these systems change how we write, in terms of speed and accuracy, but relatively little was known about how these systems change what we write,” said Kenneth Arnold, a Ph.D. candidate at SEAS and first author of the paper.
Arnold, with co-authors Krysta Chauncey, of Charles River Analytics, and Krzysztof Gajos, the Gordon McKay Professor of Computer Science at SEAS, ran experiments asking participants to write descriptive captions for photographs. The study compared three scenarios:
· Captions written with the suggestions hidden
· Captions written with programs that always suggested three next words, as iOS and Android smartphones currently do
· Captions written with programs that made suggestions only when the system had high confidence in the next word
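The three conditions above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the probability table, the `suggest` function, the condition names, and the threshold values are all hypothetical, and the rule used for "high confidence" (show suggestions only when the most likely next word clears a probability threshold) is an assumption.

```python
from typing import Dict, List

# Hypothetical next-word probabilities after the prefix "a".
# These numbers are invented for illustration only.
NEXT_WORD_PROBS: Dict[str, float] = {
    "man": 0.40,
    "train": 0.25,
    "baseball": 0.15,
    "brown": 0.12,
    "small": 0.08,
}


def top_suggestions(probs: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k most probable next words, most probable first."""
    return sorted(probs, key=lambda w: probs[w], reverse=True)[:k]


def suggest(probs: Dict[str, float], condition: str,
            threshold: float = 0.5) -> List[str]:
    """Return the words shown to the writer under one study condition.

    condition is one of:
      "hidden"    - suggestions are never shown
      "always"    - the top three words are always shown
      "confident" - the top three are shown only if the best word's
                    probability reaches the threshold (assumed rule)
    """
    if condition == "hidden":
        return []
    ranked = top_suggestions(probs)
    if condition == "always":
        return ranked
    if condition == "confident":
        return ranked if probs[ranked[0]] >= threshold else []
    raise ValueError(f"unknown condition: {condition}")


for name in ("hidden", "always", "confident"):
    print(name, suggest(NEXT_WORD_PROBS, name))
```

With these made-up numbers, the "always" condition offers a common noun such as "man" right after the article, while the "confident" condition stays silent because no single word is likely enough.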
When they compared the captions, the researchers found that captions written with suggestions were shorter and included fewer unexpected words than captions written without suggestions. Specifically, captions written with suggestions contained fewer adjectives, among them color words like “green” or “blue”.
For example, one photograph showed a baseball player swinging a bat. Rather than suggesting “baseball player” or “hitter”, the predictive program suggested the much more common word, “man”. Here is a caption for that image written with suggestions shown: a man is swinging his bat at a baseball game
Here is a caption for the same picture, written with suggestions hidden: a baseball player wearing number eight swings a bat with people watching from the dugout
Here is a caption for another picture, written when predictive suggestions were shown: A train pulling into a quaint train station.
Here is a caption for the same picture, written with suggestions hidden: An old brown train pulling away from a small train station by a baby blue building.
Even unprompted, the writer might have written the word “train”, but likely only after descriptive adjectives like “brown” or “small”. By suggesting nouns like “train” immediately after the article, rather than adjectives, the AI system often led writers to omit those adjectives altogether. The researchers dubbed this “skip nudging”.
In a further example, with suggestions hidden, one participant captioned an image as: numerous kids and accompanying grownups fly colorful animal-shaped kites at the beach.
A different participant was shown predictive suggestions and captioned the image as: people standing on a beach flying colorful kites.
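The differences in the example captions can be tallied with a short script. The sketch below counts words and color words in the two train captions quoted above. It is only an illustration, not the measure used in the paper: the `caption_stats` helper and the `COLOR_WORDS` list are hypothetical.

```python
import re
from typing import Dict

# A small, hand-picked list of color words (an assumption for this sketch).
COLOR_WORDS = {
    "red", "orange", "yellow", "green", "blue", "purple",
    "brown", "black", "white", "pink", "gray", "grey",
}


def caption_stats(caption: str) -> Dict[str, int]:
    """Count the words and the color words in a caption.

    Hyphenated terms such as "animal-shaped" count as one word.
    """
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", caption.lower())
    color_count = sum(1 for word in words if word in COLOR_WORDS)
    return {"words": len(words), "color_words": color_count}


shown = "A train pulling into a quaint train station."
hidden = ("An old brown train pulling away from a small train station "
          "by a baby blue building.")

print(caption_stats(shown))   # {'words': 8, 'color_words': 0}
print(caption_stats(hidden))  # {'words': 16, 'color_words': 2}
```

For this pair, the caption written with suggestions shown is half as long as the one written with suggestions hidden and contains no color words.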
“In short, predictive text suggestions — even when presented as single words — are taken as suggestions of what to write,” said Arnold. “While, for the most part, people wrote more efficiently with predictive text systems, this may have come at the cost of thoughtfulness. These kinds of effects would never have been noticed by traditional ways of evaluating text entry systems, which treat people like transcribing machines and ignore human thoughtfulness. Designers need to evaluate the systems that they make in a way that treats users more like whole people.”
“This research provides compelling evidence that intelligent systems designed to improve the efficiency of human work frequently also impact the content of human work, and they do so in unanticipated ways,” said Gajos. “We had missed this before because, as a research community, we tend to evaluate novel technologies in simplified settings that do not always reflect the real use. Ken’s work demonstrates the importance of testing these innovations with real people doing real work.”
This story was written by Leah Burrows for Harvard SEAS News about the following paper we have recently published:
Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. Predictive Text Encourages Predictable Writing. In Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20, pages 128–138, New York, NY, USA, 2020. Association for Computing Machinery.
Several other papers are also relevant to the story:
Zana Buçinca, Phoebe Lin, Krzysztof Z. Gajos, and Elena L. Glassman. Proxy Tasks and Subjective Measures Can Be Misleading in Evaluating Explainable AI Systems. In Proceedings of the 25th International Conference on Intelligent User Interfaces, IUI ’20, pages 454–464, New York, NY, USA, 2020. Association for Computing Machinery.
Kenneth Arnold, Krysta Chauncey, and Krzysztof Gajos. Sentiment Bias in Predictive Text Recommendations Results in Biased Writing. In Proceedings of Graphics Interface 2018, GI 2018, pages 33–40. Canadian Human-Computer Communications Society / Societe canadienne du dialogue humain-machine, 2018.
Kenneth C. Arnold, Krzysztof Z. Gajos, and Adam T. Kalai. On Suggesting Phrases vs. Predicting Words for Mobile Text Composition. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, UIST ’16, pages 603–608, New York, NY, USA, 2016. ACM.