New Editors & Good Reads in Human-Centered AI

by Justin Weisz (IBM Research AI, US), Werner Geyer (IBM Research AI, US), Elizabeth Watkins (Intel Labs, US), Daniel Buschek (University of Bayreuth, Germany), Qiaosi (Chelsea) Wang (Georgia Tech, US), and Alan Said (University of Gothenburg, Sweden)

11 min read · May 9, 2024

Image credit: Dreamstudio.ai

We launched this Medium publication on human-centered AI roughly one year ago. In that time, we’ve published a number of interesting and thought-provoking pieces that capture the state of the field, from a long-standing debate on anthropomorphism in AI systems to different ways of interacting with large language models to coaching AI technology teams to operate in a more human-centered fashion.

We are excited to begin our second year and publish more pieces that distill, focus, and challenge the discourse in human-centered AI. We would also like to announce some changes to our editorial board.

First, several of our founding editors will be rolling off the board. We thank them tremendously for their service and contributions in helping get this publication off the ground.

Justin Weisz (IBM Research AI, US) and Werner Geyer (IBM Research AI, US) will remain on the board, and they will be joined by Elizabeth Watkins (Intel Labs, US), Daniel Buschek (University of Bayreuth, Germany), Qiaosi (Chelsea) Wang (Georgia Tech, US), and Alan Said (University of Gothenburg, Sweden).

To introduce our new (and existing!) editors, we asked them to write a short description of themselves and their research interests in human-centered AI. We also asked them to pick out one of their favorite, recently-read works — a paper, a book, an article, or anything else — and explain why they enjoyed it. We hope this provides an inspiring collection of “good reads” for topics in human-centered AI!

Daniel Buschek

Hi, I’m a professor of Intelligent User Interfaces at the University of Bayreuth, Germany. My group explores human‑AI interaction and its impact on people and digital work. Our goal is to empower people in their creative digital work and shape the future of AI tools in a human‑centered way. This goal includes designing AI tools that facilitate thinking instead of replacing it; building user interfaces that make AI work for a diverse range of people; and critically revealing the patterns & effects of working with AI. In particular, our recent work focuses on writing and working with text documents. You can find it on our group page here.

In that light, I’d like to recommend Track Changes: A Literary History of Word Processing by Matthew G. Kirschenbaum. It covers the history of word processors and is full of anecdotes about how (famous) writers reacted to the emergence of digital word processing. From an HCI point of view, reading it feels a bit like “doing user research in a book,” with people you would otherwise not reach. I also found it super interesting to spot the parallels to today’s reactions and discussions about new ways of writing (with AI).

Considering recent papers, I really like Gu et al.’s paper on text rendering, which will be presented at CHI 2024. You can immediately appreciate the results of their method when looking at the first page of the paper. I can see this technique being applied in various user interfaces and interactive systems for working with text.

Werner Geyer

Hi, I’m a Principal Research Scientist, Global Strategy Lead for Human-Centered AI, and innovation leader at IBM Research. I am an ACM Distinguished Scientist. I work with an interdisciplinary team of HCI, ML, and visualization researchers, software engineers, and designers to explore human-AI interaction. Most recently, my work focuses on generative AI in the context of Human-Centered AI, addressing two key questions:

1. How can we design & create novel user experiences that lead to effective human-AI interactions, given the intent-based nature of generative AI models?

2. How can we create trustworthy and safe AI experiences given the new harms and risks that come with generative AI systems?

As large language models and generative AI systems increasingly become a commodity, the user experience of the applications we build with them will, more than ever, be a key factor in whether those systems lead to positive synergies between humans and AI. I am a strong believer that it takes both algorithmic and UX breakthroughs for technologies to be successful, and much of my research explores end-to-end experiences. I have also worked on AI-infused collaboration, social computing, and recommender systems in the past and am very excited about tech for personal productivity. You can read more about me on wernergeyer.com.

The latest trend in generative AI is agentic workflows. Andrew Ng provides a good overview of agentic design patterns in five letters he recently wrote in The Batch. If you haven’t read them, I highly recommend taking a look; they provide a nice, short, high-level intro to the topic. The reason I’m mentioning the patterns here is that reflection is listed as one of them: a pattern in which LLMs examine their own work and come up with ways to improve it. This is related to a topic I am working on right now, LLM-as-a-Judge. One paper in that space presented at CHI 2024 is worth looking into: EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria. It provides a great HCI perspective on the area, with lots of background info and related work. The authors propose a cool idea for an interactive system that refines prompts based on user-defined criteria for an LLM judge, and they conduct both formative and evaluative user studies. The paper also makes a good case for the importance of user-defined criteria over general benchmarks, as evaluations are often very use-case- and context-specific.
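
To make the reflection and LLM-as-a-judge ideas concrete, here is a minimal sketch in Python. The call_llm function, the pass/fail convention, and the example criteria are hypothetical placeholders of my own; this is not the EvalLM system or Ng’s implementation, just the general shape of a draft-critique-revise loop.

```python
# A minimal sketch of the "reflection" agentic pattern combined with an
# LLM-as-a-judge step. `call_llm` is a hypothetical placeholder for whatever
# model API you use; the stopping check and criteria are illustrative only.

def call_llm(prompt: str) -> str:
    """Stand-in for a call to your LLM of choice (not a real API)."""
    raise NotImplementedError

def reflect_and_revise(task: str, criteria: list[str], max_rounds: int = 3) -> str:
    # Initial draft from the model.
    draft = call_llm(f"Complete the following task:\n{task}")
    for _ in range(max_rounds):
        # Judge step: critique the draft against user-defined criteria.
        critique = call_llm(
            "For each criterion, critique this response and state PASS or FAIL.\n"
            f"Criteria: {criteria}\nResponse:\n{draft}"
        )
        if "FAIL" not in critique:
            break  # the judge reports that all criteria passed
        # Reflection step: revise the draft using the critique.
        draft = call_llm(
            f"Revise the response to address this critique.\n"
            f"Critique:\n{critique}\nResponse:\n{draft}"
        )
    return draft

# Hypothetical usage with illustrative, user-defined criteria:
# reflect_and_revise("Summarize this abstract for a general audience",
#                    ["avoids jargon", "is under 100 words"])
```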

Alan Said

Hi! I am an Associate Professor of Computer Science at the University of Gothenburg. My work has mostly been in the recommender systems and user modeling spaces, but I’m always thinking of users and their experience with recommendation and personalization systems. My most recent work focuses on explaining recommendations using LLMs and on how trust is established in personalization systems. I have been working in the wider space of HCAI in recent years and recently developed the MSc program in HCAI at the University of Gothenburg (our first cohort of students enrolled in 2023! 🙂). I am also an occasional podcaster and co-host of the HCAI podcast.

Recent works that I’ve enjoyed reading include Nauta et al.’s CVPR 2023 paper on human-understandable global explanations for image classification, which help make complex models interpretable, and Mujlwijk et al.’s IUI 2024 paper on human-AI interaction for expert users, which showcases how interaction with AI can help domain experts make better decisions. Both papers capture why a human perspective on AI is important and beneficial when working with intelligent systems.

Qiaosi (Chelsea) Wang

Hi, I’m Chelsea. I am a final-year Ph.D. candidate working with Dr. Ashok K. Goel in Georgia Tech’s Human-Centered Computing program. I am broadly interested in human-AI interaction, responsible AI, and cognitive science. My work so far has focused on exploring human perspectives on advanced AI systems that play diverse social roles (e.g., teaching assistants or matchmakers) in our daily lives. In my Ph.D. work, I explore human-AI interaction through the lens of “Mutual Theory of Mind,” in which both humans and AIs continuously interpret each other’s behaviors to make inferences about each other’s internal states throughout interactions. I examine questions such as: how can AI automatically construct a user’s perceptions of AI, what is the impact of AI misinterpreting a user’s characteristics, and how can we integrate human perceptions of AI into the responsible design and development of AI systems? I am very excited to present some of my recent work on this at our own CHI 2024 workshop on Theory of Mind in Human-AI Interaction!

As a final-year Ph.D. candidate, I have been thinking and working hard on my Ph.D. thesis on Mutual Theory of Mind, so I have read a lot of papers on Theory of Mind and Human-AI Interaction 🙂. One paper that is close to my heart (and my thesis 😛) is “Conveying Intention by Motions with Awareness of Information Asymmetry” by Yosuke Fukuchi and others. When I look at human-AI communication through the lens of Mutual Theory of Mind, I believe most of what we (human-centered AI researchers) do is either enhancing humans’ understanding of AI systems by manipulating or designing the AI’s characteristics and outputs (e.g., explainable AI), or building techniques for AIs to better detect and model human characteristics to understand what’s on humans’ minds. I think this paper does a wonderful job of articulating and positioning these ideas as an information asymmetry issue: a gap between humans’ and AI’s beliefs about each other, based on their different observations of each other. The authors point out that it is important for the AI “to choose actions with awareness of how its behavior will be considered by humans,” and they do this using a Bayesian public self-awareness model that allows the AI to generate intention-conveying motions while considering information asymmetry. This is a great paper, full of thought-provoking theoretical, design, and technical implications for all human-AI interaction researchers!

Elizabeth Anne Watkins

Hi everybody! I’m Elizabeth, and I’m thrilled to join the editorial board of this publication! In my work I wear two hats. First, I’m a Research Scientist in the Social Science of AI at Intel Labs, where I work within the Intelligent Systems Research group on the Socio-Technical Systems team. Second, I serve on the Responsible AI Council.

As a Research Scientist, I work in cross-functional system-development teams with engineers, AI scientists, and UX designers. I use empirical social-science tools to learn more about the human side of trust, transparency, and explainability, both cognitively and behaviorally. I examine questions such as: how do people arrive at trust decisions, how do they decide what AI systems are really for and for whom, and what practices do they create to fit AI systems into their daily lives? I’m delighted to be a lead organizer of the CHI 2024 workshop on Human-Centered Explainable AI, where I’ll also present a paper on a participatory AI project we’re building at Intel Labs.

I also serve a governance function on the Responsible AI Council, a multidisciplinary group that provides recommendations for AI development projects across all of Intel. In this role, I help operationalize HCAI and socio-technical research into ethics assessments and recommendations. Ultimately, this approach helps ensure that systems align with our Responsible AI principles, including enabling human oversight, transparency and explainability, and sustainability. Recently, we have been focusing on building ethical guidelines for generative AI, to ensure that these powerful systems are built into tools that people can understand, that people can make informed decisions about whether those tools merit their trust and fit with their values and goals, and that, if so, they can use these tools productively in their workplaces, homes, and schools.

My favorite recent paper is from Sunnie Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, and Jennifer Wortman Vaughan: “‘I’m Not Sure, But…’: Examining the Impact of Large Language Models’ Uncertainty Expression on User Reliance and Trust,” coming up at FAccT 2024 in Brazil. The authors argue that uncertainty, a characteristic of AI outputs commonly associated with explainability, can be communicated in LLM-based interfaces via natural language expressions. Fascinatingly, one of their findings concerns the impact of anthropomorphic tone in these expressions: in a search-engine-related task, the first-person uncertainty expression “I’m not sure…” was more successful at preventing inappropriate overreliance on AI-generated predictions than the more generalized “It’s not clear…”. These findings are in conversation with prior entries on this very blog about anthropomorphization. A fantastic read for anyone working at the intersection of humans and AI!

Justin Weisz

👋 Hi, I’m Justin. I’m a Senior Research Scientist at IBM Research, where I help drive our human-centered AI strategy. I’m also the Editor in Chief of this publication, which grew out of a number of conversations I had with Werner Geyer about how we can build a stronger community around human-centered AI across academia and industry. In my research, I focus on understanding how to design ways for people to work effectively and safely with AI systems (and most recently, generative AI systems). It’s clear that you can’t just “throw AI” at users and expect them to be productive with it — we’ve seen time and again that sometimes AI leads to synergistic outcomes, and sometimes it doesn’t (Camparo et al. 2022 provide a nice survey on human-AI synergy, and Weisz et al. 2022 find individual differences in how effectively people work with AI in a code translation scenario). My current hypothesis is that, if people and AI systems had a better understanding of each other — their knowledge, skills, capabilities, and goals — it would lead to more synergistic outcomes. This idea has been referred to as mutual theory of mind (MToM), and it’s an area I’m really excited about. I’m also honored to be part of the team that is hosting the first Workshop on Theory of Mind in Human-AI Interaction at CHI 2024!

My favorite recently-read work has been The Alignment Problem by Brian Christian. I’m not a huge fan of the word “alignment” to describe AI systems that don’t perform according to human expectations or desires (and apparently, I’m in good company). For example, when my son doesn’t behave in a way I expect or desire, I don’t say his behavior is “misaligned”; I say that he is “misbehaving” or that his behavior is “unacceptable.” Nonetheless, this book clearly articulates why it’s hard for us to get AI systems to behave in ways we want them to. ML researchers often talk about an “objective function” (or a “reward function” in RL parlance), and this book gives plenty of examples of how optimizing for a well-intentioned reward function can go wrong. My favorites were about how RL algorithms are often tuned to maximize the score in a video game; this approach works well for games where score is a strong indicator of progress, but not for others where the score doesn’t tell you much about how much progress you’ve made. And in some games, such as Montezuma’s Revenge, score rewards are sparse and players only gain points when they perfectly execute a complex behavior. In these cases, reward functions that favor encountering novel situations, such as exploring new areas of the map, work a lot better.
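
To make the sparse-reward point concrete, here is a toy sketch of a count-based novelty bonus. It is my own illustration rather than an example from the book; the state keys and bonus scale are made-up placeholders.

```python
# A toy count-based novelty bonus: when the game score is sparse, adding a
# small reward for visiting rarely-seen states can drive exploration.
from collections import defaultdict
import math

visit_counts = defaultdict(int)

def shaped_reward(state_key: str, score_delta: float, bonus_scale: float = 0.1) -> float:
    """Combine the sparse extrinsic reward (change in game score) with an
    intrinsic novelty bonus that shrinks each time the state is revisited."""
    visit_counts[state_key] += 1
    novelty_bonus = bonus_scale / math.sqrt(visit_counts[state_key])
    return score_delta + novelty_bonus

# The first visit to a new room yields a small positive reward even though
# the game score did not change; repeat visits earn progressively less.
print(shaped_reward("room_3_x12_y7", score_delta=0.0))  # 0.1
print(shaped_reward("room_3_x12_y7", score_delta=0.0))  # ~0.071
```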

This book makes it clear just how hard it is for people to specify what they really want in a way that is precise enough for AI models. But this is a really important problem now that the primary mode with which we interact with computing technologies is shifting away from issuing commands and toward specifying desired outcomes. So I think this is a great book for HCAI researchers and practitioners to read!

Get Involved

Looking for ways to get involved in the human-centered AI research community? We recommend:
