The Psychology of ChatGPT

Alexandra Jayeun Lee
Published in UXR @ Microsoft
Jul 18, 2023

Not only has OpenAI's ChatGPT (a chat-based Generative Pre-trained Transformer) ushered in a new era of user interface, its adoption rate has surprised everyone. As Large Language Models spread into spaces previously untouched by automation, there is still much to learn about how to fold digital assistance into the existing ecosystem of products and services we use in ways that enhance, rather than degrade, the quality of life of those who use them. Personally, I think our job as product researchers has never been more important, though of course there is no shortage of articles (like here, here, and here) that collectively paint a gloomy picture of what's ahead for humanity. Let's start with the good news: the researcher's job has become even more valuable than before. A recent article from the WSJ is a case in point: asking questions is already a significant part of a product researcher's day job.

Now, are you ready for the bad news? As of this writing, we still have a ways to go before achieving Artificial General Intelligence (AGI for short), though depending on who you talk to, one could argue that Moore's Law is all but dead, and that the idea of endlessly doubling compute power in the semiconductor sector (which was all about cramming as many transistors as possible onto a single chip) now applies instead to the vast amount of digital information. Hot takes on synthetic users, the potential harms of LLMs, and even the recent testimony from OpenAI's CEO, in which he conceded that this technology needs federal regulation, can make the collective responsibility of those of us charged with bringing these capabilities to the world seem insurmountable. But we can address it head on, one bias at a time.

Microsoft has had a front-row seat to integrating large language models (LLMs) into its ecosystem of products and services for years, alongside other large IT conglomerates with dedicated communities of data scientists and engineers to learn from, well before the announcement of its multi-billion dollar partnership with OpenAI last November. Since then, there have been many opportunities to learn directly from our internal community of technocrats, and no shortage of interest in the technology's capabilities and in basic 'prompt engineering'. But those sessions always felt like something was missing, which led me to conduct my own investigation and put together what I've learned to date in an interactive workshop for our internal community of interaction designers and researchers. The format of the workshop was inspired by the way Microsoft's VP of Design and Artificial Intelligence, John Maeda, teaches complex concepts with ideas like 'the ketchup bottle moment' and 'cooking with Semantic Kernel'. So, without further ado, let's cook!

Prompt engineering recipe for researchers poster (Lee, 2023)

What is the first step in cooking once you know what you want to make? Gathering all of the ingredients, of course. I am calling this part the Dirty Dozen, the twelve biases of ChatGPT that product researchers (but really anyone looking to get better results out of it) need to be aware of.

1. ChatGPT is Politically Left Leaning

Image source: D. Rozado, "The Political Biases of ChatGPT", Social Sciences (2076–0760), 2023

In a recent paper exploring the algorithmic biases of Large Language Models (LLMs), David Rozado (2023) administered 15 political orientation tests to ChatGPT and found that 14 of the 15 diagnosed left-leaning political viewpoints, despite the system's claims of political neutrality. As Rozado warns, AI systems like ChatGPT that "claim political neutrality and factual accuracy while displaying political biases on largely normative questions have potential for shaping human perceptions and thereby exerting societal control". This presents a challenge for users seeking more balanced output.

2. ChatGPT is not AI

ChatGPT and other interactive AI tools built on LLMs work by generating a block of text, using a closed-box LLM to map as closely as possible to the information the user has requested.

A simple diagram of how ChatGPT works (Lee, 2023)

The simplest way of understanding how ChatGPT works, other than by asking ChatGPT yourself, is to think of it as a pattern synthesis engine built on the premise that training the model on a very large data set increases the likelihood of correctly predicting what the user is asking for. We also need to acknowledge that in its current form, ChatGPT doesn't just hallucinate sometimes, it hallucinates by design, which brings me to the next point.
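Before moving on, here is a toy sketch of what "pattern synthesis" means in practice. It is a drastic oversimplification (real LLMs operate over tokens and billions of learned parameters, not word counts), and the tiny corpus and function names are hypothetical, but it shows how plausible-looking text can emerge from nothing more than statistics.

```python
from collections import Counter, defaultdict

# Toy "pattern synthesis": learn which word tends to follow which,
# then generate text by always picking the most likely next word.
# (Hypothetical corpus; real LLMs learn from billions of tokens.)
corpus = "the cat sat on the mat the cat ate the fish".split()

next_word = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word[current][following] += 1

def generate(start, length=5):
    word, output = start, [start]
    for _ in range(length):
        if word not in next_word:
            break
        # Most statistically likely continuation; no understanding involved.
        word = next_word[word].most_common(1)[0][0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # "the cat sat on the cat", fluent-looking but meaning-free
```

Everything such a system produces is the most plausible next thing, which is exactly why its output can read confidently while being wrong.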

3. ChatGPT Hallucinates

Source: OpenAI Blog, November 2022

One common tendency of current LLMs, in mimicking human dialogue, is to produce excessively verbose responses: longer answers can look more comprehensive at first glance, much like someone who doesn't know what they're talking about trying to waffle their way out of it. There are also plenty of anecdotes of users spotting inaccurate information about people, places, or facts buried within content that seems directionally correct and proficient.

4. ChatGPT Doesn’t Know User’s Intent

On top of that, because its job is to recognize and produce patterns, it does not have the ability to ask deeper questions of the user beyond the surface level. This is one of the reasons why thoughtful engineering of prompts matters, and why we are seeing 'prompt engineering' emerge as a skillset.
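Because the model will not probe for your intent, the prompt has to carry it. The sketch below contrasts a vague request with one that states the role, audience, and constraints, using the chat message format popularized by the OpenAI API; the study details are made up for illustration.

```python
# A vague prompt leaves the model to guess the intent, audience, and format.
vague = [
    {"role": "user", "content": "Write interview questions."},
]

# A more engineered prompt supplies what the model cannot ask us for.
specific = [
    {"role": "system", "content": "You help a UX researcher plan a usability study."},
    {
        "role": "user",
        "content": (
            "Draft five open-ended interview questions for first-time users of a "
            "grocery delivery app. Avoid leading questions and yes/no phrasing, "
            "and order them from broad context to specific tasks."
        ),
    },
]
```

In effect, the second version does the deeper questioning on the model's behalf.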

So you might extend an invitation to ChatGPT to have a drink or two, but you wouldn’t necessarily want to bring it into your therapist’s office. If you’re seeking a chatbot-like therapy experience, I suggest giving Woebot a try.

5. ChatGPT is Anthropomorphized

We like to think of ChatGPT as a reasoning engine, but it will still be some time before we reach the Artificial General Intelligence moment so often romanticized in sci-fi books and movies. Again, ChatGPT is not AI, technically speaking.

“Intelligence is a misnomer”, said Em Ivers, a product designer at Microsoft who specializes in chatbots. “While it can seem intelligent and process inputs in a way that seems similar to human reasoning, there isn’t a ‘thinking’ entity behind it. Our own real brains are wired to presume that language is attached to another person, but that’s anthropomorphizing: a gap between expectation and reality, where we attribute qualities to an object that it doesn’t have.”

If you’ve ever found yourself in possession of a bag of peel-and-stick googly eyes and walked around your neighborhood sticking them onto random things like fire hydrants, tree leaves, and your neighbor’s trash cans, you know you’ve been anthropomorphizing.

Photo by Sarah Mutter on Unsplash

6. ChatGPT has Selection Bias

It is very important for us to engineer inclusion into engines like ChatGPT rather than accept their output at face value. ChatGPT is designed to produce the most likely patterns in its output, which, as we've already established, may not be factual, and it doesn't stop itself from making up citations that don't exist. By design, it mimics human reasoning by learning common patterns. Inclusive growth takes on a whole new meaning when you realize the bias in the source data, which likely reinforces and perpetuates the perspectives of W.E.I.R.D. (Western, Educated, Industrialized, Rich, Democratic) data.

Image Source: xkcd comics https://xkcd.com/2618/

Did I mention that ChatGPT has no intelligence?

7. ChatGPT Suffers from the Dunning-Kruger Effect

The following illustration shows how the OpenAI team explains the process of training the model via Reinforcement Learning from Human Feedback (RLHF) and Proximal Policy Optimization (PPO), which 'reward' the model for higher-quality responses. Its job is to produce missing tokens; a single token in this context can be thought of as a syllable of a word. Counting these tokens algorithmically is how we can estimate what information is being processed (a short sketch after the illustration shows how). A good analogy is how a public utility is measured: in a given billing period, you measure your household's water usage once at the tap and again at the drain.

Image Source: OpenAI blog, 2022
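If you want to see token counting in action, OpenAI's open-source tiktoken library exposes the same tokenizer the ChatGPT-era models use. The sketch below counts tokens "at the tap" (your prompt) and "at the drain" (the model's reply); the example strings are of course made up.

```python
import tiktoken  # pip install tiktoken

# cl100k_base is the encoding used by the ChatGPT-era models.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "Summarize the key findings from last week's usability study."
reply = "The study surfaced three recurring navigation issues for new users."

prompt_tokens = len(enc.encode(prompt))   # measured at the tap
reply_tokens = len(enc.encode(reply))     # measured at the drain
print(prompt_tokens, reply_tokens, prompt_tokens + reply_tokens)
```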

8. ChatGPT has a Recency Bias

ChatGPT's output quality depends on two elements: the quality of the user's input and the quality of the language model it was trained on.

Source Image: Finding Nemo, 2003

Multi-shot prompts (a technical term for the multiple conversational turns you take on a given topic) can lose accuracy and value the longer you try to carry on the conversation, because the model has a tendency to forget the earlier part of it. The first public version of ChatGPT, launched in late 2022, only remembers roughly the last 3,000 to 4,000 words.
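A common workaround, sketched below, is to keep only as many recent turns as fit within a fixed token budget before each new request. The 4,096-token budget is an assumption based on the roughly 3,000 to 4,000 words the first public model could hold, and the helper names are illustrative.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
MAX_CONTEXT_TOKENS = 4096  # assumed budget for the first public ChatGPT model

def trim_history(messages, budget=MAX_CONTEXT_TOKENS):
    """Keep the most recent turns that fit in the budget; older ones fall away."""
    kept, used = [], 0
    for message in reversed(messages):  # walk from newest to oldest
        tokens = len(enc.encode(message["content"]))
        if used + tokens > budget:
            break
        kept.append(message)
        used += tokens
    return list(reversed(kept))
```

Anything trimmed this way is simply gone as far as the model is concerned, which is why long conversations drift.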

9. ChatGPT is a Snapshot of the Past

The other limitation of LLMs in their current form is that they are closed-box systems, trained all at once within a set time period. While a model can continue to improve with user input and guidance, it often gets historical dates wrong and doesn't always have the most recent information available.

A snapshot of my conversation with ChatGPT

10. ChatGPT is Trained on the “Internet”

In addition, ChatGPT doesn't understand real-world constraints beyond what's available online. To boot, it's also trained on data we don't necessarily want in our results, such as opinion-driven social media sites like Twitter and Reddit. The ethics of scraping publicly available data is an evolving debate, as is the intentional application of solutions that may have downstream consequences we don't want to see. Below is a framework by Indi Young that provides a solid starting point for thinking about potential harms to people.

Source: Indi Young, "Insta-Personas & Synthetic Users", LinkedIn, 2023

If you’re interested in learning more about the principles of Responsible AI and ways that Microsoft is mitigating harmful usage of products like ChatGPT, this is a good place to start.

11. Semmelweis Reflex: When AI Doesn’t Belong Everywhere

What is appropriately anthropomorphic about ChatGPT is that it suffers from the Semmelweis reflex, the human behavioral tendency to stick to preexisting beliefs and reject fresh ideas that contradict them.

Image Source: Don McMinn, 2017

12. The Sunk Cost Fallacy

The rising ubiquity of intent-based interaction tools like ChatGPT means we need to sanity-check the techno-centrism of the decisions that got us here. The sunk cost fallacy is a psychological barrier that ties people to unsuccessful endeavors simply because they've committed resources to them.

These are the "Dirty Dozen": the biases that ChatGPT brings to users out of the box, without any additional training or customization. It is still a powerful pattern synthesis engine. Being aware of these biases can help us set healthier expectations about what it is capable of, nudge us toward use cases that add creative value to our existing processes, and provide a good return on investment.

Before we close out, there is one more point that is not strictly about the psychological aspects of ChatGPT, but knowing it may shift our view when weighing the costs and benefits of continuing to use powerful engines like ChatGPT:

13. ChatGPT Won’t Help the Climate

Today it takes roughly five 80-gigabyte A100-series GPUs (Graphics Processing Units) just to load the model and its text. GPUs have high power consumption and generate a lot of heat, and the data centers they run in require constant cooling to keep the processors functioning. The environmental controls that keep those data centers operating consume roughly as much power as the servers themselves, which is staggering.
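The arithmetic behind that GPU count is rough but easy to check. Assuming a 175-billion-parameter model stored in 16-bit precision (the actual size and precision of ChatGPT's deployment are not public), the weights alone occupy about 350 GB:

```python
# Back-of-the-envelope memory estimate for loading a large language model.
params = 175e9            # assumed parameter count (GPT-3 scale)
bytes_per_param = 2       # fp16 precision
weight_bytes = params * bytes_per_param

a100_memory = 80e9        # one 80 GB A100 GPU
print(f"{weight_bytes / 1e9:.0f} GB of weights -> {weight_bytes / a100_memory:.1f} x 80 GB A100s")
# 350 GB of weights -> 4.4 x 80 GB A100s, before activations and working memory
```

Serving that model to millions of concurrent users multiplies the footprint, and every GPU-hour carries the cooling cost described above.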

Source: Scanrail, Getty Images

And last but not least, I invite you to see how many of these biases you can get ChatGPT to come up with by going over to Bing AI chat and clicking the 'Chat now' button. If you find any that weren't mentioned in this post, please leave a comment below.


Published in UXR @ Microsoft

This collection of articles showcases the ongoing work of user experience research at Microsoft. Our community represents user experience researchers, designers, program managers, and engineers who are developing products with users at the center.


Written by Alexandra Jayeun Lee

Researcher @Microsoft | Resilience Geek | formerly @CivicDesignLab