5 UXR AI-Ethicist Theories — Part 2

Dawn Proitz de Voir
6 min read · Oct 5, 2024


Why is UX Research AI-Ethics?

Originally published in UserTesting’s Insights Unlocked: Season 9 — Episode 109, on March 18, 2024

Deanna Troi in Star Trek: The Next Generation during the episode titled “Thine Own Self” (Season 7, Episode 16). In this episode, Deanna Troi is undergoing the Starfleet Bridge Officer’s test, which she needs to pass in order to gain a promotion to the rank of Commander.

There are many examples of how AI and machine learning can be used to inform UX research and design, but how will UX research inform how AI and ML models are created and deployed to the public?

Ecological validity in machine learning

As a specialist at the intersection of UX research and machine learning, Dawn emphasizes the importance of ecological validity: ensuring that a model’s conclusions actually apply to the real-world problem it is intended to solve.

“And they’re not just really fanciful displays of human intellect with regard to the performance of the model,” she said.

She said the other way UX researchers can improve a model’s performance is by collaborating with machine learning professionals throughout the process.

“And, to me, these are almost the same issue because they do converge at some point,” Dawn said. “But there are five main ways I think UX researchers need to collaborate with machine learning professionals.”

5 AI-UXR Theories: “P.R.I.D.E.” conversation touchpoints

Dawn uses the acronym PRIDE as a framework for those five touchpoints of ML collaboration:

  • Problem
  • Representation
  • Interpretability
  • Data leakage
  • Evaluation metrics

“Those five touchpoints in a conversation, or over many conversations, between a UX researcher and a machine learning professional can increase the ecological validity of the model and increase the performance of the model,” she said. “I just think that not a lot of researchers believe they can help these really talented engineers do their job better, but also that it’s really almost unethical if they don’t.”

P — The Problem touchpoint for UX researchers and ML engineers

With regard to the problem, “UX researchers are always looking to make sure that the problem is centered on the user,” Dawn said. “And this is a little bit different [with machine learning models]. When a machine learning scientist talks about a problem type, they are actually talking about the solution.”

Researchers, she said, should flip it around. When a machine learning professional says they have a problem type, Dawn said, they are probably talking about one of three things: classification, regression, or clustering. “That’s how the data should appear to the humans, not necessarily how the human thinks about the problem,” she said.

An example would be a movie recommendation algorithm. An ML engineer may design the model to deliver recommendations based on how one movie relates to another (clustering). But the user really wants recommendations based on a thumbs-up or star rating (regression). Switching would require changing the entire model and would be very expensive, Dawn said.
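To make the distinction concrete, here is a minimal sketch (in Python with scikit-learn, using made-up movie features rather than anything from the episode) of the same data framed both ways: clustering groups similar titles together, while regression predicts one user’s star rating.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Hypothetical data: rows = movies, columns = numeric features
# (e.g., runtime, release year, action score, romance score).
movies = np.array([
    [90,  1994, 0.1, 0.9],
    [95,  1998, 0.2, 0.8],
    [120, 2012, 0.9, 0.1],
    [130, 2015, 0.8, 0.2],
])

# Framing 1 -- clustering: group similar movies, then recommend
# titles from the cluster a user already watches.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(movies)

# Framing 2 -- regression: predict this user's star rating from
# the same features, then recommend the highest predictions.
stars = np.array([4.5, 4.0, 2.0, 2.5])  # one user's past ratings
predicted = LinearRegression().fit(movies, stars).predict(movies)

print(clusters)   # which movies sit together
print(predicted)  # per-movie star estimates for this user
```

Swapping one framing for the other after launch means replacing the model, which is the expense Dawn is pointing at.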

Better understanding what the user wants, needs, or expects from a model will help inform how the model is built (sound familiar?).

R — The Representation touchpoint for creating better ML models

Dawn said the representation touchpoint, for a machine learning professional, is about feature selection, feature modeling, and feature engineering.

“And so what the UX researcher should hear is variable creation and what variables matter and what inputs matter,” Dawn said.

Challenges include avoiding the curse of dimensionality, where too many features (variables) make the model overly complex. So, Dawn said, it is important for UX researchers to give ML engineers input on which features matter.
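As a rough illustration of that pruning step, here is a sketch on synthetic data (the feature counts are arbitrary) showing univariate feature selection keeping only the columns that carry signal:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# 200 samples, 50 candidate features, only 5 actually informative.
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)

# Keep the 5 features with the strongest univariate signal; the
# rest mostly add dimensionality without adding information.
X_small = SelectKBest(score_func=f_classif, k=5).fit_transform(X, y)

print(X.shape, "->", X_small.shape)  # (200, 50) -> (200, 5)
```

A statistical filter like this can only rank the features it is given; knowing which features belong on the list in the first place is where the researcher comes in.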

“UX researchers often have these libraries where they can tag things,” Dawn said. “I love UserTesting for this, where they can make highlight reels, and those highlight reels have the hashtags that sync between the highlight reels. And then you can just look up a hashtag and see all the videos for that.”

“Over time, researchers can offer this data and say, ‘this hashtag/variable is a huge feature that I think has to be in your model, or you need to engineer this variable,’” she said. “And the way they’ll do that is they’ll take a lot of variables and mathematically make that variable for you with a larger sample size.”
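Here is a sketch of what “mathematically making that variable” might look like, with hypothetical column names standing in for hashtag counts exported from a research repository:

```python
import pandas as pd

# Hypothetical per-session hashtag counts from a tagging library.
sessions = pd.DataFrame({
    "tag_confused":  [3, 0, 5, 1],
    "tag_backtrack": [2, 1, 4, 0],
    "tag_gave_up":   [1, 0, 2, 0],
})

# Engineer one composite feature from the raw tag counts, with
# weights reflecting researcher judgment about severity.
weights = {"tag_confused": 1.0, "tag_backtrack": 1.5, "tag_gave_up": 3.0}
sessions["friction_score"] = sum(sessions[col] * w for col, w in weights.items())

print(sessions)
```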

Dawn said there are also some sticky dilemmas that you get into with representation.

For instance, with our movie recommendation model, a feature could be age: if the model thinks you’re five years old, you will get a large number of Disney recommendations, but you may not see those if the model doesn’t think you’re five.

“And that’s where UX researchers really need to be at the table and say, ‘you know, my persona is really sensitive about this issue,’” Dawn said.

I — The Interpretability touchpoint for creating better ML models

The trade-off for a machine learning scientist is interpretability versus model complexity.

“They can make a really accurate model that’s super great and magical, but then it becomes completely uninterpretable,” she said. And in certain scenarios, she said, that can cause user errors when a user can’t understand what’s being fed back to them.
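One common way to see the trade-off is to put a small, printable model next to a bigger one on the same data. A minimal sketch, using a scikit-learn toy dataset rather than anything from the episode:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Interpretable: a depth-2 tree whose full decision logic can be
# printed and explained to a user in a few lines.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)

# More complex: 200 trees, usually more accurate, but with no
# comparably short explanation of why it decided anything.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("tree accuracy:  ", tree.score(X_te, y_te))
print("forest accuracy:", forest.score(X_te, y_te))
print(export_text(tree))  # the entire "rationale" of the small model
```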

The example she used is being at a dinner party and having someone you’ve never met come up to you and start sharing information about you and making pushy recommendations. It would be an awkward experience.

“And I don’t think we want to start giving that up, where we just start blindly following whatever chatbots tell us,” Dawn said. “We need to have some rationale. They should be able to cite reliable sources. It is unbelievable that we have such a low bar for the machine learning process to back up what they think is an excellent performance.”

D — The Data leakage touchpoint for creating better ML models

In the context of machine learning basics, data leakage is not about privacy. “I know that sounds like privacy when people hear about it, especially colloquially, but actually it’s about when there are variables that shouldn’t be in the training data. They’re actually more about future information and they can sneak into the training data unbeknownst to the ML engineer because they don’t know very much about the features.”

An example of this, Dawn said, is using patient IDs in an ML model. A UX researcher needs to know how that patient ID was created. If the ID was assigned when the patient saw a specialist, say an oncologist, then the ID itself carries silent information (a latent variable), and the model’s performance on the training data will look overly optimistic.

“Avoiding that is really about knowing how the variables got created in the first place, knowing how every human interacted with that,” Dawn said.
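Here is a sketch of that failure mode on synthetic data, with the effect exaggerated so it is visible: the ID’s numeric range silently encodes which clinic issued it, so a model trained with the ID looks near-perfect even though the “real” features carry no signal.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 400
has_cancer = rng.integers(0, 2, n)

# Leak: the oncology clinic issues IDs in a higher numeric range,
# so the ID itself is a latent stand-in for the diagnosis.
patient_id = np.where(has_cancer == 1,
                      rng.integers(9000, 9999, n),
                      rng.integers(1000, 1999, n))
noise = rng.normal(size=(n, 3))  # features with no real signal

X_leaky = np.column_stack([patient_id, noise])
model = RandomForestClassifier(random_state=0)

print("with ID:   ", cross_val_score(model, X_leaky, has_cancer).mean())  # ~1.0
print("without ID:", cross_val_score(model, noise, has_cancer).mean())    # ~0.5
```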

E — The Evaluation Metrics touchpoint for creating better ML models

Evaluation metrics are really important for UX researchers because what you want to know is whether you need to avoid false positives or avoid false negatives, Dawn said.

For example, if you are detecting cancer, a false positive (a lab diagnostic telling you that you have cancer when you don’t) is not as bad as a false negative (being told that you don’t have cancer when you, in fact, do). In that scenario, you want to avoid false negatives and err toward false positives.

But that’s different if you’re testing an interface for police officers to detect crime in an area. In that scenario, you want to avoid false positives and err toward false negatives (assuming that someone is not guilty, even if they are in fact guilty).
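In metric terms, the cancer scenario asks for high recall (few false negatives) and the policing scenario asks for high precision (few false positives). A toy sketch with made-up labels:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # actual outcomes
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]  # model predictions

# Cancer screening: a false negative (missed case) is the costly
# error, so recall is the metric to protect.
print("recall:   ", recall_score(y_true, y_pred))     # 3/4 = 0.75

# Crime detection UI: a false positive (wrongly flagged person)
# is the costly error, so precision is the metric to protect.
print("precision:", precision_score(y_true, y_pred))  # 3/5 = 0.60
```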

There are obvious cases, such as these two scenarios, but what happens in the middle, where you’re not sure which kind of error to avoid?

“That’s called a balanced data problem,” Dawn said. “And there are evaluation metrics for that. That’s why it’s important for you to collaborate. Your ML professional should know all this because they get trained on it [the terminology and how to apply it mechanically] but they don’t know whether the users want you to avoid false negatives or false positives, except in extreme cases such as medical diagnoses and crime detection.”
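For that in-between case, one standard answer is a combined metric such as F1, the harmonic mean of precision and recall, which punishes a model for sacrificing either one. Continuing the toy example above:

```python
from sklearn.metrics import f1_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]

# F1 = 2 * (precision * recall) / (precision + recall)
print("F1:", f1_score(y_true, y_pred))  # 2 * (0.6 * 0.75) / (0.6 + 0.75) ≈ 0.67
```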

Source:

https://www.usertesting.com/resources/podcast/UX-research-for-machine-learning
