The surprising reasons why we lack user feedback in the AI cycle

Laura Dohle
Published in IBM Design
10 min read · May 12, 2023

Co-written by Kinga Parrott (CE BTL and AI Strategist), Laura Dohle (Design Lead) and Maria Sánchez (Senior UX Designer)

User evaluation in uncharted territory

It is hard to bring the voice of the user into any type of software project, even for user research experts. With AI projects, the added layer of technical complexity makes it even more challenging.

As members of our internal Design for AI Guild working across different parts of IBM, we decided to join forces on a lofty goal: How might we enable data scientists — as the main owners of the data science cycle and the resident AI experts — to conduct and consume user feedback?

Although we all work at IBM, our environments are quite different. Kinga and Maria support customers in the fast-paced area of Client Engineering, while Laura works on longer-term roadmap items on the IBM product side. Despite these differences, we all face daily challenges caused by what we identify as the same root issue: the lack of regular user feedback during the execution of AI projects.

These situations drove us to identify common pain points and find a solution for this challenge. We began our project by running a Design Sprint, kicking it off with a couple of data scientist interviews and a lot of assumptions based on our daily work experience.

During the ideation phase, we thought of three solution directions:

Framework extension, toolkit development and education
  1. Framework extension: allow data scientists to get feedback from users at specific points during the model creation process (e.g. to validate the data they plan to use, the selected features, or the outcomes); a rough sketch of what such a checkpoint could look like follows this list.
  2. Toolkit development: build a collection of tools that data scientists can use to conduct their own evaluative research.
  3. Education: develop training material for data scientists to learn about user research for AI projects.
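
To make the framework extension idea more concrete, here is a minimal sketch of what such a feedback checkpoint could look like. It is purely illustrative: the class and method names are our own invention and do not come from any existing IBM framework or library.

```python
# Purely hypothetical sketch: the names below are invented for illustration.
# The idea is to pause the model creation process at defined points so that
# user feedback can be requested and recorded.

from dataclasses import dataclass, field
from typing import Any


@dataclass
class FeedbackCheckpoint:
    """A named point in the data science cycle where user input is requested."""
    name: str            # e.g. "data selection", "feature selection", "outcomes"
    artifact: Any        # whatever the data scientist wants users to react to
    responses: list = field(default_factory=list)

    def request(self, prompt: str) -> None:
        # In a real extension this would route the artifact to end users
        # (survey, interview, in-tool review); here we simply print the request.
        print(f"[{self.name}] {prompt}")
        print(f"Artifact under review: {self.artifact}")

    def record(self, user: str, comment: str) -> None:
        # Store the feedback so the data scientist can consume it later.
        self.responses.append({"user": user, "comment": comment})


# Example: validating the planned feature set before training starts.
checkpoint = FeedbackCheckpoint(
    name="feature selection",
    artifact=["age", "tenure_months", "support_tickets_last_90_days"],
)
checkpoint.request("Do these inputs reflect how you judge churn risk in practice?")
checkpoint.record(user="claims_agent_04", comment="Recent complaints matter more than tenure.")
```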

We shared our initial solution ideas with the Design for AI Guild leadership, and they raised an important question: “Can you describe the actual underlying problem in a sentence? And do the data scientists even care?”

We realized then that we had based most of what we thought we knew about the problem on gut feeling and individual experience. Essentially, we were a room full of people sharing “I think” statements.

We had been convinced that lack of user feedback was a problem for data scientists. But did the data scientists agree? And if so, was it a priority for them? We had neither a deep enough understanding of the problem space nor hard data on which to base our decisions. This made it impossible to confidently decide on an appropriate direction for our solution.

We had based all our work on two assumptions:

  • Lack of user evaluation in the data science cycle is a problem for data scientists.
  • The data scientist handles the data science cycle, therefore the data scientist needs the means to conduct user evaluation.

This is the first lesson we want to highlight (and a really basic one at that):

When researching a project, take a step back and double check that your hypothesis makes sense for the real user.

You ≠ user, even when you know a lot about the domain.

So we continued our efforts, validating our assumptions with the data scientists.

Data scientists’ real challenges

Research method

With this “reality slap”, we regrouped. Our hypothesis wasn’t entirely wrong; we had just made too many assumptions along the way and jumped directly to solutioning. We needed to validate and quantify our hypothesis: What are the main challenges that data scientists face when working on data projects inside IBM?

We collaborated closely with design researchers to conduct an online survey of data scientists at IBM to gather insights on the challenges.

To understand current levels of collaboration, we additionally asked participants for the percentage of their AI projects in which they interact with end users, and the percentage in which they work with designers, the traditional owners of user evaluation work.

Findings

112 practitioners participated in the survey. To ensure comparability, we only considered the 68 full responses.

The respondents detailed their most significant challenges when working on AI projects. The three biggest challenges were:

  • not being able to validate the outcome with end users (85%)
  • course corrections late in the project (83%)
  • not having access to the end users for user evaluation and feedback (82%)
Diagram 2: Challenges of data scientists

35% of data scientists mentioned that they have no interaction with end users on the majority of their projects, including 6% who have no contact with end users at all.

In terms of interactions with designers, the traditional owners of user evaluation work, the largest group of data scientists (32%) collaborate with designers on only 10% or less of their projects.

Diagram 3: Interactions with end users and designers

The survey results encouraged us to dig even deeper into this topic. Our instinct had been right: the data scientists confirmed that something in the process was broken. The quantitative feedback gave us a good sense of where to continue, so we moved into a qualitative research phase.

Why user feedback is so hard to include

Research method

We decided to conduct interviews with chief data scientists and other data science leadership via video conferencing. All in all, we spoke to five data science leaders.

We based the interviews on an interview guide, which was organized around:

  • the key responsibilities of participants in their daily job
  • the three key factors to deliver successful AI projects
  • the top three reasons why an AI solution fails in the market
  • the top three challenges identified in the survey
  • an evaluation of the solution sketches shared in the introduction to this project

Findings

The participants shared what they considered the most significant success factors for AI projects.

Three out of five participants mentioned that strong stakeholder and end-user engagement was crucial.

Also, three out of five participants said that a successful AI project needs a clear goal and intention.

Diagram 4: Top reasons why AI projects succeed

In addition, we asked the participants about the top reasons why AI projects fail. The top factors mentioned were not having the right people involved at the right time and insufficient project planning and leadership. Both topics were mentioned by three out of five people.

Diagram 5: Top reasons why AI projects fail

Finally, we asked the data science leaders to rate our initial three solution ideas from 1 (not impactful at all) to 5 (very impactful).

Framework extension, toolkit development and education

They rated the framework extension 3.4 out of 5, pointing out that its advantage is that every data scientist is familiar with the framework. Still, it would require support from PM and design, and those resources are tough to secure.

The toolkit was rated 2.8 out of 5. Participants acknowledged that it would be a very practical “how-to” guide, yet data scientists would not have time to integrate this extra work into their current workflow.

Lastly, they rated the education module 2.6 out of 5, pointing out that understanding user evaluation should be an essential part of data science education. Still, it does not address the core problem: not getting user feedback due to insufficient project setup.

The interviews showed us that data science leaders disagreed with our assumption that data scientists should be the ones gathering user feedback. The work of the data scientist is all-consuming and they are already stretched thin; they wouldn’t have time even to educate themselves on the user evaluation process, much less conduct it.

More than just a data scientist problem

Initially, we started this research effort with two assumptions that needed validation:

  • Lack of user evaluation in the data science cycle is a problem for data scientists.
  • The data scientist handles the data science cycle, therefore the data scientist needs the means to conduct user evaluation.

Our research showed that our first assumption can be considered valid: lack of access to users for feedback was among the highest-ranked challenges for IBM’s data scientists. However, the research also showed that this problem breaks down into two distinct parts, which require different solution approaches.

First, especially in our interviews with leadership, we learned that the lack of access to users is not a data scientist problem. It is an issue of project management or even our organization as a whole. Data scientists themselves simply don’t have the means or the power to establish access to end users on their own. Regularly gathering user feedback needs to be planned into the project from the start. If we do not tackle this problem, progress toward a more user-centered data science and AI approach will be almost impossible.

Second, when reviewing the current data science process, it became clear that no structured way of gathering user feedback and turning it into useful outcomes is baked into the cycle. Getting high-quality user feedback on any work item is hard; it requires knowledge and a meticulous process.

The design discipline has been going through this shift for many years now and has defined standard artifacts and methodologies that lend themselves to evaluative research. In fact, specializations like design research and UX design are an outcome of this shift in ways of working.

So far, this has never been a requirement for data science. Data scientists do not create their artifacts with user evaluation in mind. Their main purpose is to solve a tech problem and be understood by a tech audience.

Our second assumption, that the data scientist should handle the data science cycle — and therefore needs to be able to conduct user evaluation — can only be seen as partly valid. It is true that, traditionally, data scientists own the data science cycle. However, this does not mean that they have to be the ones owning the user evaluation efforts.

Given the existing workload of data scientists, incorporating user evaluation into their schedule seems unfeasible without removing other tasks.

We need to consider introducing another role into the data science cycle with the specific task of collecting user feedback, which data scientists can then use to enhance their artifacts.

What next?

Our survey and follow-up interviews confirmed that the data science cycle does indeed lack sufficient user feedback.

We also now know that the solution to the problem isn’t some product or education that will magically make the data science cycle more user-centered. We believe it requires a different way of thinking about and structuring AI projects:

  1. PMs need to develop a systematic approach to user research and testing.
    Gathering user feedback has to be planned and built into the process from the start. This falls on project and product managers and leaders who own the AI lifecycle.
  2. We need to add other disciplines into the cycle
    Within the AI lifecycle, the data science cycle (CRISP-DM) is almost solely owned by data scientists who do not have the time or training for user research. A new role needs to be introduced to own user research. Ideally, this would be a design researcher or UX designer, but even a data scientist with user research training or experience could fill the gap.
  3. Data scientists need to create artifacts that can be tested
    Within the design process, such artifacts include user flow charts and wireframes that can be shown to users for feedback. The data science process does not currently produce comparable artifacts, because until now validating with end users has not been a focus within the cycle. A small sketch of what such an artifact could look like follows this list.
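
As a thought experiment, here is a minimal sketch of how a raw model result could be packaged as a plain-language artifact that end users can react to, playing a role similar to a wireframe in the design process. The function name and the example values are invented for illustration, not taken from any existing tool.

```python
# Hypothetical sketch, not an existing IBM tool: one way a single model prediction
# could be turned into a reviewable artifact for a user feedback session.

def prediction_review_card(prediction: str, confidence: float, top_factors: list[str]) -> str:
    """Render one prediction as a plain-language card that non-technical users can critique."""
    factors = "\n".join(f"  - {factor}" for factor in top_factors)
    return (
        f"The model suggests: {prediction} (confidence {confidence:.0%})\n"
        f"The main reasons it gives are:\n{factors}\n"
        "Question for you: does this reasoning match how you would decide?"
    )


# Example usage with made-up values.
print(prediction_review_card(
    prediction="This customer is likely to churn",
    confidence=0.72,
    top_factors=["Three support tickets in the last month", "Contract ends in 30 days"],
))
```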

Apart from the lessons we learned about integrating user feedback into the data science cycle, we also learned a lot about our design process:

  • Believe in the process, but also trust your gut feeling. Not all methodologies fit all problems. We thought a design sprint would speed up our work, but it turned out not to be the best approach for this project.
  • Every round of the Enterprise Design Thinking (EDT) loop represents a level of fidelity. It’s okay to start with the solution, but you should research and reflect before moving to your prototype’s next level of fidelity.
  • Don’t be afraid to show your unfinished work; be open to receiving candid feedback. We need to be hit with reality occasionally to keep improving as professionals.

We are keen to bring forward a solution to this issue and are interested in other perspectives. Please leave us a comment, or engage with us on LinkedIn:
Kinga Parrott
Laura Dohle
Maria Sánchez

Laura Dohle is a Design Lead at IBM based in Boeblingen. The above article is personal and does not necessarily represent IBM’s positions, strategies or opinions.
