5 Learnings on Data Science + User Research

Why Data Science and UX Research Teams are Better Together

Avantika M
Sep 23, 2020



PSA: It is a key value for us at AIxDESIGN to open-source our work and research. The forced paywalls here have led us to stop using Medium so while you can still read the article below, future writings & resources will be published on other platforms. Learn more at aixdesign.co or come hang with us on any of our other channels. Hope to see you there 👋

The poster of the event

Today’s world generates data at an unprecedented rate, and seeing the bigger picture means putting that data to use. Data Science is changing the world, and user research needs to get on board to understand business and user needs better. In turn, User Experience can help frame how data science conveys critical insights. This talk gives an overview of how Data Science can complement UX research across both quantitative and qualitative methods. It introduces the Data Science pipeline and describes useful UX research applications, like identifying users to interview, finding different customer segments, and generating data for usability studies.

This session is the first of our keynote series, and it took place over Zoom. The session was interactive yet educational. For this event, our speaker was Grishma Jena — Data Scientist with the Research Ops team for User Research and Design, Cloud & Data Platform, at IBM in San Francisco. She works across portfolios in conjunction with user research and design teams and uses data to understand users’ struggles and opportunities to enhance their experiences. Grishma kicked off the session by sharing her role at IBM and how we could get involved in their user research program.

The keynote covered relevant topics in data science, such as data, Machine Learning models, and the steps needed to incorporate data science into user research. We have summarized the key takeaways from the keynote below:

1. Addressing Misconceptions

There are common misconceptions around data science and user research, and Grishma noted that they mostly concern the practitioners in these fields. User researchers often think of data scientists as glorified numbers people, while data scientists confuse user researchers with designers and may not consider them important to the product development process. She laid out her motivation for the talk by sharing what data science can do in user research.

What data science and user research can do together

2. Asking the Right Questions

Grishma emphasized asking the right question, i.e., formulating the question the stakeholder is trying to answer. Questions like “Are we offering the right things to the right people?” or “How likely is it the user will buy our product?” help frame the problem you’re trying to solve.

After you ask the right question and find corresponding data, the next step is data wrangling, also called data cleaning. Data wrangling is when you gather, select, and transform data for easy access and analysis. Common methods include scaling or normalizing data, deduplicating records, interpolating values, and standardizing multiple sources.
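
A few of the wrangling methods listed above can be sketched in plain Python. This is a minimal illustration rather than anything shown in the talk: the daily usage log, its field names, and the helper functions are all hypothetical.

```python
# Hypothetical daily usage log; field names and values are made up.
records = [
    {"day": 1, "active_users": 40},
    {"day": 2, "active_users": None},  # missing measurement
    {"day": 2, "active_users": None},  # duplicate row for the same day
    {"day": 3, "active_users": 60},
    {"day": 4, "active_users": 80},
]


def deduplicate(rows, key):
    """Keep only the first row seen for each value of `key`."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(dict(row))
    return unique


def interpolate(values):
    """Fill interior None gaps linearly between the nearest known values."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def min_max_scale(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


clean = deduplicate(records, key="day")
filled = interpolate([row["active_users"] for row in clean])
scaled = min_max_scale(filled)
```

In day-to-day work a library such as pandas would usually handle these steps; the hand-written versions are only meant to show what each step does to the data.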

The next step is data exploration: the initial investigation of the data to identify essential variables and how they are distributed. Grishma urged uncovering any initial patterns or points of interest, which helps form hypotheses about the defined problem.
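
As a rough illustration of a first exploration pass, the sketch below summarizes one variable overall and per segment using only the standard library. The survey rows and column names are assumptions made up for the example.

```python
import statistics
from collections import Counter

# Hypothetical survey export; column names and values are assumptions.
rows = [
    {"plan": "free", "weekly_sessions": 2},
    {"plan": "free", "weekly_sessions": 3},
    {"plan": "pro", "weekly_sessions": 9},
    {"plan": "free", "weekly_sessions": 1},
    {"plan": "pro", "weekly_sessions": 10},
]


def summarize(values):
    """Basic distribution summary for one numeric variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# How is the variable distributed overall?
overall = summarize([r["weekly_sessions"] for r in rows])

# How many respondents fall into each segment?
plan_counts = Counter(r["plan"] for r in rows)

# Does the distribution differ by segment? A gap here is a candidate hypothesis.
by_plan = {
    plan: summarize([r["weekly_sessions"] for r in rows if r["plan"] == plan])
    for plan in plan_counts
}
```

A visible gap between segments, like the one between the two plans in this toy data, is the kind of initial pattern that can be turned into a hypothesis to test.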

3. Creating a Meaningful Model

The next step after data exploration is model building. She provided the following steps:

  1. Feature engineering: Select important features and construct more meaningful ones, using domain knowledge
  2. Preparation: Divide the data into training and test sets
  3. Training: Choose supervised or unsupervised learning, tune model parameters, and monitor for overfitting
  4. Evaluation: Evaluate model on unseen data, i.e., a test set
Source: XKCD
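
The four steps above can be sketched end to end with a deliberately tiny supervised model. Everything here is an assumption for illustration: the data is invented, and the "model" is a single threshold on one engineered feature, not a technique named in the talk.

```python
import random

# Hypothetical labelled data: (total_minutes, sessions, converted).
raw = [
    (10, 5, 0), (12, 4, 0), (30, 3, 1), (45, 3, 1),
    (8, 4, 0), (50, 2, 1), (15, 5, 0), (40, 4, 1),
]

# 1. Feature engineering: derive minutes per session from the raw columns.
examples = [(minutes / sessions, label) for minutes, sessions, label in raw]

# 2. Preparation: shuffle with a fixed seed, then split 75% / 25%.
rng = random.Random(0)
shuffled = examples[:]
rng.shuffle(shuffled)
split = int(0.75 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]


def accuracy(threshold, data):
    """Share of rows where 'feature >= threshold' matches the label."""
    correct = sum((x >= threshold) == bool(y) for x, y in data)
    return correct / len(data)


# 3. Training (supervised): pick the threshold that best fits the training set.
candidates = sorted(x for x, _ in train)
best_threshold = max(candidates, key=lambda t: accuracy(t, train))
train_accuracy = accuracy(best_threshold, train)

# 4. Evaluation: score the chosen threshold on data it has not seen.
test_accuracy = accuracy(best_threshold, test)
```

Comparing `train_accuracy` with `test_accuracy` is a simple way to monitor overfitting: a model that scores much better on the training set than on the test set has likely memorized it rather than learned a general pattern.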

One use case relevant to the topic is Airbnb, which she cited as an example of doing a tremendous job of integrating data science into user research. Here are a few ways Airbnb has managed to do so:

  • Used data to determine host preferences: Airbnb used supervised learning and classification to answer critical questions like “Would a host accept or decline a booking?” They also found trends in how hosts in small and big cities behaved based on demand and availability of properties.
  • Reduced bounce rates with a redesign: A data scientist at Airbnb discovered that bounce rates in some Asian countries were relatively high. To fix this, they worked with UX researchers and decided to show top-selling destinations to keep users on the site. This led to a redesign and a 10% increase in conversions. Here, they used supervised learning and regression.
  • Showed skewed search results to users: They let users drive the search based on past data (bookings) available to them. They created a model to assess the probability of booking and used it to skew search results. As a result, results showed properties that were most likely to be booked based on past booking trends.
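
To make the last bullet concrete, here is a minimal sketch of ranking search results by a predicted booking probability. It is not Airbnb's model: the features, the weights, and the logistic scoring function are all hypothetical, and in practice the weights would be learned from past bookings.

```python
import math

# Hypothetical weights; a real system would learn these from booking history.
WEIGHTS = {"bias": -1.0, "past_acceptance_rate": 2.0, "price_match": 1.5}
FEATURES = ("past_acceptance_rate", "price_match")


def booking_probability(listing):
    """Logistic score in (0, 1) computed from the hypothetical features."""
    z = WEIGHTS["bias"] + sum(WEIGHTS[name] * listing[name] for name in FEATURES)
    return 1.0 / (1.0 + math.exp(-z))


# Made-up listings with feature values between 0 and 1.
listings = [
    {"name": "A", "past_acceptance_rate": 0.2, "price_match": 0.9},
    {"name": "B", "past_acceptance_rate": 0.9, "price_match": 0.8},
    {"name": "C", "past_acceptance_rate": 0.5, "price_match": 0.1},
]

# Skew the results: most likely to be booked comes first.
ranked = sorted(listings, key=booking_probability, reverse=True)
```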

Here is more information about how Airbnb uses Data Science.

Moreover, Grishma discussed model validation, which assesses your model’s quality. You can use cross-validation for robustness, or metrics like accuracy, precision, recall, F1 score, or a confusion matrix.
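
The metrics mentioned above all derive from the four cells of a binary confusion matrix. The sketch below computes them by hand and adds a simple k-fold index splitter for cross-validation; the labels and predictions are made up, and the function names are our own.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Avoid ZeroDivisionError when a class is never predicted or never present."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)  # of predicted positives, how many were right
    recall = safe_div(tp, tp + fn)     # of actual positives, how many were found
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices); item i is held out in fold i % k."""
    for fold in range(k):
        test_idx = [i for i in range(n_items) if i % k == fold]
        train_idx = [i for i in range(n_items) if i % k != fold]
        yield train_idx, test_idx


# Made-up labels and predictions for eight examples.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(6, 3))
```

Cross-validation repeats the train-and-evaluate cycle once per fold and averages the scores, so a single lucky or unlucky split has less influence on the result.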

A good example is ABN AMRO, which wanted to help its customer service representatives tag support tickets better and faster. They came up with a robust system with a high level of accuracy and precision in tagging/labeling. However, on deployment, they found that the time representatives took to file support tickets didn’t decrease but instead increased.

Tagging of ABN AMRO ticket

This happened because the model ended up presenting the representatives with 20–30 different tags, so they spent more time going through the list. This is a case where UX researchers could have helped data scientists produce better results by running usability tests before implementation. The big takeaway is that while a system may look good on paper, it may not solve the problem at hand in practice.

4. Telling a Story with Data

The last step of the data science pipeline is data visualization and storytelling. One can tell a story with data to answer the original questions, communicate findings to stakeholders, and humanize the numbers at hand. Data storytelling combines narrative, visuals, and data to drive change: narrative and visuals engage the audience, visuals and data enlighten them, and narrative and data explain the findings. To learn more about data storytelling, check this link out.

Grishma also presented Google’s outlook on usability testing, which it considers time-consuming and expensive. To address this, Google AI used deep learning to predict the tappability of interface elements.

Lastly, she presented how Spotify uses data science and user research. Spotify is one of the few companies that have combined Data Science and User Research to find peculiarities in users’ listening habits. In other words, they found outliers and used them in their 2016 ad campaign, such as the one shown below.

Spotify’s 2016 Ad Campaign

Here is more information about Spotify’s Simultaneous Triangulation.
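
One simple way to surface outliers like the ones behind that campaign is a z-score check. This is a generic sketch and not Spotify's method: the listener IDs, play counts, and the threshold of 2 standard deviations are all assumptions.

```python
import statistics

# Hypothetical play counts of one song per listener in a single week.
plays = {
    "u1": 3, "u2": 5, "u3": 4, "u4": 6, "u5": 5,
    "u6": 4, "u7": 3, "u8": 5, "u9": 4, "u10": 42,
}


def find_outliers(counts, threshold=2.0):
    """Return listeners whose count is more than `threshold` std devs from the mean."""
    mean = statistics.mean(counts.values())
    spread = statistics.stdev(counts.values())
    if spread == 0:
        return {}
    return {
        user: value
        for user, value in counts.items()
        if abs(value - mean) / spread > threshold
    }


outliers = find_outliers(plays)
```

The numbers only flag who is unusual. Understanding why someone played a song 42 times in a week is where qualitative user research comes in.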

5. Data Scientist + User Researcher = Dream Team

Grishma believes that Data Scientists and User Researchers are better together and are a dream team. As data scientists usually focus on quantitative data and User Researchers on qualitative data, they bring different perspectives to the table, which helps in blending people and data to get closer to the truth. Integrating these teams provides a holistic understanding of multiple forms of data and mitigates the drawbacks of relying on a single research method. Furthermore, these teams keep biases in check and find correlations that help develop better hypotheses and hyper-specific personas.

“All data is created by people. And all people create data…Today we divorce people from their data, and that gives companies a license to forget about the people behind the data…It allows us to divorce ourselves from the responsibility of what that data can do.”

– Ovetta Sampson, Microsoft

Embracing Fair Practices

The talk concluded with the most critical topic relevant to data and user research: ethics. Everyone involved in handling data should have an ethical discussion about the way the data is used. She discussed how tech can be used to attack people or be misused, citing people who have used home assistant systems to gaslight their partners.

She urged teams to ensure that training data is fair and representative and to understand possible sources of bias. It is crucial to ensure fairness over time, especially across different user groups. It is also essential to have diverse, integrated teams so that varied opinions, backgrounds, and thoughts come forward.
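
A basic check along these lines is to compare how well each user group is represented in the data and how well the model performs for it. The sketch below is only a starting point under made-up assumptions: the group names, labels, and predictions are hypothetical.

```python
from collections import defaultdict

# Hypothetical evaluation rows: (user_group, true_label, predicted_label).
rows = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1),
    ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 1, 0), ("group_b", 0, 0),
]


def group_report(data):
    """Per group: share of the dataset and accuracy of the predictions."""
    totals, correct = defaultdict(int), defaultdict(int)
    for group, truth, predicted in data:
        totals[group] += 1
        correct[group] += int(truth == predicted)
    n_rows = len(data)
    return {
        group: {
            "share": totals[group] / n_rows,
            "accuracy": correct[group] / totals[group],
        }
        for group in totals
    }


report = group_report(rows)
```

In this toy data, `group_b` makes up a quarter of the rows and gets noticeably worse predictions, which is the kind of gap worth investigating. Re-running such a report on fresh data at regular intervals is one way to watch fairness over time.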

Thank you, Grishma, for sharing your insights on Data Science in User Research. You can watch the whole keynote here.

Here are some relevant links she provided on this topic:

You can follow Grishma on LinkedIn, GitHub, and Twitter. To learn more about IBM’s practices, do check out IBM Design Thinking and IBM User Research.

About AIxDesign

AIxDesign is a place to unite practitioners and evolve practices at the intersection of AI/ML and design. We are currently organizing monthly virtual events (like this one), sharing content, exploring collaborative projects, and developing fruitful partnerships.

To stay in the loop, follow us on Instagram or LinkedIn, or subscribe to our monthly newsletter to get it all in your inbox. You can now also find us at aixdesign.co.

