Designing Human-Centered AI tools in Google Flights
When to book and when to fly? UX case study about explaining flight prices with AI.
Though much of the world is currently in lockdown and air travel is drastically reduced, the questions this post addresses are as relevant now as they will be once the people of the world again feel they can travel safely to be in one another’s company.
Imagine the scene: you’re living your normal busy life, doing the day-to-day grind, and you realize you haven’t booked that flight for that trip you’ve got to take. Oh right, you may think, the holidays are coming up, I still need to book my flights. I bet flights cost a fortune already. We can relate. In fact, millions of people around the world can relate — flight pricing is tough to anticipate, including for many Google Flights users. Flight prices are subject to changes, inconsistent across sites, and hard to understand. As this video from CNN says, “It’s rocket science.”
Identifying patterns in flight prices is tricky. They change a lot. According to some popular blogs, even on a single plane, a type of seat that sells for 100 dollars to one user can easily sell for 500 dollars to another user.
We at Google Flights thought that if we could put some of that data and smarts into the hands of our users, we could help demystify what they’ll need to pay at a certain time for a certain flight. We hoped we might save our users time, stress, and maybe some money too!
Our first stop was the Explainability + Trust chapter of PAIR’s People + AI Guidebook which provides a useful framework for thinking about these sorts of issues. Here’s what we came up with when we applied it to our unique challenges.
The user need
In user study after user study, the Google Flights team kept hearing the same message: buying a plane ticket is nothing like buying a cappuccino. When searching for flights to new destinations, it’s difficult to estimate the fairness of a price, not to mention that prices can jump up or down in unpredictable ways, sometimes doubling or tripling in a matter of hours. And then you have to figure out the fee for checking your bags.
It’s no surprise that flight shopping can be extremely stressful and emotionally taxing. We found that Google Flights users could fall into one of the following categories:
- The Gambler. These travelers are trying to game the flight pricing system to get a good deal. They find any kind of price insights in the Flights interface helpful, but might also bet against the predictions.
- The Worrier. These travelers are much more concerned about getting a fair price, especially when they travel to a new place they have never visited before. They find price tips reassuring but are generally anxious about buying.
- The Dreamer. These travelers enjoy browsing flights to new and unfamiliar destinations, but it takes a lot of heartfelt commitment to take the leap and book a trip. These users might book if they could recognize a great deal when they see one.
Of course, there are many more user types. But we found that almost all users compare prices across multiple sites trying to understand what influences flight pricing. We also found that users will frequently wait for as long as possible before booking, waiting until they develop a “gut feeling” for what is a good price for a given flight. This process could take anywhere from a couple of days to several months.
How we approached this with AI
Many AI products prioritize information over user needs. This approach typically leads to products that can give users specific insights into the intricate workings of ML models, showing confidence intervals and training data sources, but frequently run the risk of overloading the user with too much information. But does the user require all that additional information, or is the user just trying to figure out whether now is a good time to book or not?
The engineering team built a model of flight prices to help users understand if the prices they’re looking at are high, low, or average. Our team also wanted to balance details and actionability by linking the details of our AI prediction with a specific action that users could take as the next step: should they go ahead and book or try changing some of the parameters of their search to find cheaper flights?
The technology problem
The issue with predictions, and a common theme across AI-driven products, is that machine learning predictions can’t be 100% right all the time. After all, learning, by machines or humans, can’t happen without making mistakes. First, the predictions are specific to certain flights to certain places at specific times. Second, for some places, we don’t have enough pricing data to provide an accurate prediction of whether the price is fair or not.
So we knew it was important to help users make informed and better decisions by explaining where this data was coming from and what it relates to. We needed to allow users to:
- Assess price ‘goodness’ today and in the future
- Keep track of the model’s predictions and check them
- Make confident decisions about when to book
While also making sure that they:
- Understand where our data is coming from
- Get a feel for the general trends in flight pricing
- Have reasonable expectations for the correctness of our predictions
The design problem
So, we started designing a new tool to help users understand whether the prices for a given flight are currently high, low, or typical, and help users learn market trends for similar trips. The team also developed a prediction model for how a price might change in the future. In order to make this information clear to both frequent fliers and twice-a-year travelers, we needed to understand how people decide to book.
At one point, we thought we should pivot to a more directive approach: hide the complicated calculations we were doing in the background and just give the simple conclusion, such as, “Today is a good day to book.” But when we tested this approach in our research, users told us it felt salesy, upsetting, and untrustworthy.
That was a no-go for us, because Google Flights only works because people trust it. So we knew this wasn’t the right balance of information and actionability. To start setting some guardrails for further iterations, our team came up with three design principles for price intelligence in Flights. Any price insights we surfaced to users would have to be:
- Concise, yet explorable
There are two things that we were explaining to the user in this interface:
- Whether the price was high, typical, or low
- How confident we were that prices might change in the near future
So how could we explain the ML model output in a way that is actionable and compelling, but also accurate?
We used multiple design elements to explain the price insight and foster trust.
- A price ‘goodness’ indicator, with the corresponding descriptions of ‘high’, ‘typical’, or ‘low’.
- A single-line explanation of the usual price for a trip like the one the user is planning.
- Prediction text, saying whether prices are likely to go up or not go down.
- An info icon that opens an explanation bubble with text explaining what data sources were used to compute the insight.
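The first of these elements, the price ‘goodness’ indicator, amounts to bucketing the current fare against the historical distribution of fares for similar trips. Here’s a minimal sketch of that idea; the percentile thresholds and function name are illustrative assumptions, not the production logic.

```python
# Hypothetical sketch of a price "goodness" indicator: bucket the current
# fare against the historical price distribution for similar trips.
# The 25th/75th-percentile cutoffs are illustrative, not actual thresholds.

def classify_price(current_price: float, historical_prices: list[float]) -> str:
    """Return 'low', 'typical', or 'high' relative to past fares."""
    sorted_prices = sorted(historical_prices)
    # Rank of the current price within the historical distribution.
    rank = sum(1 for p in sorted_prices if p <= current_price)
    percentile = rank / len(sorted_prices)
    if percentile <= 0.25:
        return "low"
    if percentile >= 0.75:
        return "high"
    return "typical"

print(classify_price(320, [300, 350, 400, 420, 450, 480, 500, 525]))  # low
```

A real system would condition the distribution on route, dates, cabin class, and seasonality, but the user-facing output stays the same three words.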
The wording of the text had a significant impact on user comprehension during our research studies. That’s why we pared it down to make it as compelling and actionable as possible, while still letting users explore the data behind it. Using progressive disclosure with the info icon was extremely helpful, but so was iterating on which words we used, whether we would show confidence, and if so, how.
Model confidence displays
Unintended consequences of radical transparency
At first, we attempted to show the likelihood that a price would go up or down in a very specific way. We tested out wording similar to the following: “Prices are unlikely to drop and there’s a 75% chance they’ll increase by $17 in the next 5 days.” This is what we might call radical transparency — that’s a lot of information for a user to process before making a decision. But we felt it was better to tell them everything and let them make the call on their own.
Turns out, people are very optimistic about their chances. Even when you say “85% likely to go up in the next 2 days” some users interpreted this as “15% likely to go down eventually,” and they liked those odds. Other times users wouldn’t do the math, or couldn’t understand it, so they substituted a simpler question in their mind. They asked themselves: “Can I afford to spend $50 on putting off booking for 3 days?” If yes, then they would hesitate to book.
What we learned
We decided not to show confidence as percentages because people either didn’t read them or didn’t understand them for flight prices. We decided if we aren’t super confident — 90% or higher — we wouldn’t show a prediction at all. “Medium” confidence predictions were confusing and not actionable. When we were confident in the prediction, we used much simpler wording: “likely to go up” or “not likely to go down”.
To make sure our price insights were communicating the right message, we decided to complement each price insight with an additional price history data visualization. You can read more about graphic-based indications of certainty in the Explainability + Trust chapter of the People + AI Guidebook.
Now for some flights we show users how the price has changed over the past few months and notify them when we predict that prices may go up soon or won’t get any lower. This feature performed really well in usability tests so we launched it to the public in August 2019. When people saw the price history graph and expressed satisfaction in what they were seeing, we knew we were on the right track.
On top of this, we wanted to signal to users that, based on our predictions, we were confident that they were booking at the lowest possible price. In late summer 2019, we piloted a price guarantee program in the US that showed a badge on flights when we were very confident the price wouldn’t drop any further — the cheapest available deal. After the user booked, we kept tracking the price of the flight, and if it dropped, contrary to our original prediction, we paid them back the difference. In user research we found that the feature gave users more confidence to book, making flight shopping a less stressful experience overall.
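The refund mechanic behind the guarantee is simple to state: track the fare after booking and pay back any drop below the booked price. This is a hypothetical sketch of that mechanic, not the actual Google Flights implementation.

```python
# Hypothetical sketch of the price-guarantee mechanic: after booking,
# keep tracking the fare and refund the difference if it drops below
# what the user paid. Names and structure are illustrative.

def guarantee_refund(booked_price: float, prices_after_booking: list[float]) -> float:
    """Refund owed: how far the fare fell below the booked price, if at all."""
    if not prices_after_booking:
        return 0.0
    lowest_seen = min(prices_after_booking)
    return max(0.0, booked_price - lowest_seen)

print(guarantee_refund(400.0, [410.0, 395.0, 380.0]))  # 20.0
```

Every payout is also a labeled example of a prediction that missed, which is what makes the guarantee double as a feedback signal for the model.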
With price guarantees, we were able to align feedback with model improvement and give users important signals about our prediction confidence. You can read more about this in the Feedback + Control chapter of the People + AI Guidebook.
You know how flight prices are unpredictable? We’re trying to fix that by giving users a better understanding of what happened to these prices in the past. We found that these three strategies were most helpful in the design of price insights in Flights:
- Articulate data sources: Telling the user what data is being used in the AI’s prediction helped our product team avoid contextual surprises and privacy suspicion, and helped the user know when to apply their own judgment.
- Experiment with different confidence indicators: Showing model confidence in categorical buckets and visual graphs helped us to give users relevant information about flight prices in a way that was easy for them to understand.
- Account for unexpected user behaviors: Conducting user research early and frequently helped us anticipate unintended consequences of detailed explanations, helping the product team change our communications approach and bolster user trust.
Building ML products is hard. But sometimes the hardest part is communicating what your ML does to the user in a way that is both accessible and useful. Whether and how you explain the inner workings of your ML system can profoundly influence the user’s trust in the system and its usefulness in their decision-making. This applies to flight shopping as much as to any other human activity.
Editor: David Weinberger, Writer-in-Residence at Google
Originally written for Google’s People+AI Guidebook Medium Channel on May 22, 2020.