Travel Bots: Are we there yet?

Conversational user interfaces are becoming increasingly popular in travel, but do customers actually use them? In this report we explore the state of chatbots and their effects on mobile booking behaviour.

Download the complete 16-page whitepaper here

Booking travel on mobile devices

Booking travel can be one of the most tedious tasks to execute on a smartphone. There are countless configuration options to choose from: hundreds of flights, hundreds of hotels, hundreds of car rental options, and so on. In the context of business travel, this problem gets even more serious.

There are additional factors for business travellers: corporate travel policies, whether a flight is refundable, and other restrictions. On top of company-specific constraints, some business travellers also have personal travel preferences: some might prefer window over aisle seats, or seek out specific hotels that offer loyalty program points.

Given the complex nature of booking travel online, integrating all possible options into mobile apps or mobile versions of websites hasn’t been overly successful. In a recent survey, we asked participants how they usually booked their travel. Twenty-two of the 28 participants reported having used desktop versions of travel booking websites, while only 15 had used a mobile booking option.

The remaining bookings were completed via a travel agent or other services. These results suggest that people still prefer not to take the perceived risk of booking their travel on a smartphone, opting for the desktop versions of websites instead. According to a report published by Google in 2012, 47 percent of users’ daily media interactions happen on handheld devices. However, planning trips was categorized as a “more complex” activity, one initiated on a PC or laptop more often than any other type of activity.

Chatbots to the rescue?

The landscape has started to shift with the introduction of chatbots. Chatbots are applications that simulate having a conversation with another human. With a chatbot, instead of going to a website and browsing options through menus and filters, one can have a natural conversation with a robot to complete a booking. The robot is powered by a set of rules and is presented to the user via a familiar chat interface similar to text messaging. In a perfect world, a chatbot is so well implemented that users cannot distinguish a conversation with it from one with an actual human. A user specifies their preferences and limitations, and the bot finds the best match based on the given constraints. Ideally, the user will not have to do any extra searching, or dig through numerous distracting options to find their perfect booking.

However, there is still a long way to go before travel bots see widespread adoption. In the aforementioned survey, only 1 out of 28 respondents reported using a travel bot to book a trip. That being said, since they are comparatively new products, it is not yet clear whether users prefer them over similar apps and websites. In addition, we still do not have a clear understanding of 1) how people interact with bots, 2) what they are looking for in travel bots and 3) the potential areas for improvement.

To answer these questions, we designed and ran a usability study to observe user behaviours while using chatbots, and compared user experience with bots to the mobile version of a travel booking website. In particular, we conducted this study with the following goals in mind:

  • Conversational User Interfaces have become very trendy recently. We wanted to assess the effectiveness of these interfaces before digging deep into implementing them for commercial use.
  • We were interested in figuring out whether chatbots perform better than their counterparts (mobile versions of travel booking websites) in terms of 1) total booking time and 2) interaction pain points.
  • We were interested in observing how people interact with chatbots linguistically.
  • Finally, we wanted to figure out the conditions and scenarios in which one interface might perform better than the other. Ultimately, these observations can help us to come up with a set of design implications for future product development.

How we designed the study

We designed a study where participants used two different interfaces to arrange travel bookings in different scenarios. To achieve a clear understanding of behaviours on different interfaces, we came up with 3 types of scenarios and tasks for this study:

  1. Booking a simple, round-trip flight on both interfaces as a warm-up task.
  2. Booking a business trip with more restrictions on the flight options.
  3. Booking a personal vacation consisting of a flight and a hotel booking.

We arrived at this three-category schema of scenarios to mimic the types of possible bookings for an individual, while keeping in mind the restrictions within our interfaces.

We studied two types of interfaces: chatbots and mobile versions of travel booking websites. After careful consideration of numerous products, we picked the “HelloGBye” iOS application as our target chatbot. HelloGBye is an AI-controlled chatbot that allows users to book flights and hotels by means of a conversational interface. We chose this application primarily because it has no human involvement in the booking process and performs adequately in terms of understanding users’ queries. For the mobile website, we picked Air Canada’s website, which offers the same options (booking flights and hotels). Participants interacted with both interfaces on an iPhone 7.

We designed a within-subject study wherein each participant completed the tasks using both interfaces. All participants began by working through the first task on both interfaces as a training task. They then executed the second and third tasks either on the mobile website or the travel bot. The order in which they used the interfaces was counterbalanced using a 2×2 Latin square across participants to mitigate learning effects.
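The counterbalancing scheme above can be sketched in a few lines of code. This is an illustrative sketch, not our study tooling; participant IDs and the helper name are ours:

```python
# Sketch of 2x2 Latin-square counterbalancing: alternating participants
# start with a different interface, so each interface appears first
# equally often across the group and order effects cancel out.
INTERFACES = ["chatbot", "mobile website"]

def interface_order(participant_id: int) -> list[str]:
    """Return the interface order for a participant (0-indexed)."""
    first = participant_id % 2  # which row of the 2x2 Latin square
    return [INTERFACES[first], INTERFACES[1 - first]]

for pid in range(4):
    print(pid, interface_order(pid))
```

With eight participants, this assigns four to each ordering, which is what makes the design balanced.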

Our participants

We recruited eight participants (2 female, 6 male) through ads on social media and email lists. Participants’ ages ranged from 20 to 37 (median: 31), and their occupations included Computer Science student, editor, arts and culture consultant, paramedic, electronics engineer, graphic designer and mechanical engineer. Five participants reported traveling six to ten times per year, while the other three reported traveling two to five times per year. Five participants had booked trips via handheld devices, and four had used conversational interfaces before (all with the intent of initiating technical assistance, i.e. on internet providers’ websites).

The usability study session setup. We recorded participants’ interactions with the interfaces via a Mr. Tappy camera.

Here’s what we found

The purpose of this study was not to crown a clear winner for travel booking scenarios. Rather, we tried to look at the pros and cons of chatbots, compare them with the mobile version of a website, and derive some design implications from our observations. Although there are existing design guidelines for conversational user interfaces (e.g., by IBM), we were interested in validating those guidelines (particularly within the context of travel chatbots) by running a usability study. Overall, we found that people generally do not trust chatbots, probably because of the poor experiences they have had with chatbots in the past. However, there is great potential for further development of chatbots. In the following sections, we review some of the key findings from our usability study. Later, we will introduce some design implications and food for thought in terms of further development in this area.

1. People do not trust chatbots… yet.

The highlight of our study was the conclusion that people generally do not trust bots. They expressed this either explicitly in their post-study comments or indirectly while doing the tasks. The lack of trust is likely due to users’ previous experiences with other chatbots. Some reported having frustrating experiences with chatbots in the past, while others did not expect the bot to understand their queries at all. This was also reflected in users’ moments of surprise when, in some instances, they could communicate with the bot effectively.

As an example, when participant 2 was asked about why he did not enter all the travel requirements into the conversational interface and instead chose to narrow down the results via the menus, he mentioned:

“I thought it would overload the chatbot, speaking from previous experience. I don’t trust chatbots. I don’t trust their command of language. I try to keep it as plain and straightforward as possible.”

In another instance, participant 4 wanted to book a business class flight for task 2. He was actually surprised that the bot understood his query:

“I want business class” [entering query]… “And I’m not expecting it to understand me though”… [bot understands the query and returns business class results]… “Now I’m impressed!”.

2. Users are upfront with bots.

All of the participants tried to enter all the required information, or all the information they expected the bot to understand, up front. As participant 4 pointed out:

“With interacting with the chatbots, I’m just gonna give them the information that I know they need to know right off the bat… Instead of doing this in multiple texts”.

Then they filtered the search results via the menus and options in the following screens. In some instances, though, they were forced/prompted to go back and enter new queries. If the follow-up queries worked, they were surprised and impressed since it was an unexpected result. This point is illustrated in the following vignettes:

“It’s nice that the bot has found an option to Vancouver and from Vancouver to Calgary. Not just my last request.” — Participant 3.
“I think it was good that you didn’t have to restart every time. I think I managed twice to add more details and it fixed it so that worked. I didn’t think that would work but it did.” — Participant 6.

3. Bots should respond in a more human-like and personalized tone.

The HelloGBye chatbot is quite limited in responding with a personalized tone, thus failing IBM’s “Personality” design practice. While other chatbots are able to personalize their responses, either by calling the users’ name or providing personal suggestions, the HelloGBye bot only responds to queries with “OK. I’ve found these options”. It also does not clearly indicate errors or the queries it has not understood. Instead, it only returns results based on the last query it has understood.

4. Bots are not faster than websites.

Participants entered an average of 3 queries to accomplish each task using the travel bot (sd. 2.03, med. 2). However, as mentioned earlier, they did not enter follow-up queries unless their initial query had failed, returned unexpected results, or they were prompted by the study coordinator to do so (i.e., they did not purposefully use queries to narrow down the search results).

On average, it took 499 seconds for the users to go through the first task (simple flight booking) via the travel bot and 198.875 seconds via the mobile website (60% faster).

Similarly, it took them 508.125 seconds on average to go through the second and third tasks (complicated flight and vacation bookings) via the travel bot, compared to 461.875 seconds via the mobile website, which was roughly 9% faster.
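The relative speed-ups follow directly from the mean completion times. A quick sanity check (the helper name is ours):

```python
# Relative time savings of the mobile website vs. the chatbot,
# computed from the mean completion times reported above.
def pct_faster(slow: float, fast: float) -> float:
    """Percentage by which `fast` undercuts `slow`."""
    return (slow - fast) / slow * 100

print(round(pct_faster(499, 198.875), 1))      # task 1: ~60% faster
print(round(pct_faster(508.125, 461.875), 1))  # tasks 2 & 3: ~9% faster
```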

Distribution of completion times across the two interfaces. It generally took the participants longer to complete the tasks via the chatbot interface.

Finally, participants rated the likelihood of using a travel bot in the future on a Likert scale of 1 (not likely) to 5 (very likely) at 3.75 on average (sd. 0.88, med. 4), which encourages further research and development in this area.

Looking to the future

Our study provides some of the first findings on the effectiveness of travel chatbots compared to mobile websites. As discussed earlier, there are certainly shortcomings with current travel bots. Due to the general lack of trust towards chatbots, even newer and more effectively executed travel bots will likely have a hard time attracting users. However, new advancements in Machine Learning and Natural Language Processing techniques can help shape a brighter future for chatbots.

To summarize our findings and guide the design of future chatbots, especially for travel booking scenarios, we outlined a set of design implications based on the observations from our study. These are in line with existing efforts by IBM in compiling a set of design guidelines for conversational user interfaces.

It is worth noting that we were interested in observing user interaction with chatbots (while letting the users know in advance that they were interacting with bots and not humans), the challenges users faced during the interaction, and the workarounds they came up with to finish tasks. Further studies may investigate the usability of other chatbots, compare them with human-powered conversational interfaces, or study them in other contexts.

Using bots only to enter the initial query: In line with our findings, it appears that users generally use chatbots to enter all the information about their query up front. As a result, a conversational interface (either in the form of text or speech) could substitute for or complement the numerous form elements that are normally used as data input mechanisms. The transition from form elements to conversational interfaces will surely take some time, but we’ve already started to see some development with personal assistants such as Google Home, Apple Siri, and Amazon Echo.

Conduct research on Machine Learning techniques and wait for the right time: While there is great potential for travel bots down the road, it looks like it is still not the time to launch a chatbot-only app. Users do not trust bots yet and there are still no well-defined, globally-accepted design guidelines for bots. This has resulted in each bot following its own path in designing interactions which can cause unexpected outcomes and frustration for users.

One thing that is clear, though, is that chatbots will become increasingly intelligent and better able to understand users’ queries. To get there, we suggest that designers, developers, and researchers put more emphasis on research in Machine Learning and Natural Language Processing techniques. As competition rises between well-designed chatbots, those without a sufficiently intelligent back-end will struggle to survive.

Personalize chatbots: One of the shortcomings of existing bots is that they often give every user the same generic responses. This can lead users to believe that the bot is not personally responding to them, undermining its credibility. To overcome this issue, we suggest that chatbots be designed to:

  • Ask about user preferences beforehand, or “learn” them (e.g., window or aisle seat preferences, dietary restrictions, etc.) via Machine Learning algorithms, and apply this knowledge in future responses.
  • Respond in a more personalized, human-like tone instead of with generic replies.

Download the complete 16-Page Research Report

Versett is a product design and engineering studio in New York and Calgary, Canada. If you like this post, you’d love working with us. Say hi at