Bedtime stories, AI chatbots and the future of UX writing

Why the story of AI has a happy ending

Chris Cameron
Booking.com — UX Writing

--

It’s the end of a long day. I’m tired. And the only thing standing between me and some much-needed relaxation is a grumpy toddler who won’t go to bed without a bedtime story. But not just any story…

“How about this one?” I ask, holding up one of the dozens of books from his overflowing bookshelf.

“No…” he replies.

“This one?”

“NOOO!”

I get it. He’s tired of the same old books. So I have an idea. I pull out my phone and open a fancy new app that everyone’s been talking about.

“What’s that?” he asks.

“This is my Story Robot!” I tell him. “Tell me what story you want to hear, and the Story Robot will write it.”

And just like that, my son and I had started an adventure, happily creating unique, personalised bedtime stories every night with ChatGPT. But it wasn’t smooth sailing right away.

He wanted to hear about his favourite heroes doing everyday things. Think ‘Spiderman goes to the zoo.’ So I asked ChatGPT:

Write a story about Spiderman at the zoo.

But I quickly realised these weren’t ideal instructions for an AI chatbot. It successfully generated a story about Spiderman at the zoo, but it was too long (even I was yawning), had words my son didn’t know and was a bit too scary for a 3-year-old.

So I tried again.

Write a short bedtime story for a 3-year-old about Spiderman at the zoo.

And this time, ChatGPT pretty much nailed it. I just needed to make one final improvement to really seal the deal.

Write a short bedtime story for a 3-year-old about Spiderman at the zoo,
and end it with Spiderman going home and going to sleep.

Perfection. Sweet dreams, kiddo.

From storytime to prime time

Not long after I (and the rest of the world) discovered the power of ChatGPT, back in my work life, I was invited to join a task force to build Booking.com’s first customer-facing GenAI product.

The idea was to add a ChatGPT-style GenAI chatbot into our mobile app to help travellers discover and plan their next trip — a mashup of AI’s conversational capabilities and Booking.com’s travel expertise.

I had a history with chatbots at Booking.com, and I literally couldn’t stop talking about GenAI to anyone who would listen, so I was thrilled to work on a project like this. As things got started, I eagerly collaborated with several colleagues from design, research, product, engineering, marketing and legal.

The project swiftly came to life thanks to our scalable design system and our existing machine learning models, and in June 2023, we proudly launched the AI Trip Planner in the Booking.com app.

The AI Trip Planner allows Booking.com app users to discover and plan their next trip in a chat interface.

Throughout the fast-paced development of the AI Trip Planner, I realised that working on a GenAI product as a UX writer felt familiar in some ways. But in many others, it was a complete departure from our usual way of working.

While AI can cause concern for those worried that robots are coming for their jobs, I emerged from this experience more excited about the future of the UX writing role than ever before. Here’s why.

Familiar territory

For writers at Booking.com, working on a chatbot is not entirely unfamiliar. We’ve built multiple chat interfaces in the past and worked closely with our Machine Learning teams, including on projects with Natural Language Processing.

For the AI Trip Planner, our UX writing process retained much of its core. Here’s what was familiar to us.

  • Working closely with design: As we do almost every day, we worked closely alongside UX designers to contribute to creating user flows, wireframes, prototypes, and final designs for the various needs of the AI Trip Planner.
  • Creating UI text: We also performed the standard writing portion of our role by crafting UI copy for buttons, entry-point banners, legal disclaimers, error messages and more. And importantly, as usual, we could still structure and organise it for a smooth handoff to our engineers.
  • Scripting conversation templates: Our history with conversational interfaces came in handy and informed our decisions when crafting a handful of scripted or “template” messages for the AI Trip Planner. Ultimately, not everything it says is generated by AI, especially in sensitive or risky scenarios (more on that later).
  • Diving into user research: We, of course, got hands-on with user research when testing began. We helped craft research questions and reviewed interview recordings to extract takeaways and improvement opportunities, which are always valuable in the early stages.
  • Collaboration with marketing: We also worked with colleagues in marketing to align and contribute to essential steps in the product launch, including product naming, key messaging and value proposition, internal communications and more.

Examples of UX writing artefacts from the early development stages of the AI Trip Planner.

This is a good example of our normal UX writing process. In most product work, this is where the journey circles back to the beginning, and we do it all over again with new learnings and solutions.

Even without all of the AI-related newness to come, it was reassuring to affirm at this point in the development process that AI products will still need all of the skills and expertise a UX writer brings to the table.

But, of course, that’s not where this story ends.

New frontiers

Many things about UX writing for the AI Trip Planner forced us to step out of our comfort zone, and to adapt and learn. Due to the nature of AI-generated content, we needed to change how we organised and structured our content and how closely we collaborated with engineers to refine the user experience. Let’s talk content first.

While we have experience with chatbots and conversation design, we were accustomed to pre-scripted decision trees with clear if-this-then-that rules. In those cases, we could carefully craft the exact words to account for any scenario — but that’s not possible when an AI language model generates a good chunk of the content.

We designed a new kind of conversation flow for the AI Trip Planner — one that didn’t provide exact messages for a strict flow, but instead one that outlined ways to handle different types of content in several different user scenarios.

A simplified flowchart of conversation outcomes with different content solutions for the AI Trip Planner — blue for human-written and green for AI-written.

In this flowchart, the blue bubbles are outcomes where we provide a pre-scripted template message that we can fully control. The green bubbles are where we have less control as AI generates the response.

While there appears to be more blue than green, these blue outcomes represent unsupported interactions, edge cases or fail states that AI can’t handle or, in some cases, that we do not want AI to answer.

For example, suppose a traveller asks the AI Trip Planner about a destination’s safety or visa requirements. In cases like this, we have to ensure that the traveller receives, or is directed towards, the correct information. Given the limitations of GenAI in its current state, we can’t rely on it to provide the most up-to-date information in this kind of sensitive scenario. So instead of letting AI generate an answer, we provide a templated response advising them to check with local authorities.

Side note… the topic of security, safety and trust in GenAI products is worthy of a separate blog post of its own!

The green bubbles are the ‘happy path’ where AI can provide the best answer. This flowchart is actually a great illustration of how much more care and attention goes into the scenarios off the happy path, down in the rocky crevasses of fail states and edge cases.

By designing this flow, we could visualise the guardrails for the conversation we needed to build to keep our users headed in the right direction or to provide them with a safe landing if they diverged.
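
To make the blue-versus-green split concrete, here is a minimal Python sketch of that kind of routing. It is an illustration only, not the AI Trip Planner’s actual code: the keyword lists, the template wording and the generate_with_ai stand-in are all hypothetical, and a real product would probably detect sensitive topics with something more robust than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keywords per sensitive topic (assumption, for illustration).
SENSITIVE_KEYWORDS = {
    "safety": ("safe", "dangerous", "crime"),
    "visa": ("visa", "passport", "entry requirement"),
}

# Hypothetical human-written templates (the "blue" outcomes).
TEMPLATES = {
    "safety": "For the latest safety information, please check with local authorities.",
    "visa": "For visa requirements, please check with local authorities.",
}


@dataclass
class Reply:
    text: str
    source: str  # "template" (human-written) or "ai" (AI-written)


def generate_with_ai(message: str) -> str:
    """Stand-in for the language model call; returns a placeholder string."""
    return f"[AI-generated answer to: {message}]"


def route(message: str) -> Reply:
    """Return a scripted template for sensitive topics, otherwise defer to AI."""
    lowered = message.lower()
    for topic, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return Reply(TEMPLATES[topic], "template")
    return Reply(generate_with_ai(message), "ai")
```

In this sketch, a question such as “Do I need a visa for Japan?” gets the scripted visa message, while “Where should I go in June?” falls through to the AI-generated path.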

An illustration of human-written (yellow, blue) and AI-written content (green) in an example AI Trip Planner conversation.

The image above shows how this distinction between human-written and AI-written content plays out in practice in the AI Trip Planner in the Booking.com app.

  • 🧑🏻 🟡 Yellow: These are your everyday bits of static UI text that we fully control. We call them ‘copy tags’ — also known as ‘strings’ or ‘keys’.
  • 🧑🏻 🔵 Blue: These messages, like the initial welcome message and the customer service response, are pre-scripted responses we have written for the AI Trip Planner.
  • 🤖 🟢 Green: For these messages, the AI Trip Planner takes the wheel (with our guidance).

How do we provide this guidance?

In some cases, we provide examples of how we want the AI Trip Planner to phrase a question or response. In others, we allow it more freedom to generate a response based on some basic instructions.

For example, the AI Trip Planner needs to ask a few questions to best recommend a destination to explore or a place for the traveller to stay. In the prompt code that tells it what to do, we provide a list of example questions that looks something like this:

Question list

- Destination questions:
  - Which part of the world are you planning to explore?
  - Do you know which region or country you’re planning to visit?

- Date questions:
  - When do you plan to travel?
  - For the best recommendations, can you tell me when you plan to take
    this trip?

Then, the instructions tell the AI Trip Planner to ask a question like one from this list while wording it in its own way. This keeps the messages fresh while giving it clear boundaries.
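
For readers curious how a list like this might end up inside a prompt, here is a small Python sketch. It assumes the questions are kept as structured data and rendered into plain text before being sent to the model. The QUESTION_LIST variable and the render_question_list function are hypothetical names for illustration, not part of the real product.

```python
# Example questions grouped by topic, taken from the excerpt above.
QUESTION_LIST = {
    "Destination": [
        "Which part of the world are you planning to explore?",
        "Do you know which region or country you're planning to visit?",
    ],
    "Date": [
        "When do you plan to travel?",
        "For the best recommendations, can you tell me when you plan to take this trip?",
    ],
}


def render_question_list(questions: dict[str, list[str]]) -> str:
    """Render the topics and their example questions as a plain-text prompt section."""
    lines = ["Question list", ""]
    for topic, examples in questions.items():
        lines.append(f"- {topic} questions:")
        lines.extend(f"  - {question}" for question in examples)
        lines.append("")  # blank line between topics
    return "\n".join(lines).rstrip()


print(render_question_list(QUESTION_LIST))
```

Keeping the examples as data means a writer can edit the wording in one place without touching the surrounding instructions.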

How do we ensure the AI Trip Planner speaks like a Booking.com writer?

With AI generating so much content, it was important for writers to have input on the prompts that controlled the output. Along with the example questions mentioned above, the prompt code also includes several general instructions that the AI Trip Planner can apply to every scenario.

In these instructions (similar to the excerpt you see below), we give it an identity, tell it how to refer to itself, and pull in pieces of our writing guidelines to help it understand how we write.

- You are the Booking.com AI Trip Planner. You can help recommend
accommodations, destinations and attractions.

- Refer to yourself as the Booking.com AI Trip Planner and use the first
person “I” instead of “we”.

- Be helpful, friendly, conversational, relevant, natural, simple,
familiar, clear, inclusive, succinct, positive and optimistic.

- Find a balance between being too formal and too informal.

What’s great about the prompts behind a GenAI chatbot is that the ‘code’ is simply plain text English that is naturally understood. So, as writers, even if we lack programming experience, we already have the skills needed to help make improvements at this fundamental level.

Let’s talk more about that!

Adventures in prompt e̶n̶g̶i̶n̶e̶e̶r̶i̶n̶g̶ writing

Prompts are the lifeblood of conversational GenAI experiences. What was once the exclusive realm of complex coding languages is now controlled mainly by natural written language.

UX writers are already experts at expressing complicated concepts in a way that is concise and easily understood, and we know the impact the right words can have. This makes us the perfect engineering counterpart for refining the user experience of GenAI products.

At Booking.com, UX writers already work with engineers on technical writing topics, like ensuring our copy is localised for a global audience and that any dynamic content is rendered correctly. But with the AI Trip Planner, we realised pretty early on that our traditional methods for delivery and communication with engineers weren’t going to produce the desired results.

Considering that the prompts were where the magic happened, it was only natural we jumped in and got our hands dirty side-by-side with our engineers. As a result, we witnessed the impact of our writing in real-time. Here’s how.

How to fix a faulty robot with words

As I mentioned earlier, the AI Trip Planner needs to ask a few questions to best recommend either a destination to explore or a place for the traveller to stay. But one issue we saw early on was that it would repeatedly ask some of the same questions even if the user had already provided that information.

To solve this, we sat with the engineer and looked at the part of the prompts with the instructions for asking questions, and we found this:

- For the selected topic, ask ONE question (in a friendly way) from the
above Question List that was never asked.
- Refrain from summarize the conversation. Just ask ONE question.

At first glance (grammar issues aside), it’s not that bad. However, a lot of information is compressed into the instructions: the number of questions to ask, how to ask the question, where to get the question from, conditions for which question to choose, etc.

It may be possible for a human to decipher, but our testing showed that the AI Trip Planner couldn’t consistently follow all the instructions. By applying our existing writing skills to the prompts, we improved this set of instructions with more straightforward language and a more logical structure.

- For the selected topic, ask ONE question that has not been asked.
- Use the Question List as examples of the type of question to ask.
- Avoid using the examples word-for-word. Rephrase them into the
conversation in a natural and friendly way.

In this case, by simply rewriting a few lines of the prompts, we immediately saw that the AI Trip Planner could follow the instructions more consistently. It’s not as easy as providing instructions that we could follow — we had to learn how to speak to it so that it could also understand.
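
If you want a number to go with “more consistently”, one option is a simple check over a batch of sample replies. The Python sketch below is a hypothetical helper, not the testing we actually ran: it assumes that a run of question marks marks one question, and reports the share of replies that ask exactly one.

```python
import re


def count_questions(reply: str) -> int:
    """Count question-mark runs, treating '??' as a single question."""
    return len(re.findall(r"\?+", reply))


def follow_rate(replies: list[str]) -> float:
    """Share of replies that ask exactly one question (0.0 for an empty batch)."""
    if not replies:
        return 0.0
    followed = sum(1 for reply in replies if count_questions(reply) == 1)
    return followed / len(replies)


sample = [
    "When do you plan to travel?",  # one question: follows the instruction
    "Where to? And when?",          # two questions: does not
    "Great choice!",                # no question: does not
]
print(f"{follow_rate(sample):.0%} of replies asked exactly one question")
```

Running the same check before and after a prompt rewrite gives a rough, repeatable way to compare the two versions.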

And that’s something UX writers excel at.

Through this process, we naturally learned a few tricks that helped us get the AI Trip Planner to do what we asked. Here are some ways our writing skills helped improve the prompts to create a better user experience.

Do unto AI as you would have it do unto you

Much like the Golden Rule, we learned that it’s important to provide instructions that match how you want the AI to speak. This is especially relevant when giving it examples to learn from.

Some of the first instructions written by our engineers used grammar, spelling, style and terminology that were inconsistent with how we wanted the AI Trip Planner to speak. Take, for instance, these example questions:

- when do you plan to travel?
- Do you travel alone or in a group or family or alone?
- What’s your hotel vibe?

Take your pick of the issues here: missing capitalisation, a repeated ‘or’ and ‘alone’, or just not really matching the tone we wanted. We were effectively telling it to do one thing while doing another ourselves.

We found this could confuse the AI Trip Planner and create inconsistencies in its output, so we gave it a quick edit.

- Do you know when you want to travel?
- Will you be travelling with your partner, family members, a group of
friends or will you be going alone?
- What kind of hotel are you interested in?

By rephrasing instructions and examples to fall in line with our desired style and tone, we eliminated the inconsistencies and found the output to be far better than before.
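
Some of these slips are mechanical enough to catch automatically. As a rough Python sketch (a hypothetical lint_example helper, with rules assumed for illustration), a check like this flags missing capitalisation, a missing question mark and a repeated ‘or’:

```python
import re


def lint_example(question: str) -> list[str]:
    """Return a list of mechanical style problems found in an example question."""
    problems = []
    if not question[:1].isupper():
        problems.append("does not start with a capital letter")
    if not question.rstrip().endswith("?"):
        problems.append("does not end with a question mark")
    words = re.findall(r"[a-z']+", question.lower())
    if words.count("or") > 1:
        problems.append("repeats 'or'")
    return problems


for example in [
    "when do you plan to travel?",
    "Do you travel alone or in a group or family or alone?",
    "Do you know when you want to travel?",
]:
    print(example, "->", lint_example(example) or "no mechanical issues")
```

A check like this only covers the mechanical side. Whether “What’s your hotel vibe?” matches the tone you want is still a judgement call for a writer.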

Accentuate the positive

Look… if you’ve come this far, you’ve probably already figured out that AI is a bit finicky in its current state.

While we’ve managed to keep the AI Trip Planner from having any wild hallucinations, we did find that sometimes it would just straight up do the opposite of what we said. For example:

- Do NOT refer to yourself or to Booking.com in the third person, e.g. "it"
or "they".
- Do NOT ask more than one question at a time.

As any parent of a young child can attest, telling someone what to do works better than telling them what not to do. Instead of telling the AI Trip Planner what it can’t or shouldn’t do, we found it behaved much more consistently when we gave it positive instructions for what it can or should do.

- Refer to yourself as the Booking.com AI Trip Planner and use the first
person "I" instead of "we".
- Ask only one question at a time.
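
A quick way to find candidates for this kind of rewrite is to scan a prompt for negative phrasing. The Python sketch below is one hypothetical approach: the list of phrases is an assumption and far from complete, and the rewriting itself is still a job for a person.

```python
import re

# Assumed set of negative phrasings to look for (not exhaustive).
NEGATIVE = re.compile(r"\b(do not|don't|never|must not)\b", re.IGNORECASE)


def find_negative_instructions(prompt: str) -> list[str]:
    """Return the prompt lines that contain a negatively phrased instruction."""
    return [line for line in prompt.splitlines() if NEGATIVE.search(line)]


prompt = (
    "- Do NOT ask more than one question at a time.\n"
    "- Ask only one question at a time."
)
for line in find_negative_instructions(prompt):
    print("Consider rephrasing:", line)
```

In this example, only the first line is flagged; the positive version passes untouched.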

Dealing with ambiguity

At Booking.com, we often use the word ‘property’ to refer to a place to stay, like a hotel, apartment, resort, etc. But this word has multiple meanings, like ‘a person’s belongings’ or ‘the attributes or characteristics of a thing’.

We found this ambiguity was, at times, causing the AI Trip Planner to lose the correct context for a conversation. By replacing ‘property’ with a more specific term like ‘accommodation’ in the prompts, it was much less likely to lose context, and it continued the conversation as expected.
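
The same idea can be turned into a small glossary check. This Python sketch assumes a hypothetical AMBIGUOUS_TERMS mapping from an ambiguous word to a more specific suggestion. It only flags occurrences and leaves the actual rewording to a writer, since a blind find-and-replace would produce slips like ‘a accommodation’.

```python
import re

# Hypothetical glossary: ambiguous term -> more specific suggestion.
AMBIGUOUS_TERMS = {
    "property": "accommodation",
    "properties": "accommodations",
}


def flag_ambiguous_terms(prompt: str) -> list[tuple[int, str, str]]:
    """Return (line number, word found, suggested replacement) for each hit."""
    findings = []
    for number, line in enumerate(prompt.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            suggestion = AMBIGUOUS_TERMS.get(word.lower())
            if suggestion:
                findings.append((number, word, suggestion))
    return findings


prompt = "Recommend a property.\nAsk about dates.\nList three Properties."
for number, word, suggestion in flag_ambiguous_terms(prompt):
    print(f"Line {number}: consider '{suggestion}' instead of '{word}'")
```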

An exciting future for AI & UX writing

I hope it’s now clear why I came away from my experience of working on a GenAI product with great excitement for the future of the UX writing craft. But here it is in bullet form, just in case.

  • The established responsibilities and relationships of the UX writing role within product development have placed us in a prime position to continue to have a meaningful impact on GenAI products.
  • Our existing writing skills are highly transferable and crucial to producing high-quality GenAI experiences. The nature of prompts and our writing skills make us a natural fit for closer collaboration with engineers.
  • Bringing writers side-by-side with engineers to work on prompts is by far the most impactful way to instantly improve the user experience of GenAI products.

In GenAI product development, UX writers are indispensable, so the demand for skilled writers will only rise as more companies and teams begin building GenAI products.

No, AI won’t replace UX writers. It will empower us to do even more.

To hear more of my thoughts on the intersection of AI and UX writing, check out my appearance on the Content+AI Podcast with Larry Swanson.

Special thanks to Kelly Chambers, Hannah Weissenbuehler, Selena Wang, Erin Donohue and Graham Cookson for their contributions to this article.

The featured image above was of course generated with AI using DALL-E with the prompt ‘trendy-looking illustration of a father putting his toddler son to sleep in his bedroom, reading a bedtime story from his phone’.
