Helping people understand the positives and negatives of a property by summarising reviews with the help of machine learning
My name is Serena and I’ve been working in the UX writing field for over 4 years now. I only recently started collaborating with a Senior Data Scientist, Qiming. Being part of a team that builds data-driven products allows me to deepen my knowledge of machine learning and better understand how data can inform and influence the copy I write.
Our team is called UGC (User Generated Content) and we take care of all types of content created by people and shared on our website. UGC is pretty popular nowadays, thanks to the influence of social media in our lives. For us, at Booking.com, UGC primarily includes ratings and reviews, but we also have guest photos, for example.
UGC is a modern version of the good old word-of-mouth; people trust other people’s opinions more than any other type of advertising. So why is that? Because they perceive reviews and user-generated content as more informative, genuine, authentic and less biased. It’s a form of social proof. We rarely book anything without first reading what previous verified guests had to say.
People want to be sure they are making the right choice when choosing a place to stay. They will read reviews and search for the topics they care about, not just the standard facts such as where the property is located and what it has to offer. Most people want to dig deeper: What was served for breakfast? Was the room quiet? Were the staff friendly?
We also know that people want the information to be quick to read and easy to understand. Not everyone is willing to skim and filter through hundreds of reviews to find the information they need. Often you have to go through loads of irrelevant content too. It’s difficult to understand at a glance the positive and negative aspects of a property. There might be a very high overall rating, but certain topics mentioned in reviews (like cleanliness or breakfast) may be negative. People need to read through many reviews to form their own opinion about the property.
We decided to create a review summary, with an overview of the most discussed topics about the property. This will be shown first to the reader on the property page, then if they want to know more they can move on to the dedicated section and read all the reviews.
To create a clever and scalable review summary, we used text classification, a fundamental task in natural language processing and machine learning. We annotated real guest reviews for different categories (location, breakfast, etc.) and their most commonly discussed topics. Then we trained a multi-class text classifier based on thousands of annotated reviews for each category. After that, we ran sentiment analysis to determine whether the topic discussed was positive, negative or neutral. Sentiment analysis is often used by businesses to monitor brand and product opinions, for example by analysing reviews, survey responses and social media conversations.
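To make the two steps above more concrete, here is a minimal sketch in Python. It is not our production model: it swaps in a tiny Naive Bayes classifier written with the standard library only, the six training reviews are invented for illustration, and the names `tokenize`, `NaiveBayes`, `topic_model` and `sentiment_model` are hypothetical. It also leaves out the neutral class to keep the example short.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()            # number of reviews per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, hand-annotated toy reviews: (text, topic category, sentiment).
ANNOTATED = [
    ("The breakfast buffet had fresh fruit and good coffee", "breakfast", "positive"),
    ("Breakfast was cold and the coffee was weak", "breakfast", "negative"),
    ("Great location close to the metro and the old town", "location", "positive"),
    ("The hotel is far from the centre and hard to reach", "location", "negative"),
    ("The room was quiet and the bed was comfortable", "comfort", "positive"),
    ("Noisy room and the shower was lukewarm", "comfort", "negative"),
]

texts = [text for text, _, _ in ANNOTATED]
topic_model = NaiveBayes().fit(texts, [topic for _, topic, _ in ANNOTATED])
sentiment_model = NaiveBayes().fit(texts, [s for _, _, s in ANNOTATED])

review = "The coffee at breakfast was fresh"
print(topic_model.predict(review), sentiment_model.predict(review))
```

Training one model for the topic and a separate one for the sentiment mirrors the order described above: first decide what a review is about, then decide how the guest felt about it.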
My role as a UX writer was to write sentences for each topic, in both positive and negative sentiment, starting from the raw data we had analysed. What I had available was a list of categories (for example comfort) and related topics (atmosphere, shower, temperature).
The aim was to write phrases that are useful and insightful but at the same time not repetitive and broad enough to include all the feedback written by guests for that particular topic. One mistake I made at the beginning was adding intensifiers such as “very” and “really” to some of my sentences. These are problematic because they strengthen the meaning and show emphasis, while my goal was to give all sentences an equally positive or negative sentiment.
Guests said rooms are v̶e̶r̶y comfortable / Some guests said the location was r̶e̶a̶l̶l̶y far from the centre
I read hundreds of real reviews to find the right words and a natural, fluent English tone. I also created a clear brief with all the reference material, so that the best terminology can be used for every other language and culture.
Now that the machine learning model and the copy are created, we can finally match them! For each property with enough reviews, we can now identify the most relevant categories and the most discussed topics, and generate an automated summary with relevant information for all our guests. The end result will look more or less like this:
This overview allows us to understand at a glance the positive and negative aspects of a property and decide if we want to proceed and read reviews or book directly. Problem solved! This new feature was successful: we noticed that people are now booking higher-rated properties, which suggests they are making more informed decisions.
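The matching step described above, where model output meets pre-written copy, could be sketched roughly like this. Everything here is an assumption made for illustration: the `MIN_REVIEWS` threshold, the `COPY` lookup table, the `summarise` function and the rule of picking the dominant sentiment per topic are hypothetical, not the logic we run in production. The two sentences for comfort and location come from the examples earlier in this post; the other two are invented.

```python
from collections import Counter

# Hypothetical threshold: properties with fewer reviews get no summary.
MIN_REVIEWS = 3

# Pre-written copy, keyed by (topic, sentiment).
COPY = {
    ("comfort", "positive"): "Guests said rooms are comfortable",
    ("location", "negative"): "Some guests said the location was far from the centre",
    ("breakfast", "positive"): "Guests said the breakfast was good",              # invented
    ("breakfast", "negative"): "Some guests said the breakfast could be better",  # invented
}


def summarise(predictions, review_count, top_n=2):
    """Turn (topic, sentiment) predictions into a list of summary sentences.

    predictions: one (topic, sentiment) pair per classified review snippet.
    review_count: number of reviews the property has.
    top_n: how many of the most discussed topics to include.
    """
    if review_count < MIN_REVIEWS:
        return []

    topic_counts = Counter(topic for topic, _ in predictions)
    lines = []
    for topic, _ in topic_counts.most_common(top_n):
        # Pick the sentiment guests expressed most often for this topic.
        sentiments = Counter(s for t, s in predictions if t == topic)
        dominant = sentiments.most_common(1)[0][0]
        sentence = COPY.get((topic, dominant))
        if sentence:  # skip combinations without written copy (e.g. neutral)
            lines.append(sentence)
    return lines


predictions = [
    ("comfort", "positive"),
    ("comfort", "positive"),
    ("comfort", "negative"),
    ("location", "negative"),
    ("location", "negative"),
    ("breakfast", "positive"),
]

for line in summarise(predictions, review_count=4):
    print(line)
```

Keeping the copy in a lookup table, separate from the model, means each sentence can be written, reviewed and localised by a person, while the model only decides which sentences apply to a given property.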
As you can see, this small part of a product seems easy and intuitive. However, a lot of research, work and collaboration happens in the background. Machine learning is a powerful tool, able to add value to the users’ experience. But we need to be careful: even the best model can fail if the upfront process isn’t right.
Read also: Responsible UX writing for machine learning.
Adding machine learning to my UX writing process enables me to give users relevant information through a personalised content experience. The big data industry is rapidly evolving and I believe that the role of the UX writer will be increasingly connected to data and technology in the future. Do you agree?