Who are we?
Hello! We are four engineering interns who joined Airbnb during the summer of 2015 to work on the Search Experience team. The Search team is responsible for everything related to finding and booking a home on Airbnb. Over the course of the summer we have worked on a variety of projects for the team, but our final project focused on improving how guests digest and interact with reviews. It has been an incredible opportunity: we’ve helped work on the designs, defined OKRs, and been given significant implementation tasks.
The four of us are: Maya Ebsworth (UPenn, B.S. Computer Science 2016), Keziah Plattner (Stanford, B.S./M.S. Computer Science 2016), Iain Nash (USC, B.S. Computer Science 2016), and Nicholas Moschopoulos (UC Berkeley, B.S. Computer Science 2016).
Why are we working on Reviews?
Reviews set the foundation of trust on Airbnb. Our data scientists have done research suggesting that reviews are among the most important features guests look at when deciding whether or not to stay at a certain home. Guests rely on reviews to make decisions and set expectations, but for homes with many reviews it becomes unmanageable to read through all those reviews and find the relevant information (our most popular home currently has 700 reviews). Because of this, reviews are not as helpful to guests as they could be. We believed that making reviews more accessible would help guests decide which home to book.
We identified that there were several ways to make reviews more useful:
- Allowing guests to search over reviews for a home
- Generating highlights of the best reviews for each home
- Allowing guests to vote on how helpful reviews are
1. Searching over Reviews
Iain Nash and Keziah Plattner
For listings with a large number of reviews, guests need to manually sift through them page by page in order to find what they are looking for. For example, a traveler going to San Francisco may want to see whether it was easy to find parking near the home by reading reviews that mention “parking”. This becomes more and more difficult as the number of reviews increases exponentially. Our solution was to allow guests to search through all reviews for a given home.
We use Elasticsearch for the review search backend, returning relevant review ids to our main Rails application. The Rails application fetches the full review data and associated objects required to render the reviews. We implemented a backend service to index and return results from Elasticsearch. The initial indexing step is done by loading the reviews database export from S3 and running a batch import to populate the Elasticsearch database.
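As a rough illustration of that split (search returns ids, the application loads the full reviews), here is a minimal sketch in Python. The field names `listing_id` and `comments`, the choice to use the review id as the Elasticsearch document `_id`, and the `reviews_by_id` dictionary standing in for the Rails data layer are all assumptions made for the example, not details of the actual service. The request body uses the standard Elasticsearch query DSL (`bool`, `match`, `term`).

```python
from typing import Any, Dict, List


def build_review_search_body(listing_id: int, text: str, size: int = 10) -> Dict[str, Any]:
    """Build a query DSL body: full-text match restricted to one listing.

    Field names are hypothetical; a real index mapping may differ.
    """
    return {
        "size": size,
        "_source": False,  # only ids are needed; the app loads full reviews itself
        "query": {
            "bool": {
                "must": [{"match": {"comments": text}}],
                "filter": [{"term": {"listing_id": listing_id}}],
            }
        },
    }


def extract_review_ids(response: Dict[str, Any]) -> List[int]:
    """Pull ids out of a search response, preserving relevance order.

    Assumes each review was indexed with its review id as the document _id.
    """
    return [int(hit["_id"]) for hit in response["hits"]["hits"]]


def hydrate(review_ids: List[int], reviews_by_id: Dict[int, str]) -> List[str]:
    """Stand-in for the application step that loads full review data by id."""
    return [reviews_by_id[rid] for rid in review_ids if rid in reviews_by_id]


if __name__ == "__main__":
    body = build_review_search_body(listing_id=123, text="parking")
    # A hand-written response in the shape Elasticsearch returns for a search.
    fake_response = {"hits": {"hits": [{"_id": "42"}, {"_id": "7"}]}}
    ids = extract_review_ids(fake_response)
    print(hydrate(ids, {7: "Street parking was tricky.", 42: "Easy parking nearby."}))
```

Keeping the search response down to ids means the index only needs the searchable text, while the application stays the single source of truth for everything required to render a review.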
In order to support updates, we needed the service to receive a notification every time a review is inserted, deleted, or updated. Elasticsearch is designed for near real-time search, so in order to make a review visible in the search service, all we have to do is add it to the index whenever a database update is made available. We used two internal services to support real-time updates: a pub-sub service for statically-typed event messages and a service that produces these events to be consumed. We set up the pipeline for the reviews search service to receive these updates, and updated Elasticsearch as needed.
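To make the update flow concrete, here is a small sketch of a consumer that applies insert, update, and delete events to a search index. A toy in-memory inverted index stands in for Elasticsearch, and the `ReviewEvent` type and its fields are hypothetical; the internal pub-sub and event-producing services are not modeled here.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ReviewEvent:
    """Hypothetical typed event describing a change to one review."""
    kind: str                   # "insert", "update", or "delete"
    review_id: int
    text: Optional[str] = None  # not needed for deletes


class ReviewIndex:
    """Toy inverted index (word -> review ids), standing in for Elasticsearch."""

    def __init__(self) -> None:
        self._docs: Dict[int, str] = {}
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    @staticmethod
    def _tokens(text: str) -> Set[str]:
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def _remove(self, review_id: int) -> None:
        old_text = self._docs.pop(review_id, None)
        if old_text is None:
            return
        for token in self._tokens(old_text):
            self._postings[token].discard(review_id)

    def apply(self, event: ReviewEvent) -> None:
        """Apply one event. Updates are handled as remove-then-add."""
        if event.kind not in ("insert", "update", "delete"):
            raise ValueError(f"unknown event kind: {event.kind}")
        self._remove(event.review_id)
        if event.kind != "delete":
            text = event.text or ""
            self._docs[event.review_id] = text
            for token in self._tokens(text):
                self._postings[token].add(event.review_id)

    def search(self, word: str) -> List[int]:
        """Return ids of reviews containing the word, in ascending id order."""
        return sorted(self._postings.get(word.lower(), set()))


if __name__ == "__main__":
    index = ReviewIndex()
    index.apply(ReviewEvent("insert", 1, "Easy parking nearby"))
    index.apply(ReviewEvent("update", 1, "Lovely garden"))
    print(index.search("parking"), index.search("garden"))
```

Treating an update as a removal followed by an insert keeps the handler idempotent: replaying the same event leaves the index in the same state, which is convenient if the event stream redelivers messages.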
2. Review Highlights
The aim of this experiment was to allow guests to see a short snippet of helpful reviews for a given home. In order to simplify the project, we decided to create a preliminary filter to identify sentences that were of high enough quality to display to guests. The initial approach was to filter using keywords that frequently appear in guest reviews. The set of keywords can easily be updated in the future.
We then applied a sentence scoring function on the filtered sentences to rank results. The sentence scoring function is where a lot of the magic is, and where a lot of the experimentation took place. We tried a variety of techniques including a sum of TF-IDF weights on the words in the sentence, a count of unigrams in the top 500 unigrams over the whole review corpus, and even just looking at the sentence length. We’ll be experimenting with launching variations of review highlights as the Search team improves clarity on the listing detail page.
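One of the scoring variants mentioned above, a sum of TF-IDF weights over the words in a sentence, can be sketched roughly as follows. This is an illustrative approximation rather than the scoring function we shipped: the tokenizer, the sentence splitter, the choice to treat each review as one document for the IDF statistics, and the sample keywords are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(review: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def idf_weights(reviews: List[str]) -> Dict[str, float]:
    """Inverse document frequency, treating each review as one document."""
    doc_freq: Counter = Counter()
    for review in reviews:
        doc_freq.update(set(tokenize(review)))
    total = len(reviews)
    return {word: math.log(total / count) for word, count in doc_freq.items()}


def top_highlights(reviews: List[str], keywords: Set[str], k: int = 3) -> List[Tuple[float, str]]:
    """Keep sentences containing a keyword, then rank by summed TF-IDF weight."""
    idf = idf_weights(reviews)
    scored: List[Tuple[float, str]] = []
    for review in reviews:
        for sentence in split_sentences(review):
            words = tokenize(sentence)
            if not keywords.intersection(words):
                continue  # preliminary keyword filter
            term_freq = Counter(words)
            score = sum(count * idf.get(word, 0.0) for word, count in term_freq.items())
            scored.append((score, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    sample_reviews = [
        "Parking was easy. The host was nice.",
        "The host was nice. Great location near the park.",
        "The host was nice.",
    ]
    for score, sentence in top_highlights(sample_reviews, {"parking", "location"}):
        print(round(score, 3), sentence)
```

Because the score is a plain sum, longer sentences tend to rank higher, and words that appear in every review contribute nothing. Normalizing by sentence length is one possible adjustment if that bias turns out to be unwanted.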
It’s been an awesome experience working on this project. I was given a lot of freedom to decide what to try out, and how to determine what works best. It also gave me, as an intern, exposure to and responsibility for a project end-to-end, starting with defining the scope and working through the architecture. Throughout the summer I was given a lot of help and mentorship, but it definitely let me see what it would be like to work as a full-time employee. Thanks Lu!
3. “Was this Helpful?” Voting
Maya Ebsworth and Keziah Plattner
For our third experiment, we wanted to see if we could help guests identify helpful reviews and improve the sort order of reviews. We thought adding the ability to vote for reviews would allow guests to flag reviews that they found helpful, and displaying the number of votes would allow other users to see what has helped other guests in the past. Once enough votes are collected, this would also allow us to sort reviews based on their helpfulness value.
The backend for this feature is implemented with a general backend service framework developed in-house at Airbnb. This framework abstracts away boilerplate logic and encapsulates service logic in lightweight, reusable components called operators. The operators we implemented connect with the MySQL database to insert, delete, and update votes on a review. In addition, the counts and user ids associated with a review are cached for efficient retrieval. Since we are loading upvotes for many reviews at once, we put Memcached in front of MySQL to cache the results and improve performance.
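The caching behavior can be pictured with a small cache-aside sketch. Plain Python dictionaries stand in for MySQL and Memcached, and the `VoteStore` class, its method names, and the `vote_type` parameter are hypothetical; this is not the in-house operator framework, only an illustration of reading through a cache and invalidating it on writes.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

Key = Tuple[int, str]  # (review_id, vote_type)


class VoteStore:
    """Cache-aside vote storage with dict stand-ins for MySQL and Memcached."""

    def __init__(self) -> None:
        self._db: Dict[Key, Set[int]] = defaultdict(set)  # stand-in for MySQL
        self._cache: Dict[Key, FrozenSet[int]] = {}        # stand-in for Memcached
        self.db_reads = 0  # counts cache misses, for illustration only

    def add_vote(self, review_id: int, user_id: int, vote_type: str = "helpful") -> None:
        key = (review_id, vote_type)
        self._db[key].add(user_id)   # a set makes repeat votes a no-op
        self._cache.pop(key, None)   # invalidate so the next read is fresh

    def remove_vote(self, review_id: int, user_id: int, vote_type: str = "helpful") -> None:
        key = (review_id, vote_type)
        self._db[key].discard(user_id)
        self._cache.pop(key, None)

    def voters(self, review_id: int, vote_type: str = "helpful") -> FrozenSet[int]:
        """Return user ids who voted, reading the database only on a cache miss."""
        key = (review_id, vote_type)
        cached = self._cache.get(key)
        if cached is None:
            self.db_reads += 1
            cached = frozenset(self._db.get(key, ()))
            self._cache[key] = cached
        return cached

    def count(self, review_id: int, vote_type: str = "helpful") -> int:
        return len(self.voters(review_id, vote_type))

    def sort_by_helpfulness(self, review_ids: List[int]) -> List[int]:
        """Order reviews by helpful-vote count, most helpful first."""
        return sorted(review_ids, key=lambda rid: self.count(rid), reverse=True)


if __name__ == "__main__":
    store = VoteStore()
    store.add_vote(1, user_id=10)
    store.add_vote(2, user_id=10)
    store.add_vote(2, user_id=11)
    print(store.sort_by_helpfulness([1, 2]), store.db_reads)
```

Caching the set of voter ids rather than just the count lets a single lookup answer both "how many votes?" and "has this user already voted?", at the cost of larger cache entries for heavily voted reviews.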
The backend was implemented with generic voting capability, to allow for other types of voting in the future (downvoting, categorizing a review as funny, etc.). This will make the backend easy to extend if the Search team decides to experiment with the voting capabilities in the future. Review helpfulness will be incorporated as a signal into other relevance systems as Airbnb continues to personalize the guest experience.