Booking Deduplication: How Agoda Manages Duplicate Bookings Across Multiple Data Centers (Part 2)

Agoda Engineering
Agoda Engineering & Design
6 min read · Jul 17, 2024

by Audchadaporn Lertchanvit

In Part 1, we introduced the challenge of unwanted duplicate bookings at Agoda caused by system incidents and delays in booking confirmations. We discussed the implementation of the booking deduplication feature to prevent such duplicates. The challenges of operating with multiple data centers and scaling the deduplication service for different products were highlighted.

We explained the architecture involving centralized data storage and redundancy measures, as well as the introduction of a generic table schema for all products. The first steps of the booking deduplication process, including retrieving and combining duplication candidates and processing the candidates’ list until reaching a conclusion, were also covered.

In this part, we will discuss serving latency-sensitive cases, further development, and applications of the deduplication mechanism.

Serving Latency-Sensitive Cases

As detailed in the “booking creation process” section of Part 1, the creation process begins by checking whether the request is a duplicate. If it is not, the process proceeds to create the booking. Once the booking is successfully created, it is inserted into the candidates’ table as a new candidate.

Although it seems logical to insert a new candidate only after a successful booking, this approach does not reliably catch duplicates when requests arrive almost simultaneously. The time from receiving a request to inserting a new candidate can be tens of milliseconds (ms); let’s assume it takes 30 ms. If two duplicate requests arrive less than 30 ms apart, the Booking Deduplication Service will fail to detect the duplicate in the second request because the candidates’ table does not yet include the candidate record from the first request. To detect such cases, the candidate record from the first request needs to be inserted into the table as early as possible.
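The timing gap can be illustrated with a small simulation. This is only a sketch: the function name and the single shared booking hash are assumptions made for illustration, and the 30 ms delay is the assumed figure from the paragraph above.

```python
INSERT_DELAY_MS = 30  # assumed time from receiving a request to inserting its candidate


def detect_duplicates_late_insert(arrival_times_ms, insert_delay_ms=INSERT_DELAY_MS):
    """Return one flag per request: True if the check saw an earlier candidate.

    All requests are assumed to carry the same booking hash. A candidate only
    becomes visible insert_delay_ms after its request arrived.
    """
    visible_at = []  # times at which earlier candidates become visible
    flags = []
    for arrival in sorted(arrival_times_ms):
        is_duplicate = any(t <= arrival for t in visible_at)
        flags.append(is_duplicate)
        if not is_duplicate:
            # Only non-duplicate requests go on to insert a candidate.
            visible_at.append(arrival + insert_delay_ms)
    return flags


print(detect_duplicates_late_insert([0, 10]))  # [False, False]: duplicate missed
print(detect_duplicates_late_insert([0, 50]))  # [False, True]: duplicate caught
```

With a 10 ms gap the second request is checked before the first candidate exists, so both requests pass the check.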

Further Development

To serve time-sensitive use cases requiring sub-second latency, the candidate insertion step was moved to the very beginning of the process. Booking creation now starts by checking for duplication and inserting a candidate right away, and only then proceeds to create the booking. If booking creation fails, the status of the candidate record is updated to inactive; otherwise, the status remains active for future deduplication. Status updates are performed as upserts, so a candidate record that is missing because of a database failure is written rather than left absent.
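A minimal sketch of this insert-first flow is shown below. The in-memory dictionary, the function names, and the returned strings are all hypothetical stand-ins; in the real system the check and insert happen inside a stored procedure, and this sketch makes no attempt to be safe under concurrency.

```python
candidates = {}  # hypothetical stand-in for the candidates' table: hash -> status


def upsert_status(booking_hash, status):
    # Upsert: create the record if it is missing, otherwise overwrite its status.
    candidates[booking_hash] = status


def create_booking_insert_first(booking_hash, create_booking):
    """Check for duplication and insert the candidate before creating the booking."""
    if candidates.get(booking_hash) == "active":
        return "rejected_duplicate"
    upsert_status(booking_hash, "active")  # candidate is visible before any slow work
    try:
        create_booking()
    except Exception:
        # Booking creation failed, so the candidate must not block a retry.
        upsert_status(booking_hash, "inactive")
        return "failed"
    return "created"


def failing_booking():
    raise RuntimeError("simulated booking failure")


print(create_booking_insert_first("hash-a", lambda: None))     # created
print(create_booking_insert_first("hash-a", lambda: None))     # rejected_duplicate
print(create_booking_insert_first("hash-b", failing_booking))  # failed
print(create_booking_insert_first("hash-b", lambda: None))     # created (retry allowed)
```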

Diagram showing the improved booking creation process

Changes in Further Development

To achieve the new design, we defined a SQL unique key constraint on the combination of the candidate_hash, candidate_json, and booking_id columns. Additionally, the transaction isolation level in the stored procedure was set to “read uncommitted” to allow dirty reads. With these settings, any transaction attempting to insert a duplicate will fail, even if the prior transaction has not yet been committed.
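The unique key part of this design can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not Agoda's schema: SQLite stands in for the production SQL server, the table name, the status column, and the placeholder booking_id of 0 are hypothetical, and the “read uncommitted” isolation behavior is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE candidates (
        candidate_hash TEXT    NOT NULL,
        candidate_json TEXT    NOT NULL,
        booking_id     INTEGER NOT NULL,
        status         TEXT    NOT NULL DEFAULT 'active',
        UNIQUE (candidate_hash, candidate_json, booking_id)
    )
    """
)


def try_insert(candidate_hash, candidate_json, booking_id):
    """Insert a candidate; return False if the unique key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO candidates (candidate_hash, candidate_json, booking_id) "
                "VALUES (?, ?, ?)",
                (candidate_hash, candidate_json, booking_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("abc", '{"property_id": 1}', 0))  # True: first insert succeeds
print(try_insert("abc", '{"property_id": 1}', 0))  # False: blocked by the constraint
```

The database, rather than application code, is what rejects the second insert, which is why the check does not depend on a separate read happening first.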

The main changes in this enhancement are in Step 1 of the Booking Deduplication Service.

Step 1 of booking deduplication service in the further development.

Step 1: Check duplication, insert a new candidate, retrieve duplication candidates from local and central DBs, and then combine the results.

This step consists of two sub-steps as follows:

1.1 Check duplication, insert a new candidate, and retrieve duplication candidates from both local and central DBs.

Some booking context will be extracted from the Make-a-booking request based on the product type and used to generate a list of BookingContextFromRequest objects. Each object contains a booking hash, booking json, and booking status fields.
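As a rough sketch of this extraction step, the example below builds BookingContextFromRequest objects from a request dictionary. The per-product field lists, the use of SHA-256 over canonical JSON, and the initial "active" status are assumptions for illustration; the article does not specify how the hash is computed or which fields each product uses.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingContextFromRequest:
    booking_hash: str
    booking_json: str
    booking_status: str


# Hypothetical mapping from product type to the fields that identify a booking.
FIELDS_BY_PRODUCT = {
    "hotel": ["customer_id", "property_id", "check_in", "check_out"],
    "flight": ["customer_id", "flight_number", "departure_date"],
}


def build_contexts(product_type, request):
    """Extract the product-specific context and hash its canonical JSON form."""
    context = {name: request[name] for name in FIELDS_BY_PRODUCT[product_type]}
    # Sorted keys and fixed separators make equal contexts serialize identically.
    booking_json = json.dumps(context, sort_keys=True, separators=(",", ":"))
    booking_hash = hashlib.sha256(booking_json.encode("utf-8")).hexdigest()
    # A single context is produced here; a real service may produce several.
    return [BookingContextFromRequest(booking_hash, booking_json, "active")]


request = {
    "customer_id": 42,
    "property_id": 1001,
    "check_in": "2024-08-01",
    "check_out": "2024-08-03",
    "session_id": "ignored-by-the-hash",
}
print(build_contexts("hotel", request)[0].booking_json)
```

Fields outside the product's list, such as the session identifier above, do not affect the hash, so two requests for the same stay produce the same candidate key.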

Next, the service makes stored procedure (SP) calls to both the local and central databases. The SP has become more complex: previously it only queried duplication candidates, whereas now it also checks for duplication and inserts a new candidate. In addition, the SP returns an integer named SPResult, which tells the booking creation process whether to reject the request or continue with the booking creation. The possible values are:

  • -1: The request is a duplicate booking.
  • 1: Continue with the booking creation.

If any duplication candidates match the input booking hashes, the SP returns the candidate records.

There are three possible outcomes from the stored procedure:

  • The list of duplication candidates is empty, and SPResult is 1.
  • The list of duplication candidates is not empty, and SPResult is 1.
  • The SPResult is -1. This occurs when a transaction attempts to insert a candidate but fails because another transaction has already inserted that candidate (a race condition). The insertion of the duplication candidate is blocked by the SQL unique key constraint.
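The three outcomes above can be reproduced with a small in-memory stand-in for the stored procedure. Everything here is hypothetical: the class, the method name, and the row layout are invented for illustration, and the sketch assumes that a matching hash with a different booking_id yields a non-empty candidate list without violating the unique key.

```python
DUPLICATE, CONTINUE = -1, 1  # SPResult values described in the article


class FakeCandidateDb:
    """In-memory stand-in for one database's stored procedure (hypothetical)."""

    def __init__(self):
        self.rows = {}  # unique key (hash, json, booking_id) -> candidate row

    def check_and_insert(self, booking_hash, booking_json, booking_id):
        # Candidates that match the input hash, as seen before this insert.
        matches = [r for r in self.rows.values() if r["hash"] == booking_hash]
        key = (booking_hash, booking_json, booking_id)
        if key in self.rows:
            # Emulates the unique key constraint blocking the insertion.
            return DUPLICATE, matches
        self.rows[key] = {
            "hash": booking_hash,
            "json": booking_json,
            "booking_id": booking_id,
            "status": "active",
        }
        return CONTINUE, matches


db = FakeCandidateDb()
sp_result, found = db.check_and_insert("h1", "{}", 0)
print(sp_result, len(found))  # 1 0: empty list, continue
sp_result, found = db.check_and_insert("h1", "{}", 7)
print(sp_result, len(found))  # 1 1: candidates returned, continue
sp_result, found = db.check_and_insert("h1", "{}", 0)
print(sp_result, len(found))  # -1 2: insert blocked, duplicate
```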

1.2 Combine the results from the two databases.

Combining the lists of duplication candidates returned from the two databases follows the same process as in the previous design.

For SPResults, if the values from the two databases are equal, the combined SPResult will be that matched value. If the values differ, the combined SPResult will be -1, and the booking deduplication service will also upsert the candidate status to inactive.
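The rule for combining SPResults is short enough to write down directly. The function name and the boolean flag in the return value are assumptions for illustration; only the rule itself comes from the paragraph above.

```python
DUPLICATE, CONTINUE = -1, 1  # SPResult values described in the article


def combine_sp_results(local_result, central_result):
    """Return (combined SPResult, whether to upsert the candidate as inactive)."""
    if local_result == central_result:
        return local_result, False
    # The databases disagree: treat the request as a duplicate and mark the
    # candidate inactive.
    return DUPLICATE, True


print(combine_sp_results(CONTINUE, CONTINUE))    # (1, False)
print(combine_sp_results(DUPLICATE, DUPLICATE))  # (-1, False)
print(combine_sp_results(CONTINUE, DUPLICATE))   # (-1, True)
```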

There are three possible combined results:

Combined Result #1: The list of duplication candidates is empty, and the SPResult is 1.

  • This indicates that the request is not a duplicate booking and that the SP successfully inserted a new candidate. The SPResult of 1 tells the booking creation process to continue.

Combined Result #2: The list of duplication candidates is not empty, and the SPResult is 1.

  • A non-empty list of duplication candidates suggests a potential duplicate booking request, while the SPResult of 1 directs the booking creation process to continue. The process therefore proceeds with steps 2 to 4, filtering the candidates’ list until it reaches a conclusion.

Combined Result #3: The SPResult is -1, regardless of the list of duplication candidates.

  • This confirms that the request is a duplicate booking. The booking creation process will immediately reject the request, regardless of whether the list of duplication candidates is empty. A pop-up asking, “Is this a duplicate booking?” will be shown to the customer.
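The three combined results map onto three next actions, which can be sketched as a small dispatch function. The enum and its member names are hypothetical labels chosen for this example.

```python
from enum import Enum


class NextAction(Enum):
    CREATE_BOOKING = "create_booking"        # Combined Result #1
    FILTER_CANDIDATES = "filter_candidates"  # Combined Result #2: run steps 2 to 4
    REJECT = "reject"                        # Combined Result #3


def next_action(combined_sp_result, candidates):
    """Pick the next step from the combined SPResult and the candidates' list."""
    if combined_sp_result == -1:
        # Rejected regardless of whether the candidates' list is empty.
        return NextAction.REJECT
    if not candidates:
        return NextAction.CREATE_BOOKING
    return NextAction.FILTER_CANDIDATES


print(next_action(1, []))             # NextAction.CREATE_BOOKING
print(next_action(1, ["candidate"]))  # NextAction.FILTER_CANDIDATES
print(next_action(-1, []))            # NextAction.REJECT
```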

Further Applications

At Agoda, we have applied the same mechanism from our further development (the unique key constraint and the isolation level setting in SQL databases) to other deduplication features, such as API request deduplication and message deduplication. Although the table structures, architectures, and workflows differ significantly between these features, the fundamental approach has proven effective, and these applications have shown satisfactory deduplication behavior.

Conclusion

The booking deduplication feature was introduced many years ago to reduce duplicate bookings in our system. However, as Agoda has grown rapidly, new challenges have emerged that the legacy deduplication system can no longer address. The primary challenges are serving multiple data centers and ensuring scalability when adding new products.

To support multiple data centers, we implemented centralized data storage and connected the booking deduplication services to both the local and central data stores. For scalability, we designed a generic table schema that consolidates all products, making it straightforward to add new ones.

To address the challenge of serving time-sensitive use cases requiring sub-second latency, we proposed a design enhancement involving the unique key constraint feature and isolation level settings in SQL databases. Applications using these SQL features have shown satisfactory results in the deduplication process.

The designs in this article are based on Agoda’s infrastructure, which has historically used SQL servers, so we had to work within them. However, some newer SQL servers and features might solve these challenges out of the box.
