PriceAggregator: An Intelligent System for Hotel Price Fetching (Part 1)

Zhang Jiangwei
Agoda Engineering & Design
May 19, 2020 · 6 min read

How Does Agoda Manage Inventory?

Agoda is a global online travel agency for hotels, vacation rentals, flights, and airport transfers. Millions of guests search for accommodation on Agoda, and millions of accommodation providers list their properties with us. For many of these listed properties, prices are fetched through third-party suppliers.

These third-party suppliers do not synchronize their hotel prices with Agoda. To get a hotel price from one of these suppliers, Agoda has to make an HTTP request to the supplier for that specific price. However, due to the sheer volume of search requests received from users, it is impossible to forward every request to the supplier. Hence, we built a cache database that temporarily stores hotel prices: each hotel price received from the supplier is kept in this cache for some amount of time and evicted once it expires.
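Conceptually, the cache behaves like a key-value store in which every entry carries its own expiry. Below is a minimal sketch in Python; the class and method names are purely illustrative, not Agoda's actual implementation:

```python
import time

class PriceCache:
    """Minimal in-memory price cache with a per-entry TTL (illustrative only)."""

    def __init__(self):
        self._entries = {}  # search_key -> (price, expiry timestamp)

    def put(self, search_key, price, ttl_seconds):
        # Store the price together with the moment it should expire.
        self._entries[search_key] = (price, time.time() + ttl_seconds)

    def get(self, search_key):
        entry = self._entries.get(search_key)
        if entry is None:
            return None                    # never cached
        price, expires_at = entry
        if time.time() >= expires_at:
            del self._entries[search_key]  # evict the expired price
            return None
        return price
```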

Figure 1. System flow of third-party supplier hotel price serving. If a cached price exists, Agoda serves the cached price to the user. Otherwise, Agoda, on a best-efforts basis, sends a request to the supplier to fetch the hotel price and puts it in the cache.

Figure 1 above abstracts the system flow. Every time a user searches for a hotel on Agoda, Agoda first reads from the cache. If the cache contains a price for this search, it is a ‘hit’ and we serve the user the cached price. Otherwise, it is a ‘miss’ and the user will not get a price for that hotel. For every ‘miss’, Agoda sends a request to the supplier to get the price for that hotel and puts the returned price into the cache, so that subsequent users can benefit from it. However, every supplier limits the number of requests we can send per second. Once we reach the limit, subsequent requests are dropped. This poses four challenges.
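Putting the hit/miss logic and the per-second request limit together, the serving path can be sketched as follows. The supplier and rate_limiter objects and their method names are assumptions made for illustration:

```python
def handle_search(search_key, cache, supplier, rate_limiter, ttl_seconds):
    """Serve one user search: cache hit if possible, otherwise a best-efforts
    supplier fetch that respects the per-second request limit."""
    price = cache.get(search_key)
    if price is not None:
        return price                       # 'hit': serve the cached price

    # 'miss': only call the supplier if we still have QPS budget left.
    if rate_limiter.try_acquire():
        price = supplier.fetch_price(search_key)
        cache.put(search_key, price, ttl_seconds)
        return price

    return None                            # over the limit: no price for this search
```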

Challenge 1: Time-to-live (TTL) determination.

Figure 2. TTL vs. cache hit, QPS and price accuracy.

For a hotel price fetched from the supplier, how long should we keep it in the cache before expiring it? We call this duration the time-to-live (TTL). The larger the TTL, the longer hotel prices stay in the cache database.

As presented in Figure 2, the TTL plays three roles:

  1. Cache Hit. With a larger TTL, hotel prices are cached for a longer period of time, so more hotel prices remain in the database. When we receive a search from our users, there is a higher chance of a hit in the database. This improves our ability to serve users with hotel prices from third-party suppliers.
  2. QPS. As we have limited QPS to each supplier, a larger TTL lets more searches be answered from the cache. Instead of spending QPS on repeated queries, we can use it to serve a wider range of user requests.
  3. Price Accuracy. As supplier hotel prices change from time to time, a larger TTL means the hotel prices in our cache are more likely to be stale. Hence, we may not be able to serve users the most up-to-date hotel price.

There is a trade-off between cache hit and price accuracy, and we need to choose a TTL that caters to both. To the best of our knowledge, most Online Travel Agencies (OTAs) typically pick a small TTL, ranging from 15 to 30 minutes. However, this is not optimal.
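As a toy illustration of this trade-off (not the model described later in this series), one could score candidate TTLs against two pieces of historical data: how soon the same search tends to repeat, and how long a supplier price tends to stay unchanged:

```python
def choose_ttl(candidate_ttls, repeat_intervals, price_change_intervals):
    """Toy TTL selection balancing cache hit against price accuracy.

    repeat_intervals: observed seconds between repeats of the same search
    price_change_intervals: observed seconds a supplier price stayed unchanged
    """
    def hit_rate(ttl):
        # A repeated search is a hit if it arrives before the cached price expires.
        return sum(dt <= ttl for dt in repeat_intervals) / len(repeat_intervals)

    def accuracy(ttl):
        # A cached price stays accurate if the supplier price did not change within the TTL.
        return sum(dt >= ttl for dt in price_change_intervals) / len(price_change_intervals)

    # The product is one possible combined score; the weighting is a design choice.
    return max(candidate_ttls, key=lambda ttl: hit_rate(ttl) * accuracy(ttl))
```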

Challenge 2: Cross data center QPS management.

Agoda has several data centers around the world to handle user requests. For each supplier, we need to set the maximum QPS that each data center is allowed to send. However, each data center has its own traffic pattern.

Figure 3. Cross data center QPS management limitation. Data center A peaks around 50% QPS around 18:00 and data center B peaks around 50% QPS around 04:00.

Figure 3 presents an example of the QPS sent to a supplier from two data centers, A and B. Data center A peaks at around 50% QPS around 18:00, while data center B peaks at around 50% QPS around 04:00. If we evenly distribute the 100% QPS between data center A and data center B, we never fully utilize it. But if we allocate more than 50% QPS to each data center, how can we make sure that data center A and data center B never exceed 100% QPS in total? Note that breaching the QPS limit could be catastrophic for the supplier, potentially even taking the supplier offline.
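To see why a static split is wasteful, consider a made-up demand profile in the spirit of Figure 3 (the numbers below are illustrative, not real traffic):

```python
# Hourly demand from each data center, as a percentage of the supplier's QPS limit.
# Data center A peaks around 18:00, data center B around 04:00 (made-up numbers).
demand_a = [10, 10, 10, 12, 15, 18, 22, 25, 28, 30, 32, 35,
            38, 40, 42, 45, 48, 50, 50, 45, 38, 30, 20, 12]
demand_b = [38, 40, 42, 45, 48, 50, 50, 45, 38, 30, 20, 12,
            10, 10, 10, 12, 15, 18, 22, 25, 28, 30, 32, 35]

combined = [a + b for a, b in zip(demand_a, demand_b)]
print(max(combined))             # the combined peak stays well below 100%
print(100 - sum(combined) / 24)  # average share of the limit left unused

# A static 50%/50% cap never lets one data center borrow the other's idle
# allowance, while raising both caps above 50% risks breaching the limit
# whenever the two peaks happen to overlap.
```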

Challenge 3: Single data center QPS utilization.

Figure 4. Unutilized QPS.

As mentioned in the previous section, each data center has its own traffic pattern: there are peak periods when we send the most requests to the supplier, and off-peak periods when we send far fewer.

As demonstrated in Figure 4, this data center sends less than 40% QPS to the supplier around 08:00. In other words, 100% - 40% = 60% of this data center's QPS allowance is not utilized.

Challenge 4: Cache hit ceiling.

The passive system flow presented in Figure 1 puts an intrinsic ceiling on the cache hit rate. Note that this design sends a request to the supplier to fetch a price only when there is a miss. This is passive! Hence, a cache hit only happens if the same hotel search occurred previously and the TTL is larger than the time elapsed between that previous search and the current one.

Note that we cannot set the TTL arbitrarily large, as this lowers price accuracy as explained in Challenge 1. As long as the TTL of a specific search is not arbitrarily large, the cached price will expire, and the next request for that search will be a miss. And even if we could set the TTL arbitrarily large, hotel searches that have never happened before will always be misses. For example, if more than 20% of the requests are new hotel searches, then the cache hit rate is capped below 80% regardless of how large the TTL is. This means more than 20% of searches cannot get a hotel price from the supplier, which consequently results in lost bookings.
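The ceiling is easy to state numerically; the 20% figure below is just the example above:

```python
def hit_rate_ceiling(new_search_fraction):
    """Upper bound on cache hit for the purely passive design: searches that
    have never been seen before can never be hits, however large the TTL."""
    return 1.0 - new_search_fraction

print(hit_rate_ceiling(0.20))  # 0.8 -- at least 20% of searches get no supplier price
```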

To overcome the four challenges mentioned above, we propose PriceAggregator, an intelligent system for hotel price fetching.

PriceAggregator, the solution.

Figure 5. PriceAggregator system flow.

As presented in Figure 5, before every price is written to the cache (Price DB), it goes through a TTL service, which assigns different TTLs to different hotel searches. This TTL service is built on historical data to optimize the trade-off between cache hit and price accuracy, which addresses Challenge 1.
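Assuming a simple lookup-style TTL model (a simplification of whatever the real service does), the write path might look like this, reusing the PriceCache sketch from earlier:

```python
def write_price(search_key, price, cache, ttl_model, default_ttl_seconds=15 * 60):
    """Hypothetical write path: every price passes through the TTL service,
    which assigns a per-search TTL learned from historical data."""
    ttl = ttl_model.get(search_key, default_ttl_seconds)  # different searches, different TTLs
    cache.put(search_key, price, ttl)
```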

Apart from passively sending requests to the supplier to fetch hotel prices, PriceAggregator reinvents the process by adding an aggressive service which proactively sends requests to the supplier at a constant QPS. By keeping the QPS constant, Challenges 2 and 3 can be addressed easily. Moreover, this aggressive service does not wait for a hotel search to appear before sending requests to the supplier, so it increases the cache hit rate and hence addresses Challenge 4.
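A bare-bones sketch of such a proactive fetcher is below. How the searches to refresh are chosen is not covered here, so the popular_searches input is an assumption, and write_price is the hypothetical helper from the previous sketch:

```python
import time

def aggressive_fetch_loop(popular_searches, supplier, cache, ttl_model, qps):
    """Hypothetical proactive fetcher: refreshes a fixed set of searches at a
    constant QPS instead of waiting for a cache miss to trigger a fetch."""
    interval = 1.0 / qps
    while True:
        for search_key in popular_searches:
            price = supplier.fetch_price(search_key)
            write_price(search_key, price, cache, ttl_model)
            time.sleep(interval)  # evenly spaced requests -> constant, predictable QPS
```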

For more details about how PriceAggregator is implemented, please check our next blog post here.

Authors

Zhang Jiangwei, Li Zhang, Vigneshwaran Raveendran, Ziv Ben-Zuk, and Leonard Lu

Acknowledgement

Thanks to Lin Jingru, Nikhil Fulzele, and Akshesh Doshi for reviewing.
