Hotel Website Flow Redesign — Part 3

This is the third post in our series on the Hotel Website Flow Redesign. The previous posts are here:

  1. Hotel Website Flow Redesign — Part 1
  2. Hotel Website Flow Redesign — Part 2

While we designed a fluid and seamless flow, our engineers built the entire funnel on a new tech stack, consuming new APIs. In line with our previously stated goal of building a web flow that is performant and scalable, we wanted to address the following through our new tech stack:

  • Fast search results
  • Filters that respond instantly
  • Caching of our metadata

Old Architecture
The old architecture was not built for this and, as mentioned earlier, it was just a hack :).

There were multiple issues with this:

  • Our website and API lived in the same project, which would not scale if we had to provide this for other channels / platforms to consume
  • We were making calls over the internet to partner APIs to fetch our search results, which was slow
  • Iterations and updates used to take a while due to the tight coupling of web and API
  • Whenever our partner APIs changed their contract, it had a downstream effect and we had to make changes at our layers as well

New Architecture
The new architecture was built keeping in mind the goals (speed and scale) we initially formulated.

The broad ingredients of the new architecture were:

  • Website — a Node.js application running on an Ubuntu server with 4GB RAM
  • API — a Web API (.NET) application running on Windows IIS with 4GB RAM
  • Neo4j graph database — our cache layer, running on an Ubuntu server with 8GB RAM
  • Backbone — JavaScript MVC framework
  • Backbone-super-sync — https://github.com/artsy/backbone-super-sync
      • Built-in request caching
      • Automatic handling of request timeouts
  • Gulp — Earlier we used Grunt for most of our minifying and compiling needs. Gulp has been widely adopted because the rigorous configuration files Grunt requires are handled better here. We chose Gulp for these reasons:
      • Streaming and concatenation of files is much easier here
      • Grunt supports this now, but at that point in time it did not
      • We also use Gulp for deployment, via a plug-in called ‘diff’; the advantage is that only the delta goes to the server, not the entire published code
  • Handlebars — Our view engine for Node. The number of utilities available here is impressive compared to a lot of other systems, and the templates created are easily shared across client and server. This is one of the biggest advantages when you have a lot of SEO pages where server-side rendering is required
  • Dust — Our view engine that supports streaming. This is one of the reasons our search results are faster: as we stream the data, we send it on to Handlebars
  • Helmet — gives a good wrapper around the HTTP headers
  • Knex — Our SQL wrapper / ORM to interface with our MySQL database.
  • Isomorphic concept — This is an isomorphic application coded in JavaScript that can run both on the client and on the server. This way we can execute it on the server to render SEO-friendly static pages, and the same set of scripts can be reused to build a more interactive experience on the client side
  • Maps — Integration between Foursquare and Mapbox
      • Mapbox — Used themes here; wanted to spice it up
      • Foursquare — Crowd-sourced, and has more information compared to Google Maps
  • Icons to font — All the amenity icons on our SRP (search results page) are now fonts. This reduces the overall download size: we went from 130KB to 11KB
  • PM2 — Our process manager and server monitoring tool
  • Recommendations — We also built a simple recommendation algorithm based on attribute matching. We rebuild our recommendation cache every day, store it in Redis, and serve it via an API
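To illustrate the streaming style mentioned above, here is a minimal gulpfile sketch. The task name, file paths, and plugin choices (gulp-concat, gulp-uglify) are assumptions for illustration, not our actual build configuration:

```javascript
// gulpfile.js — build-config sketch (assumes gulp, gulp-concat and
// gulp-uglify are installed; names and paths are illustrative)
var gulp = require('gulp');
var concat = require('gulp-concat');
var uglify = require('gulp-uglify');

gulp.task('scripts', function () {
  return gulp.src('src/js/**/*.js')   // read sources as a stream
    .pipe(concat('app.js'))           // concatenate in-stream, no temp files
    .pipe(uglify())                   // minify the combined bundle
    .pipe(gulp.dest('dist/js'));      // write to disk only once, at the end
});
```

Because each step is a stream transform, files flow through the whole pipeline without the intermediate files on disk that an equivalent Grunt setup needed at the time.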
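The isomorphic idea can be sketched with a toy render function; the function and data below are illustrative, not the redBus code. Because it touches no server-only APIs, the same function can render SEO-friendly markup under Node and re-render markup in the browser:

```javascript
// Sketch of the isomorphic concept: one pure render function shared by
// server and client (names and markup are illustrative assumptions).
function renderHotelCard(hotel) {
  return '<div class="hotel">' +
           '<h2>' + hotel.name + '</h2>' +
           '<span>' + hotel.city + '</span>' +
         '</div>';
}

// Server side (Node): render static, crawler-visible markup for SEO pages
var serverHtml = renderHotelCard({ name: 'Hotel Blue', city: 'Bengaluru' });

// Client side: the identical function would redraw the card after, say,
// a filter change — e.g. element.innerHTML = renderHotelCard(updatedHotel);
```

In practice the shared pieces are precompiled Handlebars templates rather than hand-built strings, but the contract is the same: keep the rendering code free of environment-specific APIs so one script serves both sides.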

Google Page Speed Insights Report

More Details
 Recommendation Logic

As mentioned above, the recommendation algorithm is based on attribute matching, with the cache rebuilt daily, stored in Redis, and served via an API. Recommendations for each hotel are computed in broadly these steps:

  • Hotel details are fetched from the Neo4j cache and segregated city-wise
  • Each hotel is compared with every other hotel in the city, and a score is computed by taking various attributes into consideration
  • The top 50 recommendations for each hotel are determined based on the computed score and stored in the Redis cache
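The scoring step above can be sketched as follows. The attribute names, weights, and sample data are illustrative assumptions, not the production values:

```javascript
// Attribute-matching sketch: score similarity between two hotels.
// Weights and thresholds below are made up for illustration.
function similarityScore(a, b) {
  var score = 0;
  // Each shared amenity contributes one point
  a.amenities.forEach(function (am) {
    if (b.amenities.indexOf(am) !== -1) score += 1;
  });
  // Matching star rating is a stronger signal
  if (a.stars === b.stars) score += 2;
  // Hotels in a similar price band get a small bonus
  if (Math.abs(a.price - b.price) <= 500) score += 1;
  return score;
}

// For one hotel, score every other hotel in its city and keep the top N
function topRecommendations(hotel, cityHotels, n) {
  return cityHotels
    .filter(function (h) { return h.id !== hotel.id; })
    .map(function (h) { return { id: h.id, score: similarityScore(hotel, h) }; })
    .sort(function (x, y) { return y.score - x.score; })
    .slice(0, n); // production keeps the top 50 per hotel
}

var hotels = [
  { id: 1, amenities: ['wifi', 'ac'], stars: 3, price: 1800 },
  { id: 2, amenities: ['wifi', 'ac', 'pool'], stars: 3, price: 2000 },
  { id: 3, amenities: ['parking'], stars: 5, price: 6000 }
];
var recs = topRecommendations(hotels[0], hotels, 2);
// recs → [{ id: 2, score: 5 }, { id: 3, score: 0 }]
```

The comparison is pairwise within a city (quadratic per city), which is affordable as a daily batch job whose output is then cached in Redis.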

We have good recommendation scores built for over 20,000 hotels. We plan to do more once we link hotels with our bus business and personalization, and capture more data as well.

Neo4j Cache
Our caching layer not only needed to provide quick look-ups, but also had to query and join other data elements. The amount of data stored was relatively large, and we expected it to grow as we scale. Keeping this in mind, we shortlisted an initial set of caching frameworks.

We hadn’t tried Neo4j in any of our production systems before this. Redis and Memcached are widely used across various teams at redBus. With Redis we have done some really cool stuff, but our pain point with Redis was always secondary indexes. Also, with Redis’s key-value model it is very hard to build relationships with keys alone. Looking at the hotel data that needed to be cached, at a broad level it encompassed:

  • Cities
  • DOJ
  • Hotel Information, amenities and other meta-data
  • Ratings, Image URLS and curated content

Neo4j fit the bill really well here, as it solves exactly these problems: secondary indexes and foreign keys (called relationships in a graph database). To understand more about Neo4j, please check the Neo4j documentation.
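As an illustration of why relationships help, a hypothetical Cypher query (the labels and property names are assumptions, not our actual schema) can combine an indexed city lookup with a relationship traversal in one step:

```cypher
// Hypothetical schema: (:Hotel)-[:LOCATED_IN]->(:City)
MATCH (h:Hotel)-[:LOCATED_IN]->(c:City { name: 'Bengaluru' })
WHERE h.rating >= 4
RETURN h.name, h.rating
ORDER BY h.rating DESC
LIMIT 10
```

Doing the equivalent in a plain key-value store would mean maintaining and updating index keys for every such access path by hand.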


Originally published at blog.redbus.in on January 11, 2016.