Image for post
Image for post

Design a proximity server like NearBy or Yelp Part — 1

Salil Arora
May 5, 2020 · 6 min read

What are Proximity Servers?

Proximity servers are applications like NearBy or Yelp which are helpful in discovering attractions like restaurants, theaters, temples, or recreational places that are adjacent to your location.

Functional Requirements

  1. Search functionality is the core of a proximity server. Users should be able to search all the proximate places within a given radius for any location (latitude and longitude).
  2. Users can add new places or edit the old ones with the basic details like images, brief description, etc.
  3. Users can give ratings(1 to 5 stars) and reviews(text and images) to places.

Non-Functional Requirements

  1. The system should be highly available with real-time search experience and minimum latency.
  2. We can expect this application to be read-heavy as there will be a myriad of search requests compared to the frequent addition of new places.

Scale Estimation

We can estimate the total number of places in the system to be 200 Million with 100K requests per second. Considering the future scale for 5 years with 20% growth per year we should build our system for at least a scale of 5 years.

  • Our system should be able to handle the load of 200K requests per second.

Basic Database Schema

Image for post
Image for post

APIs Design:

1. Search API:

GET /searchRequest: query_param = {
search_terms: "Pizza",
radius: 10, (in kilometres)
category_filter: "Italian Restaurants", (optional field)
filter: "5 start rated restaurants", (optional field)
maximun_no_of_results: 20 (optional field)
}
Response: A JSON containing information about a list of popular places matching the search query. Each result entry will have the place name, address, category, rating, and image.

2. Add places:

POST /placesRequest: body = {
place_name: "Julia's Kitchen",
latitude: 85,
longitude: 10,
category: "Restaurant",
description: "Best Italian restaurant",
photos: ["url1", "url2", "url3"] (Array of photos)
}
Response: Response of the newly created place with place_id

3. Add reviews:

POST /reviewsRequest: body = {
user_id: "uid121",
place_id: "place1",
rating: 4, (integer between 1 to 5)
review: "Restaurant was good and clean",
photos: ["url1", "url2", "url3"] (Array of photos)
}
Response: Response of the newly created review with it's id.

System Design

Before moving to the detailed component design, we need to decide where are we going to store the data. Let’s see what are different ways to store this data and also find out which method will suit best for our use cases:

1. Simple SQL based storage

Places, reviews, User details can be stored in SQL DB easily and latitude and longitude can be indexed for search optimization.

Image for post
Image for post

2. Grids

We can divide the entire world map into small squares. Considering the area of the earth is 500 Million square km, and having a fixed search radius is 10km. We will have 500M/10 = 50 Million squares with a fixed grid size of 10km. With fixing the grid size to the query radius, we will only need to search within the grid and it’s neighboring 8 grids.

Image for post
Image for post
Image for post
Image for post
Source: https://www.earthdatascience.org/courses/use-data-open-source-python/intro-vector-data-python/spatial-data-vector-shapefiles/geographic-vs-projected-coordinate-reference-systems-python/
Left:         ([x1-10,x2-10], [y1,y2])
Right: ([x1+10,x2+10], [y1,y2])
Top: ([x1,x2], [y1-10,y2-10])
Bottom: ([x1,x2], [y1+10,y2+10])
Top-Left: ([x1-10,x2-10], [y1-10,y2-10])
Top-Right: ([x1+10,x2+10], [y1-10,y2-10])
Bottom-Left: ([x1-10,x2-10], [y1+10,y2+10])
Bottom-Right: ([x1+10,x2+10], [y1+10,y2+10])
Image for post
Image for post

(50M * 6) + (400M * 8) ~= 4GB

Problems with this approach

  1. This approach will execute slow for popular places like Dam Square(Amsterdam) or Sentosa (Singapore) and this can cause an imbalance in the grids where some grids will be densely populated and others will be sparsely populated with places like coastal regions or islands.
  2. If at all we can dynamically adjust the grid sizes by keeping a threshold of a maximum number of places in a grid but this approach will not be a cakewalk and will furthermore add more implementation complexities.

What’s Next?

A surfeit of doubts may definitely arise after seeing the challenges with the grid-based design approach such as:

  1. What should be the final system design?
  2. How we can solve the extremely critical non-functional requirements of high availability and fault tolerance?

Coming up in this series

There are many more topics we need to discuss in detail. In the next few pieces, I’ll write in detail about the following topics:

  • Design Notification System.
  • Design a movie ticketing system like BookMyShow or TicketMaster.

The Startup

Medium's largest active publication, followed by +773K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store