My First System Design (WhatsApp)

samriti gupta
4 min read · Apr 27, 2021

High-Level System Design for WhatsApp

If you work in the software industry, you have probably been asked in an interview to design ‘X’. Honestly, when I was first asked to design BookMyShow during my first system design round, I was clueless. I was facing some real problems:

- Where do I start?

- What exactly is the interviewer looking for?

- What should I cover: APIs, class diagrams, the system as a whole?

Frankly, I was confused, and in my confusion I couldn’t even pick up on the clues the interviewer was giving me.

….

That is where my system design journey started. I went through many courses, YouTube videos, GitHub links, etc. It’s always hard to create your first system design because you have so many questions popping into your head and don’t know where to find the answers. All the videos and tutorials I saw gave a basic idea of system design, but that was not enough. From their perspective, every design seemed to be the same (add load balancers, APIs, services, etc.), but the real question was: how exactly does the data move?

….

In this article, I will try to help you understand WhatsApp system design from my perspective. You might feel a bit confused after seeing the diagram, but don’t worry: I have divided the article into parts and will explain one workflow at a time. This article mainly focuses on forward proxies, reverse proxies, how users connect with the Gateway, and the need for a service manager.

1. Forward proxy vs reverse proxy

A forward proxy provides proxy services to a client or a group of clients. Often, these clients belong to a common internal network. A reverse proxy does the exact opposite. While a forward proxy acts on behalf of clients (or requesting hosts), a reverse proxy acts on behalf of servers: it accepts requests from external clients on behalf of the servers stationed behind it (as shown in the diagram).

Forward proxies are typically used internally by large organisations, such as universities and enterprises, to:

  • Block employees’ access to certain websites
  • Monitor the online activity of employees
  • Block malicious traffic from reaching an origin server
  • Improve the user experience by caching external site content
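To make the list above a bit more concrete, here is a minimal in-memory sketch of a forward proxy that blocks hosts, logs activity and caches responses. Everything in it (the `ForwardProxy` class, the `fetch` callable, the host names) is hypothetical and only for illustration; a real proxy would speak HTTP over the network.

```python
from typing import Callable, Dict, List, Set, Tuple
from urllib.parse import urlsplit


class ForwardProxy:
    """Toy forward proxy: sits in front of clients and applies outbound policy."""

    def __init__(self, fetch: Callable[[str], str], blocked_hosts: Set[str]):
        self.fetch = fetch                  # hypothetical function that contacts the origin server
        self.blocked_hosts = blocked_hosts  # hosts that clients may not visit
        self.cache: Dict[str, str] = {}     # url -> cached response body
        self.access_log: List[Tuple[str, str, str]] = []  # (client, url, outcome)

    def request(self, client: str, url: str) -> str:
        host = urlsplit(url).hostname or ""
        if host in self.blocked_hosts:
            # Policy: block access to certain websites.
            self.access_log.append((client, url, "blocked"))
            raise PermissionError(f"{host} is blocked by policy")
        if url in self.cache:
            # Caching: serve repeated requests without contacting the origin.
            self.access_log.append((client, url, "cache-hit"))
            return self.cache[url]
        body = self.fetch(url)
        self.cache[url] = body
        self.access_log.append((client, url, "fetched"))
        return body
```

The `access_log` list plays the role of the monitoring point: every request, allowed or not, passes through the proxy and can be recorded there.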

Similarly, reverse proxies are used by servers to filter incoming and outgoing traffic in order to protect backend systems.

In many system design diagrams, you will see gateways and reverse proxy servers referred to interchangeably, because a reverse proxy server is the gateway between users and your application’s origin servers.
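The reverse direction can be sketched the same way. Below is a minimal, in-memory example of a reverse proxy that filters incoming requests and then forwards them to one of the backends behind it. The `ReverseProxy` class, the `is_allowed` filter and the round-robin choice are my assumptions for illustration, not something taken from the diagram.

```python
from itertools import cycle
from typing import Callable, Dict

Handler = Callable[[str], str]  # takes a request path, returns a response body


class ReverseProxy:
    """Toy reverse proxy: the single entry point in front of several backends."""

    def __init__(self, backends: Dict[str, Handler], is_allowed: Callable[[str], bool]):
        self.backends = backends
        self._names = cycle(sorted(backends))  # rotate over backend names
        self.is_allowed = is_allowed           # hypothetical request filter

    def handle(self, path: str) -> str:
        if not self.is_allowed(path):
            # The request never reaches a backend server.
            return "403 Forbidden"
        name = next(self._names)               # pick the next backend in turn
        return self.backends[name](path)
```

The client only ever talks to `handle`; it does not know how many backends exist or which one answered, which is why the same box is often drawn as a gateway.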

2. How users connect with Gateway?

  • Users and Gateways are connected via WebSockets, which run over persistent TCP connections. GitHub link: TestWebSockets
  • A new user, or an existing user who has been disconnected, sends a connection request to the gateway. After validation, the gateway establishes a new TCP connection with the user and stores the session (session handling is done by a service discussed later). Keeping the connection open avoids the overhead of reconnecting again and again and reduces the latency of sending messages.
  • A single server/gateway can handle tens of thousands of active TCP connections at a time.
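The connection flow above can be sketched with Python’s `asyncio`. This is a simplified illustration, not a WebSocket server: it skips the WebSocket handshake and uses plain TCP with newline-delimited text. The `is_valid` check, the `"secret"` token and the in-memory `sessions` dictionary are placeholders I made up; in the design, sessions are stored by a separate service.

```python
import asyncio
from typing import Dict

# user_id -> open connection; stands in for the session store (assumption)
sessions: Dict[str, asyncio.StreamWriter] = {}


def is_valid(token: str) -> bool:
    # Placeholder validation; a real gateway would check credentials here.
    return token == "secret"


async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    # The first line is expected to be "<user_id> <token>".
    first = (await reader.readline()).decode().split()
    if len(first) != 2 or not is_valid(first[1]):
        writer.write(b"DENIED\n")
        await writer.drain()
        writer.close()
        return
    user_id = first[0]
    sessions[user_id] = writer            # keep the connection for later pushes
    writer.write(b"CONNECTED\n")
    await writer.drain()
    try:
        while await reader.readline():    # stay open until the client disconnects
            pass
    finally:
        sessions.pop(user_id, None)       # forget the session on disconnect
        writer.close()


async def push(user_id: str, message: str) -> bool:
    """Send a message over the stored connection, without reconnecting."""
    writer = sessions.get(user_id)
    if writer is None:
        return False
    writer.write(message.encode() + b"\n")
    await writer.drain()
    return True
```

A gateway built this way would be started with `asyncio.start_server(handle_client, host, port)`. Because the connection stays open, `push` can deliver a message to a connected user without a new handshake, which is where the latency saving comes from.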

3. Role of Service Manager

How do clients or the gateway know about the available instances of the services shown in the diagram above? How do they determine the location of the service instance to which a request should be sent?

The first solution would be for the gateway to manage all the available services and their active instances. But the gateway’s main responsibility here is to handle WebSocket connections, and it’s not a good idea to overload your user-facing server with lots of responsibilities, as the chances of failure increase. Moreover, it’s generally a good idea to decouple responsibilities.

A better solution is to implement a service registry (our service manager), which is a database of services, their instances and their locations. Service instances are registered with the service registry on startup and deregistered on shutdown. Clients of the service and/or routers query the service registry to find the available instances of a service. A service registry might invoke a service instance’s health check API to verify that it is able to handle requests. Example of a working service registry project on GitHub: Exchange with Microservices.
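As a rough sketch of the idea, a service registry can be modelled as a small in-memory table with register, deregister and lookup operations. The class and method names below are my own, the addresses in the usage example are made up, and the health check is a plain callable standing in for a call to the instance’s health check API.

```python
from typing import Callable, Dict, List

HealthCheck = Callable[[], bool]


class ServiceRegistry:
    """Toy service registry: service name -> {instance address -> health check}."""

    def __init__(self) -> None:
        self._instances: Dict[str, Dict[str, HealthCheck]] = {}

    def register(self, service: str, address: str,
                 health_check: HealthCheck = lambda: True) -> None:
        # Called by a service instance on startup.
        self._instances.setdefault(service, {})[address] = health_check

    def deregister(self, service: str, address: str) -> None:
        # Called by a service instance on shutdown.
        self._instances.get(service, {}).pop(address, None)

    def lookup(self, service: str) -> List[str]:
        # Return only the instances whose health check currently passes.
        healthy = []
        for address, check in self._instances.get(service, {}).items():
            try:
                ok = check()
            except Exception:
                ok = False  # a failing check counts as unhealthy
            if ok:
                healthy.append(address)
        return healthy
```

For example, after `registry.register("chat", "10.0.0.1:9000")`, a call to `registry.lookup("chat")` returns `["10.0.0.1:9000"]` until that instance deregisters or its health check starts failing.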

Service Registry Flow

The gateway maintains a cache of recently used services and their available instances to avoid querying the service manager time and again.

Client-side load balancing can then be done across the given instances.

If an instance becomes inactive, it is the responsibility of the service registry to invalidate the gateway’s cache.
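The three points above fit together roughly like this. The sketch assumes a `lookup` callable that asks the service manager for instances, uses round-robin as the client-side load balancing strategy (one possible choice among several), and exposes an `invalidate` method that the registry would call when an instance goes down. All names are hypothetical.

```python
from typing import Callable, Dict, List


class InstanceCache:
    """Gateway-side cache of service instances with round-robin selection."""

    def __init__(self, lookup: Callable[[str], List[str]]):
        self._lookup = lookup                 # asks the service manager (assumption)
        self._cache: Dict[str, List[str]] = {}
        self._next: Dict[str, int] = {}       # round-robin position per service

    def pick(self, service: str) -> str:
        instances = self._cache.get(service)
        if not instances:
            # Cache miss: query the service manager once and remember the answer.
            instances = list(self._lookup(service))
            if not instances:
                raise LookupError(f"no instances available for {service}")
            self._cache[service] = instances
            self._next[service] = 0
        i = self._next[service] % len(instances)
        self._next[service] = i + 1           # client-side load balancing
        return instances[i]

    def invalidate(self, service: str) -> None:
        # Called when the registry reports that an instance became inactive.
        self._cache.pop(service, None)
        self._next.pop(service, None)
```

After `invalidate`, the next `pick` goes back to the service manager, so the gateway stops routing to the dead instance without having to track instance health itself.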

The gateway used in this post can also be interpreted as a reverse proxy server or a load balancer (layer 3 or layer 7), but I believe that is really a trade-off that one needs to discuss with the interviewer.

Please share your review and feedback.

For Part-2, click here.

samriti gupta

Full stack developer with 4 years of experience building scalable, large web applications. Currently working with Facebook, London.