Anant Pathak
Jan 2, 2021

Designing a scalable Notification Service

In software engineering, we often come across use cases where we need to notify users about particular events, for example, the time when a sale is going to start, or the tracking information for their order. These notifications can be sent as an email, an SMS, or any other form of communication. In a microservice-based architecture, many services might need to send such notifications. Instead of every service implementing its own notification logic, we can offload the task to one generic Notification service. This service needs to be highly scalable and highly available.

Let’s look at what the high-level design for such a system would look like. A load balancer (an ELB here) points to the App service, which exposes an API to handle incoming notification requests. The App service is responsible for receiving and logging each request, decrypting the request body, sending metrics data, and finally posting the message to a queue, which might be a Kafka topic here. It does minimal processing; its work is mostly I/O. Because its load grows with the number of clients using the Notification service, it needs to scale well. Node.js might be a good choice here, since its single-threaded, event-loop model suits I/O-bound workloads.
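To make the App service's steps concrete, here is a minimal sketch of the request path: log, decrypt, validate, count, enqueue. It is written in Python for brevity rather than Node.js, and everything in it is an assumption for illustration: `queue.Queue` stands in for the Kafka topic, a `Counter` stands in for a metrics client, `decrypt` only base64-decodes, and the `client_id` and `message` fields are a made-up payload shape.

```python
import base64
import json
import logging
import queue
from collections import Counter

logger = logging.getLogger("app_service")
metrics = Counter()      # stand-in for a real metrics client
topic = queue.Queue()    # stand-in for a Kafka topic


def decrypt(body: bytes) -> bytes:
    """Placeholder: base64-decodes instead of performing real decryption."""
    return base64.b64decode(body, validate=True)


def handle_request(body: bytes, out: queue.Queue = topic) -> dict:
    """Log, decrypt, validate, count, and enqueue one notification request."""
    logger.info("received %d bytes", len(body))
    metrics["requests_total"] += 1
    try:
        payload = json.loads(decrypt(body))
        client_id = payload["client_id"]
        message = payload["message"]
    except (ValueError, KeyError, TypeError):
        # Bad encoding, bad JSON, or missing fields: reject without enqueueing.
        metrics["requests_rejected"] += 1
        return {"status": 400, "error": "malformed request"}
    out.put({"client_id": client_id, "message": message})
    metrics["requests_accepted"] += 1
    return {"status": 202}
```

The handler returns as soon as the message is on the queue, which is what keeps this service I/O-bound and cheap to scale horizontally; all delivery work happens downstream.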

Clients of the Notification service need to define the configuration that will be used to deliver their notifications. The configuration specifies two things:

  1. Channel of communication
  2. Subscribers

The channel could be SMS, email, or any other means of communication. For every channel, we can have a separate service that is responsible only for that kind of communication; for example, the Email service will have an SMTP server and the SMS service will have a MessageBird integration. This makes it easier to scale these services independently. The subscribers field lists the users who need to be notified through the channel. The Admin service is responsible for managing this configuration, protected by SSO integration or other means of authentication and authorisation. It exposes a set of APIs for CRUD operations, through which clients integrate with the Notification service. The configurations are stored in a relational database so that they can be read by the Configuration service.
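As a rough illustration of the configuration described above, the sketch below models one client's channel and subscribers and the four CRUD operations the Admin service would expose. The names `Channel`, `NotificationConfig`, and `ConfigStore` are hypothetical, and a dictionary stands in for the relational database.

```python
from dataclasses import dataclass, field, replace
from enum import Enum
from typing import Optional


class Channel(Enum):
    EMAIL = "email"
    SMS = "sms"


@dataclass(frozen=True)
class NotificationConfig:
    client_id: str
    channel: Channel
    subscribers: tuple = field(default_factory=tuple)


class ConfigStore:
    """In-memory stand-in for the relational database behind the Admin service."""

    def __init__(self) -> None:
        self._rows: dict = {}  # client_id -> NotificationConfig

    def create(self, config: NotificationConfig) -> None:
        if config.client_id in self._rows:
            raise ValueError(f"config for {config.client_id!r} already exists")
        self._rows[config.client_id] = config

    def read(self, client_id: str) -> Optional[NotificationConfig]:
        return self._rows.get(client_id)

    def update(self, client_id: str, **changes) -> NotificationConfig:
        # Raises KeyError if the client has no configuration yet.
        updated = replace(self._rows[client_id], **changes)
        self._rows[client_id] = updated
        return updated

    def delete(self, client_id: str) -> None:
        self._rows.pop(client_id, None)
```

A client registering for email delivery would then call `store.create(NotificationConfig("shop", Channel.EMAIL, ("a@example.com",)))`, and the Configuration service would serve the same record through `read`.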

The Sender service is responsible for consuming messages from the Kafka topic and sending them out to the subscribers. To determine the list of subscribers and the appropriate channel, the Sender service calls the Configuration service with the client id. To avoid calling the Configuration service for every message consumed, we can put a cache in between for fast retrieval. The cache is updated whenever the configuration changes through the Admin service.

Notifications can also be bulk notifications. In that case, sending a message to a long list of subscribers sequentially could be time-consuming and error-prone. To handle this, we can define a task that sends the message to a single subscriber and run these tasks on separate threads from a configurable thread pool. When a task fails to send a notification, we can store the failure in a database, probably a relational one. The Async service is then responsible for picking up the failed notifications and trying to send them again, updating the retry count in the database each time. Clients can also define the maximum number of retries for a particular notification.
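The Sender flow described above can be sketched as follows, under several assumptions: `fetch_config` is a hypothetical call to the Configuration service that returns a `(channel, subscribers)` pair, `senders` maps a channel name to a function that raises on failure, and a plain list stands in for the failed-notifications table. The sketch invalidates a cache entry on configuration change rather than writing the new value, which is one simple way to keep the cache fresh.

```python
from concurrent.futures import ThreadPoolExecutor


class SenderService:
    def __init__(self, fetch_config, senders, max_workers=4):
        self._fetch_config = fetch_config   # hypothetical Configuration service call
        self._senders = senders             # channel name -> send(subscriber, message)
        self._cache = {}                    # client_id -> (channel, subscribers)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self.failed = []                    # stand-in for the failures table

    def invalidate(self, client_id):
        """Called when the Admin service changes a client's configuration."""
        self._cache.pop(client_id, None)

    def _config_for(self, client_id):
        # Only hit the Configuration service on a cache miss.
        if client_id not in self._cache:
            self._cache[client_id] = self._fetch_config(client_id)
        return self._cache[client_id]

    @staticmethod
    def _try_send(send, subscriber, message):
        try:
            send(subscriber, message)
            return True
        except Exception:
            return False

    def process(self, client_id, message, max_retries=3):
        """Fan one consumed message out to all subscribers via the thread pool."""
        channel, subscribers = self._config_for(client_id)
        send = self._senders[channel]
        # One task per subscriber; map() returns results in subscriber order.
        results = list(self._pool.map(
            lambda s: self._try_send(send, s, message), subscribers))
        for subscriber, ok in zip(subscribers, results):
            if not ok:
                self.failed.append({
                    "channel": channel, "subscriber": subscriber,
                    "message": message, "retries": 0,
                    "max_retries": max_retries,
                })

    def retry_failed(self):
        """One pass of what the Async service would do over failed rows."""
        remaining = []
        for row in self.failed:
            row["retries"] += 1
            send = self._senders[row["channel"]]
            if self._try_send(send, row["subscriber"], row["message"]):
                continue
            if row["retries"] < row["max_retries"]:
                remaining.append(row)   # keep for a later pass
        self.failed = remaining
```

Failures are recorded on the consuming thread after all tasks finish, so the list needs no locking here; a real implementation writing to a database from worker threads would have to handle that concurrency itself.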

The above explanation sums up the high-level design of a Notification service. Obviously, there are many other aspects to consider while building such a service, such as security, encryption and decryption of messages, message ordering, message priority, duplicate messages, and monitoring. The above architecture should give you an idea of what the system could look like when starting out. If you find any issues in the design, please let me know in the comments.