Serverless Chat Service on AWS

Published in

Remote Serverless

5 min read · Jul 7, 2022

Building a chat application is a challenging task. Building it in a serverless manner is even more challenging. But let me make that simpler for you so that you don’t have to go through the same hardships we faced while building the chat feature at Hire the Author.

Just like any other system design process, let us start by listing the requirements.

What do we need in a chat application?

A user needs to be able to view their active conversations, sorted in descending order by the last message timestamp.
A user needs to be able to send a message to another user.
A user needs to be able to receive messages in real-time.
A user needs to be able to view the messages in a conversation, sorted in ascending order by message timestamp.
If a user is offline when someone sends them a message, the system should send an email notification with the new messages after 15 minutes.
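
The two sort orders in these requirements can be sketched as follows. This is only an illustration: the `Conversation` and `Message` classes and their field names are assumptions made for the example, not the data model used in the actual service.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    conversation_id: str
    last_message_at: int  # epoch milliseconds of the most recent message


@dataclass
class Message:
    sender_id: str
    text: str
    sent_at: int  # epoch milliseconds


def active_conversations(conversations):
    """Most recently active conversation first (descending order)."""
    return sorted(conversations, key=lambda c: c.last_message_at, reverse=True)


def conversation_messages(messages):
    """Oldest message first (ascending order), as a chat window renders them."""
    return sorted(messages, key=lambda m: m.sent_at)
```

In a real system the database would usually return the rows already sorted, so that the application does not have to sort in memory.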

As you might agree, those are some basic features that we would expect from a chat application. There can be other add-ons like being able to send messages to a group conversation, uploading files, replying to a specific message in a conversation and the list goes on.

But let's start from the basics and address the most important concern here: What is the best way to enable the user to receive messages in real-time?

WebSockets

HTTP is a request-response protocol: the client sends a request to the server to fetch a piece of information, and the server responds. In our case, we need the chat application to push information to us whenever someone sends us a message. This calls for an event-driven protocol in which the server can reach out to the client, irrespective of the browser or device running the web application. We also need the communication to be bidirectional, meaning the client should be able to send information to the server with minimal overhead. This is where WebSockets come to our rescue!

A WebSocket establishes a persistent connection between the client and the server, which enables back-and-forth communication over that single connection.

Going Serverless

Implementing WebSocket APIs in a non-serverless manner is pretty straightforward. But why not take the road less traveled and implement it in a serverless manner?

As you might know, the logic in an AWS serverless application typically lives in Lambda functions. But they are not well suited to maintaining persistent connections. A Lambda invocation ends when the handler returns, and the execution environment is frozen until the next invocation. A later invocation may reuse that warm environment or be served by a fresh one, so a function cannot hold a socket open or rely on in-memory state between invocations. Spinning up a fresh environment is also what causes the infamous cold start time.

So let us move up to the next level and see if there is someone to help us out. This brings us to our savior — The API Gateway.

API Gateway WebSocket Service

WebSocket APIs were introduced by AWS as part of API Gateway in late 2018. A WebSocket API acts as a stateful frontend for our Lambda functions. Every time a new client connects to the WebSocket API, a connectionId is generated, which allows us to uniquely identify the client and communicate with it. API Gateway keeps track of this connectionId for the lifetime of the connection, so when we want to send a message back to the client, we pass the connectionId and API Gateway delivers the message to the corresponding client session. The WebSocket API endpoint generated by API Gateway has this format:
wss://xxxxxxxx.execute-api.us-east-2.amazonaws.com
where wss denotes the secure WebSocket protocol. The client-side web application uses this endpoint to connect and send messages to the server.
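
To make the "send a message back to the client" step concrete, here is a minimal Python sketch. It assumes the caller passes in a boto3 `apigatewaymanagementapi` client, whose `post_to_connection` call takes the connectionId and the payload. The helper name `send_to_connection` and the idea of returning `False` for a stale connection are illustrative choices, not part of the original service.

```python
import json

# A real client would be created roughly like this (requires boto3; the
# stage placeholder is hypothetical and depends on your deployment):
#
#   import boto3
#   client = boto3.client(
#       "apigatewaymanagementapi",
#       endpoint_url="https://xxxxxxxx.execute-api.us-east-2.amazonaws.com/<stage>",
#   )


def send_to_connection(client, connection_id, payload):
    """Push a JSON payload to one connected client.

    Returns False when API Gateway reports that the connection no longer
    exists, so the caller can delete the stale connectionId.
    """
    try:
        client.post_to_connection(
            ConnectionId=connection_id,
            Data=json.dumps(payload).encode("utf-8"),
        )
        return True
    except client.exceptions.GoneException:
        return False
```

Passing the client in as a parameter keeps the helper easy to test with a fake client and avoids creating a new client on every call.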

In order to handle the WebSocket messages, API Gateway provides three predefined routes:

$connect: API Gateway calls the $connect route when a persistent connection between the client and the WebSocket API is being initiated. We can configure a Lambda function as the event handler for this route. The Serverless Framework configuration for this is given below. The main responsibility of this Lambda function is to handle auth and store the connectionId against the userId on the backend.

handleSocketConnect:
    name: LAMBDA_socket_connect_${self:provider.stage}
    handler: chatController.handleSocketConnect
    events:
      - websocket:
          route: $connect

$disconnect: API Gateway calls the $disconnect route when the client or the server disconnects from the API, for example when the user closes the browser tab.

handleSocketDisconnect:
    name: LAMBDA_socket_disconnect_${self:provider.stage}
    handler: chatController.handleSocketDisconnect
    events:
      - websocket:
          route: $disconnect

$default: All other WebSocket messages, including non-JSON ones, are routed to the $default route. We used the $default route to handle the chat messages, but a better approach is to define a custom route, say messages. I do not intend to complicate this blog further, so you can find more details on how to set up custom routes here and here.

defaultSocketHandler:
    name: LAMBDA_socket_default_${self:provider.stage}
    handler: chatController.defaultSocketHandler
    memorySize: 512
    events:
      - websocket:
          route: $default
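
The three handlers wired up above could look roughly like the Python sketch below. Several things here are assumptions made for illustration: the in-memory `CONNECTIONS` dict stands in for a real connection store (it would not survive across Lambda execution environments), the userId is assumed to have been resolved already and exposed at `event.requestContext.authorizer.userId` (auth is discussed later in the article), and the `to` and `text` message fields are a hypothetical message shape.

```python
import json

# Hypothetical stand-in for a persistent connection table.
# Maps connectionId -> userId. For illustration only.
CONNECTIONS = {}


def handle_socket_connect(event, context):
    """$connect: remember which user owns this connection."""
    request_context = event["requestContext"]
    # Assumption: the userId was resolved before this handler runs.
    user_id = request_context.get("authorizer", {}).get("userId")
    if user_id is None:
        return {"statusCode": 401}
    CONNECTIONS[request_context["connectionId"]] = user_id
    return {"statusCode": 200}


def handle_socket_disconnect(event, context):
    """$disconnect: forget the connection."""
    CONNECTIONS.pop(event["requestContext"]["connectionId"], None)
    return {"statusCode": 200}


def default_socket_handler(event, context):
    """$default: treat the message body as a chat message."""
    sender_id = CONNECTIONS.get(event["requestContext"]["connectionId"])
    if sender_id is None:
        return {"statusCode": 403}
    try:
        message = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        # $default also receives non-JSON messages, so reject them here.
        return {"statusCode": 400}
    if not isinstance(message, dict) or "to" not in message or "text" not in message:
        return {"statusCode": 400}
    # A real handler would persist the message and push it to each of the
    # recipient's open connections; this sketch only reports them.
    recipients = [cid for cid, uid in CONNECTIONS.items() if uid == message["to"]]
    return {
        "statusCode": 200,
        "body": json.dumps({"from": sender_id, "deliverTo": recipients}),
    }
```

Note that the sender is always derived from the connectionId, never from the message body, so a client cannot claim to be another user.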

So far we have discussed how to set up a basic WebSocket connection between the client web app and the backend using API Gateway and Lambda. But we have missed handling an important aspect — authentication. How do we map a connectionId to a user?

Handling Auth

This seemed a little tricky to figure out at first. On the $connect route, we configured an additional Lambda function called authWebsocket as the authorizer and set the identitySource to route.request.querystring.Authorizer. This makes the value of the query string parameter named Authorizer available in the authWebsocket function at event.queryStringParameters.Authorizer. The client sends the query string parameter as part of the WebSocket URL, in the format: wss://xxxxxxxx.execute-api.us-east-2.amazonaws.com?Authorizer=eyJsdhsdsiduis. The authWebsocket function then decodes the token to obtain the relevant userId and passes it down to the $connect handler, which takes care of storing the connectionId against the userId. Since API Gateway injects the connectionId at event.requestContext.connectionId for all routes, handlers for subsequent WebSocket messages do not need to authenticate again; they only need to look up the userId mapped to the connectionId.

handleSocketConnect:
    name: LAMBDA_socket_connect_${self:provider.stage}
    handler: chatController.handleSocketConnect
    events:
      - websocket:
          route: $connect
          authorizer:
            name: authWebsocket
            identitySource:
              - 'route.request.querystring.Authorizer'
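
As a rough idea of what an authorizer like authWebsocket might do, here is a Python sketch using only the standard library. The article does not say how the token is encoded, so this example assumes an HS256-signed JWT with a `userId` claim and a shared `SECRET`; all three are assumptions. It also skips expiry checks for brevity. The returned object follows the Lambda authorizer response shape (principalId, policyDocument, context), and the values in `context` are what a $connect handler would then read from `event.requestContext.authorizer`.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"replace-me"  # hypothetical signing secret, for illustration only


def _b64url_decode(segment):
    """Decode a base64url segment that may be missing its padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def decode_token(token, secret=SECRET):
    """Verify an HS256-signed JWT and return its payload, or None if invalid.

    Expiry ("exp") is not checked in this sketch.
    """
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signed_part = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signed_part, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None
        return json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None


def auth_websocket(event, context):
    """Authorizer for $connect: allow the connection if the token is valid."""
    token = (event.get("queryStringParameters") or {}).get("Authorizer")
    payload = decode_token(token) if token else None
    if not isinstance(payload, dict) or "userId" not in payload:
        # Raising "Unauthorized" is the conventional way for a Lambda
        # authorizer to make API Gateway reject the request.
        raise Exception("Unauthorized")
    user_id = str(payload["userId"])
    return {
        "principalId": user_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow",
                    "Resource": event["methodArn"],
                }
            ],
        },
        # Exposed to the $connect handler under requestContext.authorizer.
        "context": {"userId": user_id},
    }
```

In production, a maintained JWT library and a secret loaded from a secure store would be a safer choice than hand-rolled verification.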

Tada! This is what the entire flow looks like in the end:

That’s it for this one. I totally understand that this was a bit too much to process so I will cover the rest of it in another blog. There is a lot of exciting stuff coming up like how to model the DynamoDB data to support the queries efficiently and how to design a system to send the email notifications after 15 minutes. Stay tuned for more Serverless stuff coming up!

If you’ve got any questions on this blog or want to consult me to build similar stuff, feel free to get in touch with me via

Maria Zacharia

Co-Founder and CTO at Hire the Author. I work with businesses and individuals around the world to provide highly…

www.hiretheauthor.com

Happy to share and learn!