Enhancing Server Performance: A 75% Reduction in WhatsApp API Load Through GoLang Microservice

Vineet
Published in Internshala Tech · 3 min read · Jun 15, 2024

Tech Stack:

  • Main Server: CodeIgniter with PHP backend.
  • Microservice: GoLang script.

Recently at Internshala Trainings, we faced an unexpected challenge with the WhatsApp Cloud API that significantly impacted our server performance. Here’s how we addressed it and optimised our system.

Whenever we sent a message to a user, WhatsApp responded with up to three status updates. For instance, sending “Hi” could result in statuses like Sent, Delivered, Read, or even Failed. Additionally, if a user replied, it would start another cycle of messages and responses.

Internshala to WhatsApp request-response flow

To put it simply, for every message we sent, our servers could receive up to five responses. This influx of responses created a sudden load on our main website servers (EC2s), causing performance bottlenecks and reducing their ability to handle traffic efficiently.
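To make this concrete, here is a simplified Go sketch of the Cloud API webhook notification; the field names follow WhatsApp's webhook format, but anything not relevant to this article (contacts, errors, media, and so on) is left out.

```go
package whatsapp

// Simplified sketch of a WhatsApp Cloud API webhook notification.
// Only the parts relevant to this article are modelled.
type WebhookPayload struct {
	Object string  `json:"object"` // "whatsapp_business_account"
	Entry  []Entry `json:"entry"`
}

type Entry struct {
	ID      string   `json:"id"`
	Changes []Change `json:"changes"`
}

type Change struct {
	Field string `json:"field"` // "messages"
	Value Value  `json:"value"`
}

type Value struct {
	Statuses []Status  `json:"statuses,omitempty"` // updates for messages we sent
	Messages []Message `json:"messages,omitempty"` // replies from users
}

// Status arrives for each state change of an outbound message:
// sent, then delivered, then read (or failed).
type Status struct {
	MessageID   string `json:"id"`
	Status      string `json:"status"`
	Timestamp   string `json:"timestamp"`
	RecipientID string `json:"recipient_id"`
}

// Message is an inbound reply from a user, which in turn starts
// its own cycle of statuses once we respond to it.
type Message struct {
	From      string `json:"from"`
	ID        string `json:"id"`
	Timestamp string `json:"timestamp"`
	Type      string `json:"type"` // e.g. "text"
	Text      *Text  `json:"text,omitempty"`
}

type Text struct {
	Body string `json:"body"`
}
```

Because the sent, delivered, and read updates happen at different moments, each one typically arrives as its own webhook call, which is exactly what multiplied our inbound request volume.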

Throughout the day, when traffic was steady, everything ran smoothly. However, in the evening, when most of our cron jobs run, we saw a sudden spike in the CPU utilisation of our main traffic-handling server. We run all our cron jobs on a separate internal server, so on one end, cron jobs were sending thousands of WhatsApp communications to users; on the other end, our main server was receiving roughly three times as many webhook requests, each of which it had to process based on the response type, store in AWS RDS, and reply to. This not only caused a sudden spike in CPU usage but also increased the IOPS on the RDS instance, consuming bandwidth that should have gone to serving our regular traffic. Left unchecked, this could even have forced us to launch an additional EC2 instance to handle the extra load.

Now you might wonder why this is such a problem. Our entire web application runs on PHP, which has its own drawbacks in how much traffic it can handle and in the resources consumed by the separate PHP-FPM processes that serve each request.

To address this issue, we decided to reroute the webhook responses from the WhatsApp Cloud API to a microservice. This microservice was configured to run a GoLang script. Its primary task was to receive all the responses from WhatsApp, handle the status responses itself (since our application logic doesn't depend on them), and forward all other responses to our main server EC2 instance.

Internshala to WhatsApp to Microservice request-response flow

Here’s how it works:

  1. Microservice Handling Status Responses: The microservice processes all status responses and pushes them to an S3 data lake via AWS Kinesis Firehose instead of storing them in AWS RDS, which cut our costs at the RDS level. If we ever need to check the status of a message sent to a user, we can simply run a query on AWS Athena and retrieve the data.
  2. Forwarding Incoming Messages: Only the incoming messages are forwarded from the microservice to our main server, which can now handle a user's reply and respond much more quickly. This significantly reduced the load reaching our main traffic-handling servers. A simplified sketch of this routing logic follows the list.
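Below is a minimal sketch, in Go, of how such a routing handler could look. The stream name, port, and main-server URL are placeholders, and webhook signature verification, retries, and batching are omitted; the point is only to show the split between statuses (pushed to Kinesis Firehose) and incoming messages (forwarded to the main server).

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"io"
	"log"
	"net/http"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/firehose"
	"github.com/aws/aws-sdk-go-v2/service/firehose/types"
)

// Placeholder names for illustration; the real stream and endpoint differ.
const (
	statusStream  = "whatsapp-status-stream"                   // Firehose delivery stream feeding the S3 data lake
	mainServerURL = "https://main-server.example.com/whatsapp" // main PHP (CodeIgniter) webhook endpoint
)

// Minimal slice of the WhatsApp Cloud API webhook payload: statuses for
// messages we sent, and messages users sent to us.
type webhookPayload struct {
	Entry []struct {
		Changes []struct {
			Value struct {
				Statuses []json.RawMessage `json:"statuses"`
				Messages []json.RawMessage `json:"messages"`
			} `json:"value"`
		} `json:"changes"`
	} `json:"entry"`
}

var fh *firehose.Client

func main() {
	cfg, err := config.LoadDefaultConfig(context.Background())
	if err != nil {
		log.Fatalf("load AWS config: %v", err)
	}
	fh = firehose.NewFromConfig(cfg)

	http.HandleFunc("/webhook", handleWebhook)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handleWebhook(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "read error", http.StatusBadRequest)
		return
	}

	var payload webhookPayload
	if err := json.Unmarshal(body, &payload); err != nil {
		http.Error(w, "bad payload", http.StatusBadRequest)
		return
	}

	forward := false
	for _, entry := range payload.Entry {
		for _, change := range entry.Changes {
			// Status updates (sent/delivered/read/failed) go to the S3
			// data lake via Kinesis Firehose instead of AWS RDS.
			for _, status := range change.Value.Statuses {
				_, err := fh.PutRecord(r.Context(), &firehose.PutRecordInput{
					DeliveryStreamName: aws.String(statusStream),
					// Newline-delimit records so the files landing in S3 are easy to query.
					Record: &types.Record{Data: append(status, '\n')},
				})
				if err != nil {
					log.Printf("firehose put failed: %v", err)
				}
			}
			// Remember whether this notification carries actual user messages.
			if len(change.Value.Messages) > 0 {
				forward = true
			}
		}
	}

	// Only notifications containing user messages reach the main PHP server;
	// pure status callbacks stop at the microservice.
	if forward {
		resp, err := http.Post(mainServerURL, "application/json", bytes.NewReader(body))
		if err != nil {
			log.Printf("forward to main server failed: %v", err)
		} else {
			resp.Body.Close()
		}
	}

	// WhatsApp expects a quick 200; downstream work should not delay it.
	w.WriteHeader(http.StatusOK)
}
```

Pushing each status record to Firehose lets the delivery stream batch and land them in S3, where Athena can query them later, while the main PHP server only ever sees real user messages.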

In conclusion, by offloading the status responses to the microservice and forwarding only the necessary incoming messages to our main servers, we achieved a substantial reduction in load. With this simple rerouting, we cut the load on our main server from WhatsApp's webhook responses by up to 75%.

Go-Live Date: 29th May 2024

Thank you for reading! I hope you found this article helpful. If you have any questions or suggestions, please leave a comment. Your feedback helps me improve.

Don’t forget to follow for more updates.
