How we started handling 4x traffic at the same cost

Internshala is an internship platform that has grown rapidly over the past few years. To visualise the scale of this growth: in 2015 we had 26,000 internships on our platform, while in 2018 the number rose to 190,000. Similarly, the number of students on our platform grew 5 times.

Overall traffic on our servers rose, but the average traffic handled per server dipped. By the end of September 2018, we were handling around 270 users per server, compared to 500 in 2016–2017.

Internships (red), students (blue), and average traffic per server (yellow)

“270 users per server.” It didn’t concern us at first; we believed that new features would consume more resources, so the fall seemed natural. But somewhere, it got us thinking. What if the code and queries were not optimized? What if code written earlier was backfiring now? Yes, our site was working fine, or so it seemed. But we needed satisfactory answers to all these questions. So we decided to break our solution down into phases.

Phase 1 — Upgrading PHP

First, we decided to upgrade all our technologies to their latest versions. Various websites claimed that upgrading PHP doubled their requests per second, i.e., they were handling approximately 2x the traffic. So we upgraded from PHP 5.6 to PHP 7.2, but our traffic per server rose by only 5%.
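For context, the kind of comparison behind those "2x" claims can be reproduced with a trivial micro-benchmark. This is an illustrative sketch, not our actual test: run the same script under both PHP binaries and compare the timings.

```php
<?php
// Illustrative micro-benchmark (hypothetical, not the one we ran):
// execute the same CPU-bound loop under PHP 5.6 and PHP 7.2 and
// compare wall-clock times.
$start = microtime(true);

$sum = 0;
for ($i = 0; $i < 10000000; $i++) {
    $sum += $i % 7;
}

printf("PHP %s: %.3f seconds\n", PHP_VERSION, microtime(true) - $start);
```

Synthetic loops like this tend to show PHP 7's big engine-level wins; real applications, as we found, are usually bottlenecked elsewhere.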

Phase 2 — Reviewing Code

Reviewing code is one of the most important aspects of development. We do it every time a module is pushed to production. It helps us find bugs and loopholes, and ensures that the code is written in the best way possible.

For phase 2, we decided to review the code thoroughly. Since our codebase had grown many times over, we first needed an entry point. We analysed our data from Google Analytics and picked the top 10 most visited URLs.

After reviewing those sections, we found quite a lot of discrepancies: the code was not modular in certain places, we were firing unnecessary queries, and we had some unoptimized loops and dead code.
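To give a flavour of the fixes, here is a hypothetical before/after of the kind of unnecessary query such a review surfaces: a query fired once per item inside a loop, replaced by a single batched query. The connection details and schema below are illustrative, not our actual code.

```php
<?php
// Hypothetical example: counting applications per internship.
// Connection and table names are illustrative.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$internshipIds = [101, 102, 103];

// Before: N queries for N internships, fired inside a loop.
$counts = [];
foreach ($internshipIds as $id) {
    $stmt = $pdo->prepare('SELECT COUNT(*) FROM applications WHERE internship_id = ?');
    $stmt->execute([$id]);
    $counts[$id] = (int) $stmt->fetchColumn();
}

// After: one query returning all counts at once.
$in = implode(',', array_fill(0, count($internshipIds), '?'));
$stmt = $pdo->prepare(
    "SELECT internship_id, COUNT(*) AS cnt
     FROM applications
     WHERE internship_id IN ($in)
     GROUP BY internship_id"
);
$stmt->execute($internshipIds);
$counts = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
```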

By fixing all these things, we saw a dip in CPU utilisation on both RDS and EC2: RDS dipped by 7%, whereas EC2 dipped by 6%. Collectively, our traffic handling rose by 5–10%.

Phase 3 — Enabling OPcache

We had upgraded PHP and reviewed the code, which together resulted in a 10–15% increase in traffic handling. Next, we needed to improve the performance of PHP itself, so we enabled OPcache.

Using OPcache should not be confused with data caching. OPcache saves the compiled script (bytecode) in the server’s memory, which saves the CPU time that would otherwise be spent recompiling PHP scripts on every single request.
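If you want to confirm OPcache is actually active after enabling it, a small check like the one below helps. This is an illustrative sketch; the php.ini values in the comment are commonly suggested defaults, not necessarily the ones we used.

```php
<?php
// Sanity check that OPcache is active (illustrative).
// Commonly suggested php.ini settings (values are not ours):
//   opcache.enable=1
//   opcache.memory_consumption=128
//   opcache.max_accelerated_files=10000
//   opcache.validate_timestamps=0  ; skip stat() checks; reset cache on deploy

if (function_exists('opcache_get_status') && ($status = @opcache_get_status(false))) {
    printf(
        "OPcache active, %d scripts cached\n",
        $status['opcache_statistics']['num_cached_scripts']
    );
} else {
    echo "OPcache is not active\n";
}
```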

And to our surprise, it worked like a charm! Our traffic handling rose by almost 3 times.

Phase 4 — Enabling Memcached

We use Doctrine as our ORM. Doctrine relies heavily on annotations and maintains the object-database mapping: every table in Doctrine is an object with some associated annotations. Every time a Doctrine model (table) was initialized, all the related annotations were read and processed, irrespective of whether it was the first or the nth time.

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
OPcache is for accelerating code access. Memcached is for accelerating data access.
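With Doctrine 2.x, one way to wire this up is to set Memcached as the metadata and query cache implementation. The sketch below assumes the doctrine/orm 2.x Setup API and the Memcached PHP extension; the paths and credentials are illustrative, and our actual bootstrap differs.

```php
<?php
// A sketch of pointing Doctrine 2.x's caches at Memcached (illustrative).
use Doctrine\Common\Cache\MemcachedCache;
use Doctrine\ORM\EntityManager;
use Doctrine\ORM\Tools\Setup;

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$cache = new MemcachedCache();
$cache->setMemcached($memcached);

// With these caches set, annotations are parsed once and the resulting
// mapping metadata is reused across requests instead of being rebuilt
// every time a model is initialized.
$config = Setup::createAnnotationMetadataConfiguration(
    [__DIR__ . '/src/Models'], // entity paths (illustrative)
    false                      // production mode
);
$config->setMetadataCacheImpl($cache);
$config->setQueryCacheImpl($cache);

$entityManager = EntityManager::create(
    ['driver' => 'pdo_mysql', 'host' => 'localhost',
     'dbname' => 'app', 'user' => 'user', 'password' => 'secret'],
    $config
);
```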

By enabling Memcached, we started caching the mapped data of Doctrine models, which resulted in a further dip in CPU utilization. As a result, our traffic handling graph saw a spike: our traffic handling had risen 4-fold over the last 6 months.

From Oct 2018 to Feb 2019, our traffic handling rose by 4 times

Conclusion

We had always believed that our code was written in the best way, but it was not. We worked on several things and saw significant improvements. It was those few questions that made us realize there was still a lot of scope for improvement.