PHP7, CakePHP & High Traffic Websites on EC2
Recently I oversaw a project to move two large, high-traffic sites from PHP5 on Apache to PHP7 on nginx. Once the task was completed we saw extraordinary results: enormous increases in speed coupled with equally enormous drops in the number of servers we needed.
But before we reached the finish line there were plenty of obstacles to overcome and I promised myself that when we were finally done I would write a post explaining all that we had faced and how we eventually triumphed.
So how big are these sites?
According to Google Analytics, hundreds of millions of page views per month. Each.
According to Amazon stats (i.e. including bots, spiders etc), we are doing almost a billion requests per month. Each.
So we set about benchmarking performance:
- PHP5 and Apache (current set up)
- PHP5 and nginx
- PHP7 and nginx
What we found was that PHP7 was not just fast, it was ridiculously fast, and that nginx seemed to outperform Apache on our current application stack when it was bombarded with a sudden wave of traffic.
So, no problems, then. Let’s crack on, we thought, we’ll be done by teatime.
Step 1 — install nginx
So for the coding team, this bit was easy — just hand over to the tech dept, sit back and relax.
We had a few small issues when it came back, mainly to do with converting our mod_rewrite rules, but the general scheme to follow is here.
Step 2 — Install PHP7
We followed a few steps to install PHP7 on our Ubuntu nginx box, and added the following modules:
Step 3 — update CakePHP
To run on PHP7, CakePHP needs to be at least version 2.8. We were running 2.4, so there were a few steps needed here:
- replace all String:: function calls with their renamed equivalents (String is a reserved word in PHP7)
plus a few more changes that can be found in the helpful migration guides.
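As an illustration of the String:: change: in CakePHP 2.7 and later the String utility class is named CakeText. Below is a minimal sketch of how the rename could be automated, not the process we actually used. The function name renameStringCalls is hypothetical, and it assumes a code base without namespaces (as is usual for CakePHP 2.x). Run something like this on a copy of the code and review the diff before committing.

```php
<?php
// Hypothetical helper: rewrite CakePHP 2.x `String::` static calls to `CakeText::`.
function renameStringCalls(string $source): string
{
    // Match `String::` only when it is not preceded by an identifier character
    // or a backslash, so a class such as `MyString::` is left alone.
    $source = preg_replace('/(?<![A-Za-z0-9_\\\\])String::/', 'CakeText::', $source);

    // Update the matching App::uses() declaration, keeping the original quote style.
    return preg_replace(
        '/App::uses\(\s*([\'"])String\1\s*,/',
        'App::uses(${1}CakeText${1},',
        $source
    );
}

// Small demonstration on an in-memory snippet.
$before = "App::uses('String', 'Utility');\n"
    . "\$id = String::uuid();\n"
    . "\$x = MyString::make();";

echo renameStringCalls($before), PHP_EOL;
```

To apply it to real files, the same function can be wrapped in a loop over file_get_contents() and file_put_contents() for each PHP file under the app directory.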
But so far we were flying. Next stop paradise!
Step 4 — internal testing
Testing didn’t throw up any unexpected errors. There were a few here to do with the new CakePHP and a few there to do with PHP7, but generally our stress tests and unit tests ran without errors. Whether it was working copies, the development server or the preprod servers, the new system was flying and we were itching to try it out. It was all going so well.
We decided to do some live testing.
Step 5 — limited live testing
Our plan was to launch one server and stick it into the live load balancer.
Our current set up was c4.xlarge instances ($0.226 per hour), and we were running around 800–900 unique visitors in GA realtime per server. So if we were getting 10,000 uniques in GA realtime, we would be running around 12–13 servers.
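The server count above is just a division rounded up. Here is a small sketch of that arithmetic, using the figures from this post. The function name serversNeeded is made up for the example, and the hourly cost is a rough on-demand estimate that ignores everything except the instance price.

```php
<?php
// Servers needed to carry a given number of realtime uniques.
// Rounded up, because a fraction of a server still means one more instance.
function serversNeeded(int $realtimeUniques, int $uniquesPerServer): int
{
    return (int) ceil($realtimeUniques / $uniquesPerServer);
}

$uniques = 10000;
$hourlyRate = 0.226; // c4.xlarge price per hour quoted above

foreach ([900, 800] as $perServer) {
    $servers = serversNeeded($uniques, $perServer);
    printf(
        '%d uniques per server: %d servers, roughly $%.2f per hour' . PHP_EOL,
        $perServer,
        $servers,
        $servers * $hourlyRate
    );
}
```

With 10,000 uniques this gives 12 servers at 900 per server and 13 servers at 800 per server, which matches the range we were seeing.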
We were going to stick one of the PHP7 servers in there with health checks but no scaling — meaning that a server could be terminated but not scaled up/down.
Adding the PHP7 server in was a great moment. Within minutes it was running at 80% CPU and the PHP5 pool was able to drop 3 servers. This one server seemed to be handling at least 3x the traffic that each of the others could, and the load balancer favoured it so much that it was running at the edge of the limits we had deemed acceptable.
We ran it for an hour or so and took it down and looked at the logs to search for anything untoward.
Nothing much showed up, so I decided on a bigger test the next day.
Step 6 — longer live testing and disaster
As soon as I got to my desk in the morning, I launched a PHP7 instance and kept it in the load balancer, and it started off by working like a charm.
At around midday, we had calls from the client saying that there was an intermittent problem on the site — the homepage was missing for mobile users.
I looked at AWS ELB monitoring and the PHP7 server was getting a lot of 500 errors, so I shut the server down and examined the logs.
We were getting a ton of errors before the server crashed. And not normal errors by any means.
Say you have a model called Foobar. We had errors like
Model foo^bar does not exist
Model Fo¨bar does not exist
Essentially, at some point after running smoothly, the server started jumbling perfectly good code, corrupting and truncating controller names, model names, table names, in fact almost any name at all. And once it started failing it didn’t stop. It would never recover after the first 500 error.
There followed a period of around ten days of launching instances, watching them closely, seeing them fail. Again and again and again.
But then we tried something different.
We launched another PHP7 server on the other site and it stayed up for days.
We started step by step, comparing differences, grasping at straws until one day, I made a very strange discovery.
Step 7 — that Eureka moment
After hours of development time, I discovered that view caching wasn’t saving files on the working site.
CakePHP has a number of different built-in caches. The models and persistent caches store data about the structure of the MySQL tables. Any data from the DBO can also be cached if you choose, and so can views or view templates.
Now, the models, persistent and data caches can use various storage backends, such as flat files, APC, APCu, memcached, Redis etc.
View files, however, can only be stored (at the time of writing) as flat files. And if the directory doesn’t exist, we assumed it would create it, or at least complain when debugging was turned on.
Now, at some point during the second (working) site’s Git history, someone had deleted the /app/tmp/cache/views/ folder and, without realising, we had been running without a view cache for years.
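Since a missing view cache directory went unnoticed for years, a quick check at deploy time would have told us which state each site was in. This is a minimal sketch, not part of CakePHP. The function name describeCacheDir is hypothetical, and the path assumes the script sits next to the app directory.

```php
<?php
// Report whether a cache directory exists and can be written to.
function describeCacheDir(string $dir): string
{
    if (!is_dir($dir)) {
        return 'missing';
    }

    return is_writable($dir) ? 'writable' : 'read-only';
}

// Path from this post, relative to this script; adjust to your application root.
$dir = __DIR__ . '/app/tmp/cache/views';

echo $dir, ': ', describeCacheDir($dir), PHP_EOL;
```

A result of "missing" or "read-only" means view cache files cannot be written there.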
Since we had tried everything else on our main (failing) site, we supposed that this was the only thing left to try. We deleted the /app/tmp/cache/views/ directory and launched.
The site stayed up for 24h, then 48h, all without a hitch.
It seems that something, either in the IO processes of PHP7 or within the CakePHP framework, cannot cope with writing and reading flat files under high-volume traffic. It then produces gobbledegook code that gets cached forever.
The only solution we found is to turn off view caching, but don’t worry: your site will fly on the new infrastructure regardless.
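We turned view caching off by removing the directory, but CakePHP 2.x also has a configuration switch for it. The fragment below is a sketch of that alternative rather than what we deployed. It assumes a standard app/Config/core.php, where view caching is only active when Cache.check is set to true.

```php
<?php
// app/Config/core.php (fragment)

// View caching is only active when this is true.
// Setting it to false (or leaving the line commented out, as in the
// stock core.php) stops cached view files being written and served.
Configure::write('Cache.check', false);
```

The models, persistent and data caches are configured separately through Cache::config(), so they are not affected by this setting.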
We reduced the compute capacity of our autoscaling master by half and the site was still twice as fast, with fewer servers than before on the old PHP5 set up.