Wikia’s Road To PHP7

Wikia’s MediaWiki stack is now running on PHP7, and we saw a huge improvement in response time and resource usage. Let’s take a closer look at how we got there.

Maciej Brencz
Legacy Systems Diary
Oct 17, 2016

Introduction to our infrastructure

One of the main reasons we decided to switch from PHP 5.6 to PHP7 was its improved performance and resource usage (real-life examples from other companies were pretty promising).

When you want to read one of the 44 million articles that are collaboratively edited across 360k communities, your browser gets the response from one of our 250 Apache nodes. Every day, the Apache nodes in our primary datacenter handle 190 million HTTP requests (that’s 134k every minute!). At such a scale, saving even a few milliseconds of response time or dozens of megabytes of memory on each request means a lot.

PHP7 — improvements

One of the biggest changes introduced in PHP7 is a major refactoring of the hashtable implementation, which resulted in more efficient memory handling (the authors claim that the number of memory allocations dropped dramatically).

Engine exceptions are meant to make error handling in your application easier. Many of the existing fatal and recoverable fatal errors were replaced by exceptions. That allows developers to handle them gracefully: log them, run fallback logic, etc.
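
To make the idea concrete, here is a minimal sketch of catching an engine exception. The function name `renderTitle` and the fallback value are made up for illustration and are not taken from Wikia's code base; the only behavior relied upon is that PHP 7 throws an `\Error` when a method is called on `null`.

```php
<?php
// Hypothetical helper (not from Wikia's code): returns an article title,
// falling back to a default when the engine throws.
function renderTitle($article) {
    try {
        // Calling a method on null was an uncatchable fatal error in PHP 5;
        // PHP 7 throws an \Error that can be caught like any exception.
        return $article->getTitle();
    } catch (\Error $e) {
        // Log the problem and run the fallback logic instead of dying.
        error_log('Engine error: ' . $e->getMessage());
        return 'Untitled';
    }
}

echo renderTitle(null), "\n";
```

Catching `\Throwable` instead would cover both engine errors and regular exceptions; which one is appropriate depends on how much the caller can really recover from.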

PHP7 — challenges

During every PHP upgrade we take the approach of porting all deprecated features to their replacements instead of waiting for them to be removed. Hence, we had already moved away from the mysql extension in favor of mysqli by backporting changes from MediaWiki core. We still had a few places where direct calls to mysql_* functions were made (these were found by the php7mar tool described below).

The change that looked particularly scary was the uniform variable syntax. Given the size of our repository, it’s impossible to go through every single file and check its syntax compatibility. Fortunately, PHP 7 Migration Assistant Report (php7mar) allowed us to identify all the places that required a change in variable access syntax. This tool helped us a lot during the migration (not only with the uniform variable syntax)!
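
As an illustration of what the uniform variable syntax changes, consider the sketch below. The variable names are invented for this example; the point is that an expression such as `$config->$field['name']` is parsed differently by PHP 5 and PHP 7, while explicit braces keep the old meaning on both versions.

```php
<?php
// Illustrative data only; the names are not from Wikia's code base.
$config = new stdClass();
$config->title = 'Main Page';
$field = ['name' => 'title'];

// Ambiguous form: $config->$field['name']
//   PHP 5 reads it as $config->{$field['name']}
//   PHP 7 reads it as ($config->$field)['name']
// Explicit braces give the PHP 5 interpretation on both versions.
$value = $config->{$field['name']};

echo $value, "\n";
```

Adding the braces is a purely mechanical fix, which is why a static report of the affected lines is enough to drive this part of a migration.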

A php7mar run on our main app repository analyzed 2.6 million lines of PHP code in 12k files. It flagged 710 lines as needing a fix, 234 of which were critical (calls to deprecated functions, the new operator used with a reference, places with variable interpolation, etc.). To be honest, we expected much bigger numbers. The flagged lines were grouped by feature and assigned to the relevant engineering teams. And, taking a “while we’re here” approach, we removed 43k lines of code that was no longer executed (mostly outdated PEAR libraries).

We were also using a few custom PHP extensions that had not been ported to PHP7:

  • xhprof, which we used to gather PHP profiling data, was replaced by tideways (which provides profiler data in a format compatible with xhprof)
  • test-helpers + runkit, used to mock classes and functions in Wikia’s unit tests, were replaced with uopz
  • the sass extension that renders CSS assets was ported by our developers to support PHP7

All code changes required for PHP7 that were not compatible with the previous version were wrapped in if statements and merged to our main branch. Hence our code could be tested and run using both versions of PHP.
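
A version guard of this kind can be sketched as follows. The function name and the choice of profiler extension as the example are assumptions made for illustration; the source does not show the actual conditions used. `PHP_VERSION_ID` is a built-in constant, and 70000 corresponds to PHP 7.0.0.

```php
<?php
// Hypothetical example: pick the profiler extension by PHP version.
// Written without PHP 7-only syntax so the same file parses on PHP 5.6 too.
function profilerExtension($versionId = PHP_VERSION_ID) {
    if ($versionId >= 70000) {
        // PHP 7 branch: tideways replaces xhprof.
        return 'tideways';
    }
    // PHP 5.x branch: keep using the old extension.
    return 'xhprof';
}

echo profilerExtension(), "\n";
```

The guard only helps when both branches are syntactically valid on both versions, so PHP 7-only syntax such as scalar type hints has to stay out of shared files until the old version is gone.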

Rollout

We split PHP7 rollout into several phases.

Phase #1: back in June, when we were code complete, the QA team ran the regression test suite on a sandbox machine that had been upgraded to PHP7. No regressions were found, so we were ready to move forward.

Phase #2: we switched all dev machines to the new PHP. That included the QA machines that execute unit tests for every pull request submitted on GitHub. The unit test suite execution times under PHP 5.6 and PHP7 were as follows:

Time: 7.44 seconds, Memory: 112.00Mb

Time: 36.26 seconds, Memory: 212.25Mb

Do we need to point out which line is for PHP7?

Phase #3: at the beginning of July we switched all Apache nodes in our backup datacenter to PHP7. We mirror traffic there (to keep its caches warm), which gave us the opportunity to check our PHP7-related changes against production traffic. No problems were identified at that stage. Additionally, the average response time dropped by 50%!

Phase #4: a few days before the final switch, we moved our pre-release testing environment to PHP7 to perform the final QA pass.

Phase #5: July 20th was The Day — we switched all remaining production nodes to PHP7. There were no casualties. Our site performed normally. Well, we should say even better than normal. Why? Take a look below.

Graphs

We can’t leave you without showing some shiny graphs.

A few days before switching all Apache nodes in our primary datacenter, we switched a few of them. One of them reports its performance to NewRelic. The average response time dropped from 235 ms to less than 100 ms! Yeah, we had to double-check the figures as well, but that’s how things were after the switch. After a few hours it rose to ~125 ms, but that’s still pretty impressive. As for the median value, it dropped from 130 to 55 ms (57%).

Average response time reported by NewRelic — from 235 to less than 125 ms

The more PHP processing is involved in handling a response, the bigger the response time improvement. For instance, requests that generate concatenated and minified asset packages are now faster by ~70%. The performance boost is obviously smaller when more database queries and HTTP calls are involved. However, the MediaWiki bootstrap code is executed in all cases, so there’s still some improvement.

After the full PHP switch the overall median response time dropped by 60%.

Median value of response time dropped by 60%

The node that executes offline tasks (such as statistics calculation and article refreshing) provided us with an interesting graph as well. Memory consumption under PHP7 is much lower: it went from 28 GB to 8 GB (that’s a 71% drop!).

Memory consumption on Apache nodes — dropped by 71%

Each Apache node is an LXC container running inside a vmhost (which hosts only Apache boxes). Vmhost load dropped as well, from up to 20 to ~7.5 (that’s a ~62% improvement!).

Apache nodes load — dropped by 62%
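
The percentage drops quoted in this section follow directly from the before and after figures. The short sketch below only redoes that arithmetic; the function name `percentDrop` is ours, and the inputs are the numbers given in the text.

```php
<?php
// Relative drop between a "before" and an "after" measurement, in percent.
function percentDrop($before, $after) {
    return ($before - $after) / $before * 100;
}

// Before/after figures quoted in the text.
$measurements = [
    'median response time (ms)' => [130, 55],
    'offline node memory (GB)'  => [28, 8],
    'vmhost load'               => [20, 7.5],
];

foreach ($measurements as $label => list($before, $after)) {
    printf("%s: %.1f%% drop\n", $label, percentDrop($before, $after));
}
```

The computed values (57.7%, 71.4% and 62.5%) are close to the rounded figures of 57%, 71% and ~62% quoted in the text.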

Savings

As a result of the PHP7 migration we were able to reduce the number of Apache nodes that handle our traffic by 30%! And we still have more breathing space than before the switch, but we want to keep it to handle the ever-increasing traffic. This adds up to almost a million dollars in savings over the next 3 years!

If you’re still using PHP 5.6, now is the right time to consider switching to PHP7.

— Artur Sitarski (Ops Team) and Maciej Brencz (Platform Team)

Originally published at engineering-blog.wikia.com on October 17, 2016.
