A simplified Nginx-Apache combo with WordPress support

It looks like I have neglected to write a new article in quite a while! Shame on me. But, thanks to a website outage, I’ve finally got some more good stuff to share with you.

My previous Nginx configuration had become a nightmare to maintain, and WordPress had become slower because Apache’s children were being killed by the OOM killer. This was due to a misguided PHP cache (PHP XCache, to be precise) that decided to take every available bit of memory on my system, despite max-requests per child being set low (the cache has since been purged).

This, along with my endeavors in seeking the fastest solution to everything and the introduction of new Cloud servers by OVH, led me to today’s article.

Which is faster — Varnish or Nginx?

The first thing I wanted to do was make all the caching happen before anything gets pushed through to Apache. This is because I wanted to eliminate both PHP XCache and the WordPress Super Cache plugin I was using, increasing WordPress compatibility while decreasing complexity.

At first I thought about using Varnish, either as the sole front-end or in between Nginx and Apache (more on the reasoning later). I had also gotten my hands on OVH’s Cloud servers while they were still in “alpha”, and used them as the base system for building a pool of web servers.

The following tests were all performed on those Cloud servers: the mC 256 (256 MBytes of guaranteed RAM, 2 GBytes of total memory with the excess swapped to SSDs), 4 CPU cores and 5 GBytes of storage space. The OS is Ubuntu 10.04 LTS. The output of /proc/cpuinfo is as follows (one of four identical cores, for brevity):

processor       : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
stepping : 5
cpu MHz : 1995.000
cache size : 4096 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
bogomips : 3990.00
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

The stock install of Apache performed as follows on a simple "Hello World" PHP script, using "ab -c 100 -n 100000 http://host/":

Concurrency Level:      100
Time taken for tests: 29.548 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 25009500 bytes
HTML transferred: 3901482 bytes
Requests per second: 3384.27 [#/sec] (mean)
Time per request: 29.548 [ms] (mean)
Time per request: 0.295 [ms] (mean, across all concurrent requests)
Transfer rate: 826.55 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 12 39.1 12 1960
Processing: 9 18 49.6 14 2036
Waiting: 1 15 45.9 12 1966
Total: 14 29 65.9 26 2159
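For reference, the benchmark target was nothing more exotic than a one-line PHP script. A minimal sketch of such a target (the file name and path here are my own assumptions, not taken from the original setup):

```shell
# hypothetical stand-in for the "Hello World" benchmark target
mkdir -p /tmp/www
cat > /tmp/www/index.php <<'EOF'
<?php echo 'Hello World'; ?>
EOF
# the benchmark itself was then run against the web server:
#   ab -c 100 -n 100000 http://host/
cat /tmp/www/index.php
```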

With Varnish in front of Apache, things really started to look good:

Concurrency Level:      100
Time taken for tests: 13.489 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 28315282 bytes
HTML transferred: 1100594 bytes
Requests per second: 7413.64 [#/sec] (mean)
Time per request: 13.489 [ms] (mean)
Time per request: 0.135 [ms] (mean, across all concurrent requests)
Transfer rate: 2049.99 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 6 2.2 6 71
Processing: 2 7 1.9 7 70
Waiting: 1 6 2.0 5 66
Total: 3 13 3.1 13 81

At 2.19x what Apache can send out on its own, that's a mighty impressive improvement and Varnish deserves kudos. But at 1 GByte of RAM for caching, would it really be more efficient and quicker than Nginx? The following results tell ...

Concurrency Level:      100
Time taken for tests: 9.438 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 27706648 bytes
HTML transferred: 5201248 bytes
Requests per second: 10595.55 [#/sec] (mean)
Time per request: 9.438 [ms] (mean)
Time per request: 0.094 [ms] (mean, across all concurrent requests)
Transfer rate: 2866.87 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 4 1.0 4 56
Processing: 2 6 9.7 5 253
Waiting: 0 5 9.7 5 253
Total: 5 9 9.7 9 257

… a different story. Though this is not some scientific research that should be taken at face value, I personally found the difference rather significant, especially since Nginx never used more than 60 MBytes of RAM and relied mostly on the system’s file caching. 1.43x faster than Varnish, 3.13x faster than Apache by itself. That’s even more impressive!
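For the record, those relative speeds follow directly from the three requests-per-second figures measured above:

```shell
# recompute the relative speeds from the measured requests/sec
awk 'BEGIN {
    apache  = 3384.27   # stock Apache
    varnish = 7413.64   # Varnish in front of Apache
    nginx   = 10595.55  # Nginx caching in front of Apache
    printf "Varnish vs Apache: %.2fx\n", varnish / apache
    printf "Nginx vs Varnish:  %.2fx\n", nginx / varnish
    printf "Nginx vs Apache:   %.2fx\n", nginx / apache
}'
```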

A little Varnish quirk on Ubuntu

Again, and I can’t say this often enough, these are merely the numbers obtained on my system — your mileage may vary. Varnish is definitely a worthy contender — the one issue I encountered on Ubuntu was that Varnish crashed when attempting to test with more than 1000 concurrent connections. That’s not supposed to happen in a production environment!

The culprit seems to be the user account’s “open file descriptors” limit. Sockets also count towards this value, and when Varnish hit the limit it died rather ungracefully. You can manually resolve it by using ulimit:

ulimit -n 65535

But you are better off using the /etc/security/limits.conf file. It is well documented, so it shouldn’t be too difficult to figure out. I’ll continue with my blog…
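As a sketch, the relevant limits.conf entries would look something like this (the user name and limit value are assumptions; adjust them to your own Varnish setup):

```
# /etc/security/limits.conf (hypothetical entries; adjust user and value)
varnish    soft    nofile    65535
varnish    hard    nofile    65535
```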

The Configuration

So I have decided to keep Nginx as the front-end for Apache, but this time — unlike previously — activate Nginx’s caching. Doing it here, rather than working with caching plugins and a plethora of other band-aids, keeps the whole configuration clean and simple. Apache can be left alone to run as it normally does, with no special trickery. The only exception is a memcache store, because the database is located on a different server and linked through a VPN.

First I installed Nginx, Apache, PHP5 and Memcache through the usual channels, as follows:

apt-get install nginx libapache2-mod-php5 memcached php5-mysql php5-curl php5-gd php5-idn php-pear php5-imagick php5-imap php5-mcrypt php5-memcache php5-mhash php5-ming php5-ps php5-pspell php5-recode php5-snmp php5-sqlite php5-tidy php5-xmlrpc php5-xsl php5-json

Update Nginx

The Nginx version provided by the Ubuntu repository is 0.7.65. However, a feature introduced in version 0.7.66/stable, proxy_no_cache, will come in handy for simplifying the configuration. Version 0.7.67 also fixed a small issue, which mainly concerns Windows machines but is good to have patched regardless. So I’ve compiled Nginx to the latest stable version as follows:

apt-get install libc6 libpcre3 libpcre3-dev libpcrecpp0 libssl0.9.8 libssl-dev zlib1g zlib1g-dev lsb-base
wget http://www.nginx.org/download/nginx-0.7.67.tar.gz
tar -xf nginx-0.7.67.tar.gz
cd nginx-0.7.67
./configure \
  --user=www-data \
  --group=www-data \
  --sbin-path=/usr/sbin \
  --conf-path=/etc/nginx/nginx.conf \
  --error-log-path=/var/log/nginx/error.log \
  --pid-path=/var/run/nginx.pid \
  --lock-path=/var/lock/nginx.lock \
  --http-log-path=/var/log/nginx/access.log \
  --http-client-body-temp-path=/var/lib/nginx/body \
  --http-proxy-temp-path=/var/lib/nginx/proxy \
  --http-fastcgi-temp-path=/var/lib/nginx/fastcgi \
  --with-debug \
  --with-http_stub_status_module \
  --with-http_flv_module \
  --with-http_ssl_module \
  --with-http_dav_module \
  --with-http_gzip_static_module \
  --with-http_realip_module \
  --with-mail \
  --with-mail_ssl_module \
  --with-ipv6
make && make install

Yes, that’s literally cut & paste. It overwrites the binaries installed by apt-get, and we happily continue to use the official init script provided by Ubuntu/Debian. Why make life difficult?

Configuring PHP and Apache

At this point, configure PHP and Apache to your heart’s content. The one thing that you need to do with Apache is move it to a different port, and preferably keep it on 127.0.0.1. This means you need to edit the /etc/apache2/ports.conf file:

NameVirtualHost *:8080
Listen 127.0.0.1:8080

And configure your website(s) accordingly:

<VirtualHost *:8080>
… etc …
</VirtualHost>

If you are using SSL (https://), this will be handled by Nginx rather than Apache. Since this is already getting quite long, I will skip SSL in this blog.

Configuring Nginx

We start off by creating a few directories that will be used by Nginx:

mkdir /etc/nginx/includes
mkdir -p /var/cache/nginx/tmp
mkdir /var/cache/nginx/cached
chown -R www-data:www-data /var/cache/nginx

Next we modify the file /etc/nginx/nginx.conf as following:

user www-data;
worker_processes 4;
worker_rlimit_nofile 16384;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 2000;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    access_log /var/log/nginx/access.log;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 75 20;

    gzip on;
    gzip_vary on;
    gzip_comp_level 3;
    gzip_min_length 4096;
    gzip_proxied any;
    gzip_types text/plain text/css application/x-javascript text/xml
               application/xml application/xml+rss text/javascript;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

The worker_processes variable is set according to the number of CPU cores in my system, 4 in this case. There are a few TCP tweaks, and gzip compression is enabled for additional file types rather than just HTML. The rest is fairly run-of-the-mill.
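If you move this configuration between machines, the core count can be derived on the fly rather than hard-coded. A small sketch:

```shell
# derive a worker_processes value from the number of CPU cores
CORES=$(grep -c '^processor' /proc/cpuinfo)
echo "worker_processes ${CORES};"
```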

The core workhorse of Nginx will be the proxy and its associated cache. Because I like to keep things nicely sectioned, and thus easy to configure, I’ve created the following /etc/nginx/conf.d/proxy.conf file, which is picked up by the include statement in nginx.conf:

proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_connect_timeout 90;
proxy_send_timeout 90;
proxy_read_timeout 90;
proxy_buffer_size 4k;
proxy_buffers 4 32k;
proxy_busy_buffers_size 64k;
proxy_temp_file_write_size 64k;
proxy_max_temp_file_size 56m;
proxy_temp_path /var/cache/nginx/tmp;
proxy_cache_key $scheme$host$request_uri;
proxy_cache_path /var/cache/nginx/cached levels=2:2 keys_zone=global:64m inactive=60m max_size=1G;
proxy_cache_valid 200 302 30m;
proxy_cache_valid 301 1h;
proxy_cache_valid 404 1m;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_pass_header Set-Cookie;

The proxy_set_header variables are there to help you determine the IP address of the actual web page requester, rather than logging Nginx’s own. You just need to include %{X-Forwarded-For}i in one of Apache’s log formats instead of the host (%h).
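If you do keep Apache’s access logging, a log format along these lines would record the forwarded client address (the format name “proxied” and log path are made up for this example):

```
# hypothetical Apache log format using the forwarded client IP
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" proxied
CustomLog /var/log/apache2/access.log proxied
```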

However, I have personally disabled all access logging in Apache, because everything passes through Nginx anyway and it boosts Apache’s performance a smidgen (you do this by commenting out all the CustomLog lines in Apache’s configuration). I did leave Apache’s ErrorLog enabled, for the rare occasions it is needed and for PHP error messages.

The file above also defines an Nginx proxy cache zone called “global” via the proxy_cache_path directive. That same directive also specifies an inactivity timeout (60 minutes) and a maximum on-disk cache size (1 GByte).

The proxy_cache_key is simply the concatenation of the scheme, host and request URI (for example, “http” + “myatus.co.uk” + “/the/requests.php”), which is hashed and then used to retrieve the cached item at a later point. I’m allowing stale cache to be served in case of certain errors, for example when Apache has unexpectedly died.
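Under the hood, Nginx hashes the key with MD5 and uses the levels= setting to build the on-disk path from the tail of that hash. A sketch of the equivalent computation for our levels=2:2 layout (the example key is hypothetical):

```shell
# how nginx derives a cache file path from the key, for levels=2:2
KEY='httpmyatus.co.uk/some/request.php'             # $scheme$host$request_uri (example)
HASH=$(printf '%s' "$KEY" | md5sum | awk '{print $1}')
DIR1=$(printf '%s' "$HASH" | tail -c 2)             # last two hex characters
DIR2=$(printf '%s' "$HASH" | tail -c 4 | head -c 2) # the two characters before those
echo "/var/cache/nginx/cached/$DIR1/$DIR2/$HASH"
```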

An important bit, which was quite a PITA to figure out, is the proxy_pass_header directive for the “Set-Cookie” header. WordPress includes “Set-Cookie” headers in 302 HTTP responses (which point your browser to a new location). Some frown upon this practice, and Nginx is no exception, so we need to specifically let this header pass through, or else you will not be able to log in to your WordPress Admin or have users leave comments.

Includes

In the /etc/nginx/includes folder we created earlier, we add two files. The first is a helper for sites that use WordPress. Since the /etc/nginx/includes folder is not automatically included, we can be selective about inclusions, and save on some processing time when these features aren’t used. This is the /etc/nginx/includes/wordpress.inc file:

if ($http_cookie ~* "comment_author_|wordpress_(?!test_cookie)|wp-postpass_") {
    set $no_cache 1;
}

if ($http_user_agent ~* "(2.0 MMP|240x320|400X240|AvantGo|BlackBerry|Blazer|Cellphone|Danger|DoCoMo|Elaine/3.0|EudoraWeb|Googlebot-Mobile|hiptop|IEMobile|KYOCERA/WX310K|LG/U990|MIDP-2.|MMEF20|MOT-V|NetFront|Newt|Nintendo Wii|Nitro|Nokia|Opera Mini|Palm|PlayStation Portable|portalmmm|Proxinet|ProxiNet|SHARP-TQ-GX10|SHG-i900|Small|SonyEricsson|Symbian OS|SymbianOS|TS21i-10|UP.Browser|UP.Link|webOS|Windows CE|WinWAP|YahooSeeker/M1A1-R2D2|NF-Browser|iPhone|iPod|Android|BlackBerry9530|G-TU915 Obigo|LGE VX|webOS|Nokia5800)") {
    set $no_cache 1;
}

proxy_no_cache $no_cache;

It’s a very simple file, actually. The first portion checks if there are certain cookies set, related to comment authors or those who are logged into the WordPress Admin. If this is the case, the variable $no_cache is set to 1. The second check is for mobile users, like Nokia, iPhone, etc. This is helpful in case you have a mobile WordPress edition, as available through some plugins.

If at any point $no_cache is 1, the proxy_no_cache directive becomes true. Apache’s output might still be cached, but it will not be served from the cache to this visitor (so they always get a fresh copy).

Because the output from Apache may still be cached in this case (but not served), it is quite possible that if the page has not been requested before, it could be used to fill the cache (and thus served at a later point).

For instance, let’s say someone visits /some/page with a mobile browser. This might be the first visit to this page and will be cached. Someone using a regular browser (say, Firefox or Opera) could then be presented with this mobile cached version, causing some inconsistencies.

You can solve this by adding $http_user_agent to the proxy_cache_key statement in the proxy.conf file described earlier. The drawback is an increased cache storage requirement, as each browser version gets its own cached copy. As for a logged-in WordPress admin or commenter, they will never be served a cached version, so this only applies if you’re using a mobile edition of WordPress.
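The modified key would then read (one cache entry per user agent, at the storage cost just mentioned):

```
# variant cache key: separate cache entries per user agent
proxy_cache_key $scheme$host$request_uri$http_user_agent;
```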

The second file is a helper that’s pretty much universal for all the websites (but can still be overridden in the actual sites-available/* files). This is the /etc/nginx/includes/default_proxy.inc file:

# Enable caching:
proxy_cache global;

# Default:
location / {
    proxy_pass http://127.0.0.1:8080;
}

# Rarely changed items can remain cached longer:
location ~* \.(jpg|jpeg|png|gif|ico|css|mp3|wav|swf|mov|doc|pdf|xls|ppt|docx|pptx|xlsx)$ {
    proxy_cache_valid 200 3h;
    proxy_pass http://127.0.0.1:8080;
}

# Deny access to .ht* files:
location ~ /\.ht {
    deny all;
}

The first variable proxy_cache informs Nginx to use the “global” zone we defined earlier in the /etc/nginx/conf.d/proxy.conf file. If it is not there, nothing will be cached and pages simply pass through.

It further tells Nginx to send everything to Apache, but allows images and a few other static files to be cached longer than originally defined. The last portion tells Nginx to block access to files such as .htaccess or .htpasswd right at Nginx’s level, so Apache doesn’t have to and we save some processing power.

A default site

You can use the include files to build a very small website configuration file. For example, /etc/nginx/sites-available/default may look something like this:

server {
    listen 80;
    server_name _;

    root /var/www/sites/default/public;
    index index.html index.htm;

    access_log /var/www/sites/default/logs/access.log;
    error_log /var/www/sites/default/logs/nginx.error.log;

    # Includes:
    include /etc/nginx/includes/wordpress.inc;
    include /etc/nginx/includes/default_proxy.inc;
}

Everything is passed to Apache and cached, depending on whether the wordpress.inc file allows it. Apache will handle the rest. You will likely have to change the directories, but that’s basically it.

WordPress

There’s little that needs to be done with WordPress. The most important thing is to actually disable any WordPress cache you may be using, such as WordPress Super Cache. It is no longer needed and only gives Apache / PHP more work to do. However, as noted earlier, I did include Memcache.

The reason is that, in my case, each Cloud server works off the same MySQL database cluster. To avoid unnecessary or repetitive SQL traffic, the Memcache daemon holds query results in RAM (or in the Cloud’s case, either RAM or SSD). This is done with the object-cache.php file by Ryan Boren, which can be obtained from this website. The file needs to be placed in your $WP-ROOT$/wp-content/ directory.

For everything else, WordPress can remain plain vanilla and still become blistering fast, as shown in the next output.

Performance

I have clustered a Cloud server with a dedicated server. For a short while (as in, half a day) I used HAProxy as the point of entry. HAProxy is super-fast, but I was irked by a minor issue that caused some logging problems. Nginx is on par with HAProxy, though it might have a little more jitter, so I now use a 2x Nginx <-> 2x (Nginx + Apache) combination. With the (Nginx + Apache) portion of this setup configured exactly as described above, I have been able to obtain the following speeds (based on 100 concurrent connections, 50,000 requests and keep-alive enabled):

Concurrency Level: 100
Time taken for tests: 6.694 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 2822197200 bytes
HTML transferred: 2806393092 bytes
Requests per second: 7469.02 [#/sec] (mean)
Time per request: 13.389 [ms] (mean)
Time per request: 0.134 [ms] (mean, across all concurrent requests)
Transfer rate: 411700.31 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 0.4 2 16
Processing: 3 11 0.8 11 27
Waiting: 1 3 1.2 3 25
Total: 6 13 0.8 13 32

At 3.29 Gbps and 7,469 requests per second, I consider this a rather well-performing setup. Well prepared for my next project!
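That throughput figure follows directly from ab’s reported transfer rate, treating 1 KByte as 1000 bytes:

```shell
# convert ab's transfer rate (KBytes/sec) to Gbps (decimal kilobytes assumed)
awk 'BEGIN { printf "%.2f Gbps\n", 411700.31 * 8 / 1e6 }'
```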