Profiling PHP processes

Flaviu Vadan
Published in Vendasta
Jul 10, 2019 · 5 min read
PHP logo gratefully sourced from SitePoint

WordPress is one of the most popular Content Management Systems [1]. It is written in PHP and uses MySQL for storage. WordPress is mostly associated with blogging, but it can be used to build other types of websites as well. A WordPress site is typically hosted by a provider that offers hosting as a service. One of the challenges of WordPress hosting is building the infrastructure that coordinates the PHP processes which compile and run sites’ PHP files and respond with the output.

One of the available tools for executing PHP scripts is PHP-CGI (php-cgi). CGI (Common Gateway Interface) is a standard interface that lets a web server (the host) run external programs in a platform-independent manner [3]. When PHP-CGI is used to execute PHP scripts, users can provide a series of arguments that configure how the PHP process behaves, what it caches during execution, and so on. One configuration directive that can make a big difference in scripts’ execution time is open_basedir.

open_basedir controls which parts of the file system (where sites’ PHP scripts are stored) a PHP process may access [4]. This option allows WordPress hosting providers to restrict a PHP process to a single directory tree. For example, if there are two sites stored as sites/site1 and sites/site2, the PHP-CGI process can be configured to deny access to sites/site2 while executing scripts associated with sites/site1.
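Conceptually, the restriction is a containment check on the resolved path. The sketch below is a Python illustration of that idea, not PHP's actual implementation; the function name is_within_basedir and the temporary directory layout are assumptions made for the example.

```python
import os
import tempfile


def is_within_basedir(path, basedir):
    """Return True if path, after resolving symlinks, lies under basedir."""
    real_base = os.path.realpath(basedir)
    real_path = os.path.realpath(path)
    # commonpath compares whole path components, so a sibling such as
    # "site10" is not mistaken for something inside "site1".
    return os.path.commonpath([real_base, real_path]) == real_base


# Build a throwaway sites/site1 and sites/site2 layout for the demonstration.
root = tempfile.mkdtemp()
site1 = os.path.join(root, "sites", "site1")
site2 = os.path.join(root, "sites", "site2")
os.makedirs(site1)
os.makedirs(site2)

print(is_within_basedir(os.path.join(site1, "index.php"), site1))  # True
print(is_within_basedir(os.path.join(site2, "index.php"), site1))  # False
```

The check only works on fully resolved paths, otherwise a symlink inside sites/site1 could point outside the allowed tree. That need to resolve every path is what ties open_basedir to the cost of path resolution.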

While open_basedir does allow access to be controlled conveniently at the PHP-CGI process level, it comes at a price because of its effect on realpath_cache_size. According to the open_basedir documentation:

Using open_basedir will set realpath_cache_size to 0 and thus disable the realpath cache.

Also, according to the documentation of realpath_cache_size [5]:

Determines the size of the realpath cache to be used by PHP. This value should be increased on systems where PHP opens many files, to reflect the quantity of the file operations performed.
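To make the effect of losing this cache concrete, here is a toy model in Python. It is not PHP's implementation: it only assumes that resolving a path costs one lstat per path component, and that a cache remembers components that were already checked. The class name PathResolver, the helper simulate, and the example paths are all hypothetical.

```python
class PathResolver:
    """Toy model of per-component path checks with an optional cache."""

    def __init__(self, use_cache):
        self.use_cache = use_cache
        self.cache = set()
        self.lstat_calls = 0

    def resolve(self, path):
        current = ""
        for part in (p for p in path.split("/") if p):
            current = current + "/" + part
            if self.use_cache and current in self.cache:
                continue  # already checked, no system call needed
            self.lstat_calls += 1  # stands in for a real lstat call
            if self.use_cache:
                self.cache.add(current)
        return current


def simulate(use_cache, files, requests):
    """Resolve every file once per request and return the lstat count."""
    resolver = PathResolver(use_cache)
    for _ in range(requests):
        for path in files:
            resolver.resolve(path)
    return resolver.lstat_calls


files = [
    "/sites/site1/wp-includes/a.php",
    "/sites/site1/wp-includes/b.php",
    "/sites/site1/wp-includes/c.php",
]
print(simulate(use_cache=False, files=files, requests=100))  # 1200
print(simulate(use_cache=True, files=files, requests=100))   # 6
```

Even in this tiny model the uncached count grows with the number of files, their depth, and the number of times they are resolved, while the cached count depends only on the number of distinct path components.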

The more features are added to a site, the more files it contains and the more space it occupies on a shared WordPress host’s file system, so PHP has more paths to resolve. To understand the consequences of setting open_basedir, the PHP-CGI process can be profiled using tools from the BCC collection [6], such as syscount, or Linux tracing tools such as strace.

To illustrate the difference between the file access patterns when open_basedir is set and when it is not, let us look at the number of lstat [8] calls, as measured by strace. Here’s the setup:

  • A virtual machine (VM) on Google Cloud Platform (1 vCPU, 3.75 GB memory);
  • A PHP-CGI server on the VM, which listens for incoming requests for a site;
  • A site with 403.6 MB of source files and a database of 36.1 MB.

The same experiment can be performed on a personal machine, so you don’t need to worry about GCP or VM access.
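For anyone reproducing the measurement locally, here is a minimal sketch of a timing script in Python 3. It is an illustration, not the script used for the measurements in this post; the function names and the URL in the closing comment are assumptions.

```python
import time
import urllib.request


def timed_request(url, fetch=None):
    """Fetch url once and return (status_code, elapsed_seconds)."""
    if fetch is None:
        def fetch(target):
            # urlopen raises HTTPError for 4xx/5xx responses.
            with urllib.request.urlopen(target) as response:
                response.read()  # drain the body so the full response is timed
                return response.status
    start = time.perf_counter()
    status = fetch(url)
    return status, time.perf_counter() - start


def run_benchmark(url, repeats=6, fetch=None):
    """Print one 'Status code / Took' line per request and return the timings."""
    timings = []
    for _ in range(repeats):
        status, elapsed = timed_request(url, fetch)
        print(f"Status code: {status} Took: {elapsed}s")
        timings.append(elapsed)
    return timings


# Example call, with a hypothetical address for the site under test:
# run_benchmark("http://localhost:8080/")
```

The optional fetch parameter exists only so the timing logic can be exercised without a running server.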

The test script sent six uncached requests to the site and printed the status code and elapsed time of each one. With open_basedir set, it produced the following output:

Status code: 200 Took: 4.47233510017s
Status code: 200 Took: 3.66385793686s
Status code: 200 Took: 4.40820717812s
Status code: 200 Took: 5.34910821915s
Status code: 200 Took: 3.63655209541s
Status code: 200 Took: 3.66301393509s

After disabling open_basedir, the result is:

Status code: 200 Took: 2.66961598396s
Status code: 200 Took: 2.8868329525s
Status code: 200 Took: 2.71920204163s
Status code: 200 Took: 2.83208203316s
Status code: 200 Took: 2.79734110832s
Status code: 200 Took: 2.88727807999s
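As a quick check on the two runs above, the mean of each can be computed with a few lines of Python. The variable names are arbitrary; the numbers are copied from the output listings.

```python
from statistics import mean

# Timings in seconds, copied from the two runs above.
with_basedir = [
    4.47233510017, 3.66385793686, 4.40820717812,
    5.34910821915, 3.63655209541, 3.66301393509,
]
without_basedir = [
    2.66961598396, 2.8868329525, 2.71920204163,
    2.83208203316, 2.79734110832, 2.88727807999,
]

mean_with = mean(with_basedir)
mean_without = mean(without_basedir)
reduction = 1 - mean_without / mean_with

print(f"with open_basedir:    {mean_with:.2f}s")     # 4.20s
print(f"without open_basedir: {mean_without:.2f}s")  # 2.80s
print(f"reduction:            {reduction:.0%}")      # 33%
```

With only six requests per configuration, these averages are indicative rather than statistically rigorous.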

All requests were uncached, and the average (of a small sample) dropped from approximately 4.20 to 2.80 seconds, a reduction of about a third! If we take a sample request and count the number of lstat calls in the strace output, we see (with open_basedir):

[Screenshot: count of lstat calls in the strace output with open_basedir set]

And without open_basedir:

[Screenshot: count of lstat calls in the strace output without open_basedir]

(The second screenshot was taken after transferring the strace results to my machine.) That is a significant difference! We went from approximately 500k lstat calls to 5k, a hundredfold difference (two orders of magnitude). Each lstat call takes only microseconds to execute. However, making hundreds of thousands of them, on top of the other system calls the process performs, creates a performance bottleneck. The output from the BCC tool syscount was consistent with the number of lstat calls reported by strace.
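Counting calls in a saved trace can be sketched in Python as follows. This assumes the trace uses strace's usual one-call-per-line text format, optionally prefixed with a PID; the function name count_syscalls and the sample lines are made up for illustration and are not taken from the measurements above.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line, allowing for the
# optional "[pid N]" or bare PID prefix that strace adds when following children.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Return a Counter mapping syscall name to number of occurrences."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal and exit lines do not match and are skipped
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines in strace's usual format, for illustration only.
sample = [
    'lstat("/sites", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0',
    'lstat("/sites/site1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0',
    '[pid  4242] lstat("/sites/site1/index.php", {st_mode=S_IFREG|0644, ...}) = 0',
    'open("/sites/site1/index.php", O_RDONLY) = 3',
    '--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, ...} ---',
]
counts = count_syscalls(sample)
print(counts["lstat"])  # 3
```

Because count_syscalls accepts any iterable of lines, an open file object for a trace written with strace's -o option can be passed to it directly.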

Setting open_basedir clearly caused this test site to be served much more slowly. The performance gain from disabling it comes from the realpath cache, which remembers paths that have already been resolved so that PHP does not have to check the same files and directories again. With open_basedir unset, realpath_cache_size can also be increased according to the site's needs. However, as appealing as that may sound, it comes at a cost: not setting open_basedir is a security concern. Because PHP's file access is no longer restricted at the process level, other measures have to be taken to ensure a secure environment, such as “jailing” processes [9, 10].

The effects of open_basedir on WordPress, and on PHP processes in general, may not be fully understood by all WordPress developers. Because of this potential lack of awareness, it is important to document the outcomes of setting open_basedir or disabling it. By comparing the two, this post aims to make WordPress developers aware of issues they may face and of some ways to approach profiling the execution of PHP scripts. Note that the tools outlined here are not specific to WordPress or to PHP. The BCC collection can be used on any Linux system, along with strace, for profiling various processes.
