Liveprof: our bespoke solution for reviewing your PHP app’s performance over time

Timur Shagiakhmetov
Apr 9, 2019 · 9 min read

App performance is one of the key criteria for evaluating developers’ work. A profiler is a tool used for optimising the performance of apps, PHP apps in particular.

If you’re seeing an increased response time but you’re not sure which parts of the code are causing it, you can use XHProf to analyse the app’s performance. It’s a PHP extension that allows you to profile your code on a prod server, and keep improving it afterwards.

But wouldn’t it be good to have the performance history, so that you could detect what exactly caused the drop, and when?

For that purpose, we developed Liveprof about a year ago: a tool that automatically profiles all queries and analyses any changes in an app’s performance, finding the parts with the biggest drops. You don’t even have to launch it and wait for the statistics to accumulate — it’s always active, and it collects data for all types of queries.

In this article, I’m going to discuss the development details and particular characteristics of this tool.

A few words about XHProf

First of all, I would like to touch on the capabilities of the tool itself. It is a PHP profiler written in C as an extension. It was developed at Facebook and open-sourced afterwards, and it has several forks (uprofiler, Tideways) that are fully compatible at the data output level.

XHProf places timers around all function/method calls. Using it incurs some minor overhead, but on the plus side, you can use it in production.

As a result, you get an array of elements in the following format:
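The raw profile is an associative array where each key is a “parentMethodName==>childMethodName” pair, and the root call appears as plain main(). A sketch, with invented method names and numbers (the field set matches the list below):

```php
<?php
// Illustrative only: keys are "parent==>child" pairs; names and
// numbers here are made up for the sake of the example.
$data = [
    'main()' => [
        'ct' => 1, 'wt' => 50712, 'cpu' => 40012, 'mu' => 412520, 'pmu' => 396112,
    ],
    'main()==>App::run' => [
        'ct' => 1, 'wt' => 48220, 'cpu' => 38120, 'mu' => 401200, 'pmu' => 380100,
    ],
    'App::run==>Db::query' => [
        'ct' => 5, 'wt' => 30110, 'cpu' => 2100, 'mu' => 120300, 'pmu' => 98000,
    ],
];
```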

where

parentMethodName and childMethodName — parent and child method respectively;

ct — the number of calls in the query context;

wt — wall time: the total execution time (CPU time plus time spent waiting for I/O or for a response from another service);

cpu — time spent by the processor processing the request;

mu — change in memory footprint after the method call;

pmu — change in peak memory footprint after the method call.

There can be some other parameters as well.

Additionally, XHProf has tools for visualising the obtained results. For each profiling run, you get a table listing the parameters per method.

For example:

XHProf provides a UI so that you can view the results

You can go into each method and see what resources were used by each of them respectively, and also have a look at the call graph with the most resource-heavy methods highlighted:

XHProf is useful for analysing the performance of each query manually, but it is equally important to see the overall pattern. We need to understand how the app performance changes over time. For that purpose, we created a tool that automatically profiles queries and allows us to analyse them in the web interface.

Liveprof: aggregating results and storing history

How can we get the profiling history?

First of all, profiling should be automated. Response times may be unstable and vary from one request to another. To minimise the influence of such variations, we average the data across several queries. As a result, we get aggregated data for each query: for example, the minimum, maximum, average and 95th percentile. This also helps us catch heavy code that isn’t called with every query.
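As a sketch of what the aggregation step amounts to (this is an illustration, not Liveprof’s actual code), the aggregation functions are computed over the collected values of each parameter:

```php
<?php
// Illustrative sketch of the aggregation functions; Liveprof's real
// implementation lives in its aggregator and can be extended.
function aggregate(array $values): array
{
    sort($values);
    $count = count($values);

    return [
        'min' => $values[0],
        'max' => $values[$count - 1],
        'avg' => array_sum($values) / $count,
        // Simplified nearest-rank 95th percentile
        'p95' => $values[(int)floor(0.95 * ($count - 1))],
    ];
}

// E.g. wall times (wt) of five profiled requests, in microseconds:
print_r(aggregate([120, 135, 128, 400, 131]));
// min = 120, max = 400, avg = 182.8, p95 = 135
```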

Our tool has both advantages and disadvantages.

An aggregator can:

  1. automatically profile every Nth query;
  2. aggregate collected profiles on a daily basis;
  3. allow you to see the graph of changes for any parameter measured by the profiler — for example, wt, cpu, mu, and pmu, as described above;
  4. check performance changes for any method for a specific period of time;
  5. generate a flame graph based on the latest aggregated data;
  6. find requests with a specific method called within them.

Restrictions:

  1. Since this is an aggregating tool, you cannot check the performance of a specific query (for example, the slowest one); you only get average data for the last 24 hours. However, this is enough for evaluating the overall performance dynamics. If one of the queries has slowed down, it will influence the average data, 95th percentile and maximum execution time.
  2. You cannot be sure of restoring the full call stack, as XHProf only returns unique “parent-child” pairs along with the resources spent on each.
  3. There is some inaccuracy in the measured execution time, caused by the overhead XHProf itself introduces. The difference is not big, but you should factor it in when measuring execution time.

How to use the profiler

First of all, you need to connect the profiler to the website or script. This is easiest with our tool, which starts profiling automatically. You can run the profiler in three modes, depending on the storage type: database, files or API. Here is an example of the database mode:
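A sketch based on the badoo/liveprof package (the setter names and the DSN below are assumptions drawn from the library’s README; check the current documentation for your version):

```php
<?php
use Badoo\LiveProfiler\LiveProfiler;

// Assumed API: select the storage mode and point the profiler at your
// own database before starting it. The DSN is a placeholder.
LiveProfiler::getInstance()
    ->setMode(LiveProfiler::MODE_DB)
    ->setConnectionString('mysql://db_user:db_password@db_host:3306/liveprof')
    ->start();
```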

It supports all PHP versions from 5.4 onwards, and its minimal overhead allows it to be used in production. The tool automatically detects the profiler extension: XHProf, uprofiler or Tideways. When launching the app, you need to provide the database connection parameters as well as the profiling settings.

Here’s an example of using it in the code with the default settings:
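A minimal sketch with the default settings (the class name comes from the badoo/liveprof package; the save-on-shutdown behaviour is an assumption based on its README):

```php
<?php
use Badoo\LiveProfiler\LiveProfiler;

// Start profiling at the top of the app's entry point.
LiveProfiler::getInstance()->start();

// ... application code runs here ...

// With default settings, the collected profile is saved automatically
// at the end of the request.
```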

Profiling results are saved to the database, to files or to a remote server, depending on the mode you selected. Aggregation is launched once a day: all entries for a specific request are collected, and the aggregation functions are calculated for each parameter. The aggregation functions can be extended or overridden.

Currently, the minimum, maximum, average and 95th percentile per day are available.

For setting up aggregation and viewing the results, the aggregator’s web client is used. The easiest way to install it is in a Docker container:
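For example, something along these lines (the repository name badoo/liveprof-ui is real; the exact docker-compose workflow may differ, so consult its README):

```shell
# Fetch the web client and start it via docker-compose
git clone https://github.com/badoo/liveprof-ui.git
cd liveprof-ui
docker-compose up -d
```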

Before the first launch, you need to adjust the database connection settings, the list of fields and the list of aggregation functions in the configuration file src/config/services.yaml. After that, you need to run the installation script:
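The invocation below is a hypothetical example of running an install script inside the container; the actual script name and path are given in the liveprof-ui README:

```shell
# Hypothetical path: run the installation script inside the web container
docker-compose exec web php src/scripts/install.php
```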

Make sure you add the aggregation and data-clearing scripts to cron so that they run automatically:
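Illustrative crontab entries; the script paths here are placeholders, not the project’s real file names:

```shell
# Aggregate yesterday's profiles every night (placeholder script path)
0 2 * * * php /path/to/liveprof-ui/src/scripts/cron/aggregate_profiles.php
# Clear out old raw profiles (placeholder script path)
0 4 * * * php /path/to/liveprof-ui/src/scripts/cron/clear_old_data.php
```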

You can run a script to fill the database with test data:
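A hypothetical invocation (the script name is a placeholder; see the project’s README for the real one):

```shell
# Placeholder script name: populate the database with sample profiles
docker-compose exec web php src/scripts/fill_test_data.php
```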

Interface description

By default, a page with the list of aggregated queries will open. This allows you to easily find the query you need, sort all queries by any parameter, and re-aggregate a specific query in order to see the latest results.

While working with the tool, you will find the page with the list of methods and the graphs of performance changes the most useful. It allows you to view the call stack and monitor the consumption of each parameter, as well as the graphs of performance changes over a specific period:

The page with the complete list of called methods allows you to find the required method quickly and check the graphs on the graph page.

The page with the flame graph of the last aggregated query allows you to define the heaviest parts visually.

Using XHProf imposes some constraints on the accuracy of the results, because the profiler returns not the full call tree but only the “parent-child” pairs. Furthermore, if the same pair was called from different parts of the app, you only get the total time spent. However, a full call tree is needed for the flame graph.

When restoring such a tree, the parameter values are normalised, taking into account the time spent in the parent calls.

There is also a page listing the methods that started working slower during the specified period of time.

Additionally, for each method you can check which of the child calls had the biggest impact on performance. You can see in the screenshot that the ServiceApi::getAvailableServices() method slowed down by 116 ms. This was caused by the addition of the ServiceApi::getGifts() method (+56 ms) and by the number of calls to the ServiceApi::getConfigForList() method increasing from 1 to 5 (+50 ms):

If you don’t know in advance which query had the biggest performance change, you can use the page with the list of methods that started working slower, without reference to any specific query:

There is also a page for searching for queries in which a specific method is called.

It allows you to compare the execution time across different queries. It can also be used to find unused code:

Customisation opportunities

The tool offers a lot of scope for customisation:

  • you can add your own aggregating functions for calculating a value based on the values array of the parameter;
  • you can change the database storing the profiles and aggregated results (SQLite, MySQL and PostgreSQL are already supported, but you can use any other from the list available for Doctrine DBAL);
  • you can override the connection manager and implement your own methods of retrieving data;

  • you can use the web interface not only as a standalone project, but also inside any framework (for example, a website control panel). Here’s an example:
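The details depend on your framework; this sketch only illustrates the idea of routing part of your own app to the web client’s front controller (the paths are placeholders, not the project’s real layout):

```php
<?php
// Hypothetical embedding: serve the aggregator's UI under /profiler
// from inside your own front controller. Paths are placeholders.
if (strpos($_SERVER['REQUEST_URI'], '/profiler') === 0) {
    require '/path/to/liveprof-ui/src/www/index.php';
    exit;
}
```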

Conclusion

I hope our tool will be useful to you, and to many others. It allows you to check performance changes in any part of the code without adding extra timers, and it also simplifies the optimisation process by showing you what caused the performance drop in your app.

The tool and its web interface are both available on GitHub. A demo is also available.

The tool is still under development and may have some errors, but with the community’s help, I hope it will become even better. We are planning to support more profilers beyond XHProf and to expand the list of supported databases.

Please send us your feedback, bugs and pull requests on GitHub. Your comments and suggestions are welcome!

Bumble Tech
