How to Achieve Better Accuracy in Latency Percentiles in JMeter Dashboard
(with Viraj Salaka)
There are a number of ways to evaluate the performance of a system using the data collected during a performance test. Latency analysis is one important example, in which we study how the latency behaves. It can be as simple as calculating the mean latency or the latency percentiles, or it can be more complex, fitting distributions to the data to study the characteristics of the latency distribution.
The latency percentile is an important performance metric. Since it gives the latency value below which a given percentage of requests fall, it can be considered a measure of the quality of service of the application or system being evaluated. For example, if the 99th percentile latency of your system is 5 ms, then 99% of the requests served by the system have a latency below 5 ms. For large datasets, there are methods that estimate the latency percentiles rather than compute them exactly. The accuracy of these estimates varies with the underlying algorithm and the parameters used.
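To make the definition concrete, here is a minimal Python sketch that computes an exact percentile with the nearest-rank method. Nearest-rank is only one of several common percentile definitions (tools such as R can interpolate between samples by default), and the function name and the sample latencies are made up for illustration.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest sample that covers p% of the data.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Illustrative latencies in milliseconds (not taken from the test in this article).
latencies_ms = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]
p90 = percentile_nearest_rank(latencies_ms, 90)
p99 = percentile_nearest_rank(latencies_ms, 99)
print(p90, p99)
```

Note how a single slow request dominates the 99th percentile while leaving the 90th untouched, which is why percentiles are often reported alongside the mean.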
Apache JMeter™ is a great tool designed to load test the functional behavior of applications and measure their performance. At WSO2 we use JMeter to test the performance of most of our products. JMeter has a rich set of features, such as the ability to test various protocols, applications and servers, an IDE that allows fast test plan development, dynamic HTML reporting, multithreading, scriptable samplers, and the ability to test with a large number of concurrent users (which is achieved by running multiple instances of JMeter).
When we run performance tests we can configure JMeter to create text files containing the results of a test. These files are called JTL files. Since the JTL files contain latency values for each request, we can use this information for latency analysis. This can be done using the various listeners already available in JMeter (e.g. Aggregate Report) or by loading the JTL file into statistical software (such as R).
Recently we have started using the JMeter Dashboard for obtaining performance results. The JMeter Dashboard can generate graphs and statistics from the JTL file. While analysing the latency percentile values in the JMeter Dashboard, we noticed that for certain scenarios there is a significant difference between the actual (exact) latency percentile values and the percentile values calculated by the dashboard. The exact values were calculated using R (a statistical software package). Interestingly enough, the JMeter Aggregate Report produced the same results as R.
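As a rough illustration of the second approach, the sketch below reads one numeric column from a JTL file in Python instead of R. It assumes the JTL was saved in CSV format with a header row, and that the column of interest is named `elapsed` (JMeter's CSV output also has a `Latency` column). The helper name and the inline sample rows are hypothetical and use only a subset of the real columns.

```python
import csv
import io
import statistics


def read_jtl_column(handle, column="elapsed"):
    """Read one numeric column (in ms) from a CSV-formatted JTL file handle."""
    reader = csv.DictReader(handle)
    return [int(row[column]) for row in reader]


# Tiny in-memory stand-in for a JTL file; a real file has more columns and rows.
sample_jtl = io.StringIO(
    "timeStamp,elapsed,label,success,Latency\n"
    "1500000000000,12,HTTP Request,true,11\n"
    "1500000000010,15,HTTP Request,true,14\n"
    "1500000000020,11,HTTP Request,true,10\n"
    "1500000000030,90,HTTP Request,true,88\n"
)

elapsed_ms = read_jtl_column(sample_jtl)
mean_ms = statistics.mean(elapsed_ms)
print(sorted(elapsed_ms), mean_ms)
```

For a real file, the handle would come from something like `open("results.jtl", newline="")`, where the path is a placeholder.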
For example, see the following result:
Note the following:
There is no difference in the average latency
There is no difference in the throughput
The 90th percentile is significantly higher in the dashboard
The 95th percentile is significantly higher in the dashboard
The 99th percentile is higher in the dashboard
The above result was obtained by loading the JTL file of a 10 min performance test. The total time of the test was 15 min, of which the first 5 min was the warm-up period. The total number of requests in the test was 3421980.
Improving accuracy in the latency percentiles
The way to address this is to increase the value of the property jmeter.reportgenerator.statistic_window above its default. Note that this property only affects the latency percentile values, because it is only used in the PercentileAggregator class, the component that implements the latency percentile calculation in JMeter.
The following table shows the impact of statistic_window on the results. Note that the number of samples = 3421980.
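One possible way to apply the property is to pass it on the command line when generating the dashboard from an existing JTL file. The Python sketch below only builds such a command. It assumes that `jmeter` is on the PATH, that the report generator picks the property up through the `-J` option, and that `results.jtl` and `report` are placeholder paths. The same setting could instead be placed in user.properties.

```python
import shlex


def report_command(jtl_path, output_dir, window):
    """Build a JMeter command that generates the dashboard with a custom window."""
    return [
        "jmeter",
        "-g", jtl_path,    # generate the dashboard from an existing results file
        "-o", output_dir,  # output folder for the HTML report
        # Equivalent user.properties line (assumed):
        #   jmeter.reportgenerator.statistic_window=<window>
        f"-Jjmeter.reportgenerator.statistic_window={window}",
    ]


# Window set to the sample count reported in this article.
cmd = report_command("results.jtl", "report", 3421980)
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` to actually run the report generation.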
statistic_window = sample count
Note that when statistic_window equals the total number of samples, we get 100% accuracy (i.e. the exact value) in the dashboard results.
statistic_window < sample count
When statistic_window < sample count, only the last statistic_window samples in the JTL file are used for calculating the latency percentiles, which is why we do not get the exact result. The following diagram shows the samples used when statistic_window = 20000 (i.e. the default).
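The effect of a window smaller than the sample count can be reproduced with a small simulation. The sketch below is not JMeter's implementation. It keeps only the last `window` samples in a bounded deque, treats -1 as an infinite window, and uses a nearest-rank percentile, which may differ from the estimator JMeter uses. The data is synthetic and chosen so that the slow requests fall at the end of the run.

```python
import math
from collections import deque


def nearest_rank(values, p):
    """Exact p-th percentile of values using the nearest-rank method."""
    ordered = sorted(values)
    return ordered[math.ceil(p * len(ordered) / 100) - 1]


def windowed_percentile(samples, p, window):
    """Percentile over only the last `window` samples; -1 keeps every sample."""
    maxlen = None if window == -1 else window
    kept = deque(samples, maxlen=maxlen)  # older samples fall out of the window
    return nearest_rank(kept, p)


# Synthetic run: 900 fast requests (10 ms) followed by 100 slow ones (100 ms).
samples = [10] * 900 + [100] * 100

p90_small_window = windowed_percentile(samples, 90, 100)
p90_full_window = windowed_percentile(samples, 90, len(samples))
p90_infinite = windowed_percentile(samples, 90, -1)
print(p90_small_window, p90_full_window, p90_infinite)
```

With the small window only the slow tail is seen, so the 90th percentile is overstated. Once the window covers every sample, the result matches the exact value.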
statistic_window = -1
Setting statistic_window to -1 (i.e. an infinite window) gives 100% accuracy (i.e. the exact result) in the latency percentiles.
In this article, we have discussed the use of latency percentiles as a metric for measuring performance and how to increase the accuracy of the percentile values that appear in the JMeter Dashboard. We noted that there is a way to get the exact result (i.e. 100% accuracy): setting jmeter.reportgenerator.statistic_window = -1, i.e. an infinite window. However, when you set this property to -1 you may need to increase the amount of memory you allocate to JMeter, particularly if you have a large number of samples. If there is not enough memory, you can instead raise this property from its default to a higher value, which will still increase the accuracy of the results. We also investigated the impact of the window size on the accuracy.