Performance Testing: Metrics and Measurements

Vincenzo Cuomo
6 min read · Mar 1, 2023


Accurate measurements and metrics are important for evaluating the results of performance testing.
Since performance testing plays a key role in the success of a software application, it should not be initiated without first understanding which measurements and metrics are needed.
Typically, performance metrics are derived from requirements analysis. Very often, however, the performance requirements are not defined or are not expressed in measurable terms. In those cases, performance testing is not performed against a specification but is instead used to measure and assess the application’s performance profile and limitations.
Performance metrics vary from application to application: for example, the metrics chosen for the performance testing of an e-commerce
website will differ from those chosen for the performance testing of an embedded IoT system for home automation. So, the performance measurements and metrics vary according to the application’s technical and operational environment or the business domain.
Defining the right set of metrics that works best for the given project or application is critical for successful performance testing and monitoring. Note that collecting more metrics than necessary is counterproductive: each metric chosen requires effort to collect, analyze, and report, so it is important to define a reasonable, obtainable set of metrics that supports the performance test objectives. Last but not least, since different performance metrics provide different perspectives, they must be aggregated to understand the total picture of system performance. When performance metrics are viewed in isolation, drawing the right conclusion can be difficult and error-prone.

Performance Metrics

“You cannot manage what you cannot measure.”

Performance testing measures the speed, bandwidth, reliability, and scalability of an application under a given load. Its purpose includes checking whether the system can handle expected continuous or peak loads and detecting performance bottlenecks.
One of the main goals when planning performance testing is to define the metrics and target values the system must meet. Testers then use these metrics to evaluate system performance. The choice of performance testing metrics always depends on the application under test.

Figure 1 — Performance Testing metrics

Below are the most frequently used performance metrics, with some examples.

CPU utilization: the percentage of available CPU used (including average and peak CPU utilization).
Figure 2 shows the CPU utilization over time. At first glance, one might conclude that the CPU operates at about 25% of its total capacity during non-peak periods and at about 75% during a peak.

Figure 2 — CPU Utilization Over Time

But, as noted above, performance metrics must be aggregated to avoid wrong conclusions. If we overlay the corresponding load chart (Figure 3), we see that the peak in CPU utilization does not correspond to a peak in load. This requires further investigation…

Figure 3 — CPU Utilization vs. Load
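As a rough illustration, CPU utilization can be sampled from a monitoring script with a library such as psutil; the sampling interval and duration below are arbitrary choices, not part of any specific tool’s workflow.

```python
# Minimal sketch: sample system-wide CPU utilization with psutil and report
# the average and peak values over the sampling window.
import psutil

samples = []
for _ in range(60):                                   # sample for ~60 seconds
    samples.append(psutil.cpu_percent(interval=1.0))  # % CPU over the last second

print(f"average CPU: {sum(samples) / len(samples):.1f}%")
print(f"peak CPU:    {max(samples):.1f}%")
```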

Memory utilization: the percentage of available memory used. If the amount of memory used is unusually high or keeps growing, it may indicate a memory leak. A memory leak occurs when a process allocates memory but never frees it, causing a gradual deterioration of the system’s performance.
Figure 4 shows a healthy heap-utilization curve coupled with a continuous increase (a leak) in the total memory used by the (Java) process.

Figure 4 — Memory Utilization
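A minimal sketch of how such a trend can be watched from the outside, assuming the process under test is identified by a hypothetical PID and using psutil’s resident-memory counter:

```python
# Minimal sketch: track a process's resident memory (RSS) over time; a steady
# upward trend under constant load may indicate a leak.
import time
import psutil

proc = psutil.Process(1234)                  # hypothetical PID of the process under test
readings = []
for _ in range(30):
    readings.append(proc.memory_info().rss)  # resident set size in bytes
    time.sleep(10)

growth_mib = (readings[-1] - readings[0]) / (1024 * 1024)
print(f"RSS change over the run: {growth_mib:.1f} MiB")
```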

Response time: the time from the moment when a given request is sent to the system until the moment when the last bit of the response has returned. Testers usually monitor Minimum Response Time, Average Response Time, Peak/Maximum Response Time, etc. The response times change under different loads (the number of concurrent users, the amount of data processed, etc.) imposed on the system.
For many stakeholders, the main concern is that the response time is within acceptable limits.

Load time: the time required to start an application.
Page Load Time: the time a web page needs to be fully displayed on the screen. A hedged sketch of this measurement follows below.
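Page load time can be measured in a real browser with Selenium and the Navigation Timing API; the URL is a placeholder and a local ChromeDriver installation is assumed.

```python
# Sketch: measure page load time (navigation start to load event) via Selenium.
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/")           # blocks until the page's load event fires
load_ms = driver.execute_script(
    "const nav = performance.getEntriesByType('navigation')[0];"
    "return nav.loadEventEnd - nav.startTime;"
)
print(f"page load time: {load_ms:.0f} ms")
driver.quit()
```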

Figure 5 — Response Time vs. N. of Users

Figure 5 shows the response time of two systems as the number of concurrent users varies.
With fewer than 200 concurrent users, the response times of the two systems are very similar.
Beyond 200 concurrent users, system A’s response times start to deteriorate, while system B holds up to 2,000 concurrent users.
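A rough client-side sketch of this kind of measurement, using a thread pool to simulate concurrent users against a hypothetical endpoint (the URL and user counts are placeholders):

```python
# Sketch: measure min/average/max response time at increasing concurrency levels.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://example.com/api/items"        # hypothetical endpoint

def timed_request(_):
    start = time.perf_counter()
    requests.get(URL, timeout=30)
    return time.perf_counter() - start

for users in (50, 200, 500):
    with ThreadPoolExecutor(max_workers=users) as pool:
        times = list(pool.map(timed_request, range(users)))
    print(f"{users:>4} users  min={min(times):.3f}s  "
          f"avg={sum(times) / len(times):.3f}s  max={max(times):.3f}s")
```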

Throughput: the number of transactions of a given type that the system can handle in a unit of time (e.g., the number of HTTP requests per second).
More than the number of concurrent users, it is the throughput that defines the load on an interactive system. It represents the system’s ability to handle a heavy load: the higher the throughput, the better the application’s performance.

Figure 6 shows the throughput (the number of requests processed per minute) of an application server. The dotted line identifies the point at which adding more concurrent users reduces the number of requests that can be processed per minute. This point indicates when performance starts to degrade.

Figure 6 — Throughput vs. N. of Users
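A minimal sketch of computing throughput on the client side, again against a hypothetical endpoint; real load testing tools report this automatically, so this only illustrates the calculation:

```python
# Sketch: fire a burst of requests through a thread pool and compute throughput
# as total requests divided by elapsed wall-clock time.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://example.com/api/items"        # hypothetical endpoint
TOTAL_REQUESTS = 500
CONCURRENCY = 50

def send(_):
    return requests.get(URL, timeout=30).status_code

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    status_codes = list(pool.map(send, range(TOTAL_REQUESTS)))
elapsed = time.perf_counter() - start

print(f"throughput: {TOTAL_REQUESTS / elapsed:.1f} requests/second")
```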

Latency: the time a request spends waiting to be handled. Latency refers to the delay in the system and should not be confused with the response time, which includes both the delay and the actual processing time (Latency + Processing Time = Response Time) [Figure 7]. Low latency means there are few or no delays; high latency means there are many delays.

Figure 7 — Latency
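The relationship can be made concrete with a toy simulation (not a real system): requests wait in a queue before a single worker processes them, so latency is the queuing delay and the response time is latency plus processing time.

```python
# Toy simulation of Latency + Processing Time = Response Time: queued requests
# wait for a single worker, so each one's latency grows with queue depth.
import queue
import threading
import time

requests_q = queue.Queue()

def worker():
    while True:
        enqueued_at = requests_q.get()
        started_at = time.perf_counter()
        time.sleep(0.05)                          # simulated processing time
        finished_at = time.perf_counter()
        latency = started_at - enqueued_at        # time spent waiting
        processing = finished_at - started_at     # time spent being handled
        print(f"latency={latency * 1000:.0f} ms  processing={processing * 1000:.0f} ms  "
              f"response={(latency + processing) * 1000:.0f} ms")
        requests_q.task_done()

threading.Thread(target=worker, daemon=True).start()
for _ in range(5):                                # a burst of requests causes queuing
    requests_q.put(time.perf_counter())
requests_q.join()
```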

Bandwidth: the amount of data that can be transmitted and received per unit of time. For instance, a network with high bandwidth can transmit and receive a larger amount of data. But high bandwidth doesn’t necessarily guarantee optimal performance: if, for example, the throughput is limited by latency, the tester will experience delays even with plenty of bandwidth available.
Bandwidth graphs, as in Figure 8, can be used to get a general idea of the traffic and of the amount of data transmitted and received, as well as to flag any sudden spikes or drops.

Figure 8 — Bandwidth
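A minimal sketch of observing the traffic on the test machine’s network interfaces with psutil’s I/O counters; the one-second window is an arbitrary choice.

```python
# Sketch: compute sent/received bandwidth over a short window from psutil's
# cumulative network I/O counters.
import time
import psutil

before = psutil.net_io_counters()
time.sleep(1.0)
after = psutil.net_io_counters()

sent_mbit = (after.bytes_sent - before.bytes_sent) * 8 / 1_000_000
recv_mbit = (after.bytes_recv - before.bytes_recv) * 8 / 1_000_000
print(f"sent: {sent_mbit:.2f} Mbit/s   received: {recv_mbit:.2f} Mbit/s")
```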

Error rate: the percentage of requests resulting in errors compared to the total number of requests. These errors most often occur when an application reaches or exceeds its threshold limit and can no longer handle additional requests [Figure 9].

Figure 9 — Error Rate (x version) vs. N. of Users
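The calculation itself is simple; a sketch using an illustrative list of HTTP status codes collected during a load test run:

```python
# Sketch: compute the error rate as failed requests divided by total requests.
status_codes = [200] * 950 + [503] * 40 + [500] * 10   # illustrative data, not real results

errors = sum(1 for code in status_codes if code >= 400)
error_rate = errors / len(status_codes) * 100
print(f"error rate: {error_rate:.1f}% ({errors} of {len(status_codes)} requests)")
```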

Several graphical applications and command-line tools are available for gathering performance metrics. Most provide a visualization (tables, charts, etc.) that breaks down response times, throughput, error rates, and other performance metrics. These tools can effectively support the root cause analysis of performance issues in the system under test.

Top Performance Testing Tools include Apache JMeter, LoadRunner, LoadView, NeoLoad, WebLOAD, etc.

Conclusion

Testing the performance is critical to ensure a good user experience: a slow application that keeps crashing will frustrate users and ultimately cause them to abandon it. Successful performance testing requires defining metrics and measurements to set goals and evaluate the results. These metrics include response time (e.g., per transaction, per user, page load time), resource utilization (e.g., CPU, memory, network bandwidth, network latency), throughput (i.e., the number of transactions processed in a given time), the number of errors impacting performance, etc. Selecting the right metrics based on the technical, business, or operational requirements and aggregating the results are the two principal factors for effective performance testing.
