A Deep Dive into Quality Metrics

Measuring What Matters Most

Damian Moga
Globant
May 29, 2023

--

Introduction

The purpose of this article is to provide an overview of the main quality indicators used in most projects. These quality indicators are usually divided into the following three areas:

  • Indicators for the quality assurance process
  • Indicators for incident management in production
  • Indicators for measuring a culture of quality

It is important to note that there may be other quality indicators that are specific to a particular project, product, or quality objective.

Quality assurance process indicators

Quality indicators can be divided into those that measure the coverage of functional and/or non-functional tests, those that measure quality based on detected defects and their status, priorities, and criticality, and other indicators related to the time to fix such defects and overall user satisfaction. The following categorization is a non-exhaustive list of the most common and important indicators.

  1. Test Coverage: Coverage can be measured using various criteria, depending on the type of test. These metrics provide insight into how tests evolve within a team and product, as well as how much coverage exists in terms of lines of code and user stories. Additionally, the level of automation indicates improvements in overall quality activities and reduces certification time. In unit tests, it is crucial to measure compliance with established quality gates to ensure that the code functions correctly at its most atomic level and adheres to unit testing guidelines. Acceptance tests measure coverage by the number of test cases available for each user story; they are typically organized by test type (such as smoke, sanity, regression, or integration), by priority, and by whether they are manual or automated. These tests encompass end-to-end scenarios at the UI level and API testing scenarios on the backend side.
  2. Test Execution Time: This indicator measures the time required to validate a user story, taking into account the number and type of tests performed, as well as the time required to re-run tests when defects are found. As the level of manual testing increases, so does the time invested in ensuring quality and delivering functionality to production. The time for automated testing versus manual testing is considered with the objective of optimizing time and performing tests early, either at the beginning of the development cycle or through continuous testing throughout the process.
  3. Code Complexity: Code quality can be measured using different indicators, including lines of code, adherence to best practices and development techniques, and cyclomatic complexity. Static code analysis tools are used to obtain this information, allowing the identification of software quality problems such as vulnerabilities and "code smells," among others. Some of the best-known tools are SonarQube, Checkmarx, PMD, Codacy, and DeepSource.
  4. Defect Ratio: This indicator measures software quality based on the number of defects found, classified by environment, priority, severity, and status, with a focus on critical defects or those detected in production. A low number of critical defects indicates higher software quality and that the team can resolve them quickly. If this indicator reveals a large volume of critical defects, a "zero bug" policy can be implemented, establishing that any critical defect found must be addressed and resolved immediately.
  5. User Satisfaction Index: This indicator measures the level of user satisfaction with the system. A high satisfaction index indicates that the software meets user expectations. Different methods can be used to measure the user satisfaction index, such as surveys, complaints, evaluations in application stores, usability tests, and the percentage of critical defects detected by the user. It is important to use tools such as Google Analytics, Google Optimize, or Hotjar to monitor and understand user behavior, in order to refine or plan tasks that enhance the user experience and thus user satisfaction.
  6. Technical Debt: This indicator measures the number of pending activities related to software quality or any other activity tied to this process, such as reviewing pending coverage for existing features in production, increasing automated test coverage, performing maintenance, refactoring, optimizing and automating manual processes, reviewing tests, reducing "flaky tests," and improving the quality of reports. Technical debt is important to track, as it can slow down the pace of software development and increase the likelihood of defects and system failures.
  7. Code Reviews: By tracking the number of issues discovered during code reviews, teams can gain insights into the overall health of the codebase. A higher number of issues may indicate potential quality concerns that require attention. Additionally, categorizing the severity of identified issues allows teams to prioritize and address critical problems promptly. To effectively track code review feedback, teams can use tools or platforms that facilitate the review process and capture feedback in a structured manner. This allows for efficient issue management, follow-up actions, and progress monitoring.
  8. Non-Functional Test Coverage: Depending on the project and product, quality indicators can be established to measure the status and evolution of this type of test, its compliance, and its adoption in the program. Among the indicators that can be mentioned are:
  • With Accessibility tests, we measure alignment with the WCAG guidelines, the most commonly followed accessibility standard. These tests assess the level of accessibility achieved, using the three WCAG conformance levels (A, AA, and AAA). However, it is crucial to recognize that specific countries may introduce their own guidelines to be taken into account. For instance, the United States has Section 508, the European Union adheres to the European Standard EN 301 549, and ARIA serves as a complementary resource to WCAG.
  • With Security tests, we measure the quantity, severity, and type of vulnerabilities found in the system.
  • With Performance tests, we measure the overall performance of the system and its components, as well as the resource consumption per component, in order to optimize scaling and/or reduce infrastructure costs; we also survey and measure errors related to handling system concurrency.
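As a rough illustration of how some of the indicators above can be computed, here is a minimal sketch in Python. The data structures and numbers are hypothetical, not tied to any specific tracking tool; it simply derives a critical-defect ratio and user-story test coverage from classified records:

```python
from dataclasses import dataclass

@dataclass
class Defect:
    severity: str   # e.g. "critical", "major", "minor"
    status: str     # e.g. "open", "resolved"

def critical_defect_ratio(defects):
    """Share of defects classified as critical (0.0 to 1.0)."""
    if not defects:
        return 0.0
    critical = sum(1 for d in defects if d.severity == "critical")
    return critical / len(defects)

def story_coverage(stories_with_tests, total_stories):
    """Percentage of user stories that have at least one test case."""
    return 100.0 * stories_with_tests / total_stories

# Hypothetical sample data
defects = [Defect("critical", "open"), Defect("minor", "resolved"),
           Defect("major", "open"), Defect("critical", "resolved")]
print(critical_defect_ratio(defects))  # 0.5
print(story_coverage(42, 60))          # 70.0
```

In practice these inputs would come from a defect tracker and a test management tool; the point is that each indicator reduces to a simple, auditable calculation over classified records.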

Incident management indicators for production

There are various indicators used to measure the effectiveness of incident resolution processes in production. These indicators are interconnected and generate a specific workflow for handling incidents while also measuring the time taken for each phase and comparing it to the expected results.

To ensure consistent and reliable performance of your services and applications, it is important to focus on and optimize the following metrics:

  1. Mean Time to Detection (MTTD): This is the time it takes to detect an issue that requires attention. By minimizing MTTD, you can quickly become aware of potential issues and take the necessary actions to address them promptly.
  2. Mean Time to Acknowledge (MTTA): MTTA measures the average time it takes for an alarm to initiate an action, such as issuing a service ticket. A low MTTA indicates a quick response from the IT operations team and ensures that problems are identified immediately.
  3. Mean Time to Response (MTTR): MTTR indicates the average time it takes to begin processing a service ticket after it has been acknowledged. A shorter MTTR means faster response times and a more proactive approach to resolving issues.
  4. Mean Time to Repair (MTTR): This metric measures the time from the detection of a problem to the repair of its underlying cause. A lower MTTR indicates efficient troubleshooting and repair, resulting in less downtime.
  5. Mean Time to Resolve (MTTR): MTTR reflects the time required to fully resolve an issue and perform thorough testing to ensure the associated system is functioning properly. A low MTTR ensures that problems are resolved effectively, reducing the likelihood of recurring incidents.
  6. Mean Time to Recovery (MTTR): MTTR measures the time it takes to return the associated system to full operation. Minimizing MTTR helps recover quickly from disruptions and restore normal service levels.
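These mean-time metrics all reduce to averaging the elapsed time between two phases of the incident workflow. The following sketch, using hypothetical incident records with made-up timestamps, shows how MTTD, MTTA, and a repair-oriented MTTR could be derived:

```python
from datetime import datetime

# Hypothetical incident records: one timestamp per phase of the workflow.
incidents = [
    {"occurred": datetime(2023, 5, 1, 10, 0),
     "detected": datetime(2023, 5, 1, 10, 5),
     "acknowledged": datetime(2023, 5, 1, 10, 8),
     "resolved": datetime(2023, 5, 1, 11, 0)},
    {"occurred": datetime(2023, 5, 2, 14, 0),
     "detected": datetime(2023, 5, 2, 14, 15),
     "acknowledged": datetime(2023, 5, 2, 14, 20),
     "resolved": datetime(2023, 5, 2, 15, 0)},
]

def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two phases across all incidents."""
    deltas = [(i[end_key] - i[start_key]).total_seconds() / 60
              for i in incidents]
    return sum(deltas) / len(deltas)

print(mean_minutes(incidents, "occurred", "detected"))      # MTTD: 10.0
print(mean_minutes(incidents, "detected", "acknowledged"))  # MTTA: 4.0
print(mean_minutes(incidents, "detected", "resolved"))      # MTTR: 50.0
```

Because every variant of MTTR is just a different pair of phase boundaries, agreeing on which timestamps define each metric matters more than the arithmetic itself.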

By actively monitoring and optimizing these metrics, you can improve the health, availability, and reliability of your services and applications. As a result, you can more effectively achieve your mission objectives and ensure a positive user experience.

Indicators for Measuring a Culture of Quality

Several key indicators can be used, including Maturity Levels, Onboarding Time, Post-mortem Incidents in PROD, and Adoption of DevOps and Continuous Improvement Practices. These indicators provide valuable insights into the organization’s level of maturity and ability to maintain quality standards.

  1. Maturity Levels: This indicator measures the progress of teams in adopting and implementing quality processes through a maturity-level program. These levels, ranging from Initial to Optimized, indicate the team’s adherence to predefined quality KPIs aimed at fostering continuous improvement. By providing a distribution and overview of teams based on their maturity level, this metric offers insights into their progress. Maturity levels can be established at the project’s inception or adjusted as the project evolves in terms of testing or when the business impact affects delivery objectives and quality.
  2. Onboarding Time: This indicator measures the time it takes to onboard team members on the concepts and processes associated with product quality, including training materials, documentation, and sessions. A shorter onboarding time indicates a solid foundation of guidance, support, and training, which is critical for establishing a quality culture.
  3. Post-mortem Incidents in PROD: This indicator measures the effectiveness of the root cause analysis process for incidents that occur in production environments. It ensures that corrective actions are taken at the functionality, infrastructure, or quality testing levels to prevent a recurrence. The process involves addressing the lack of coverage or specific testing types and is critical for maintaining the quality of the product.
  4. Adoption of DevOps and Continuous Improvement Practices: This indicator measures the degree of adoption of DevOps practices and the level of continuous improvement in the organization. It provides a view of the volume of automated vs. manual processes, the level of continuous integration, the frequency of releases to production (considering autonomous team releases or joint releases), and the level of monitoring and observability. Continuous improvement is crucial for maintaining a high level of product quality and increasing efficiency in the development process.
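Parts of the DevOps adoption indicator can also be quantified directly. The sketch below, with hypothetical release dates and process counts, derives a deployment frequency and an automation ratio of the kind this indicator tracks:

```python
from datetime import date

# Hypothetical production release dates for one team over a quarter.
releases = [date(2023, 4, 3), date(2023, 4, 17), date(2023, 5, 2),
            date(2023, 5, 16), date(2023, 6, 1)]

def deployment_frequency(releases, period_days=90):
    """Average releases per week over the observed period."""
    return len(releases) / (period_days / 7)

def automation_ratio(automated, manual):
    """Share of processes that are automated (0.0 to 1.0)."""
    total = automated + manual
    return automated / total if total else 0.0

print(round(deployment_frequency(releases), 2))  # 0.39 releases per week
print(automation_ratio(36, 12))                  # 0.75
```

Tracked over time, rising values for both numbers are one concrete signal that continuous improvement practices are taking hold.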

SLA, SLI, and SLO

It is also common to establish and use constructs such as SLAs, SLIs, and SLOs to ensure that the established quality standards and service levels are met, setting clear goals with a focus on continuous improvement.

These terms are used in conjunction with the indicators mentioned above to measure and improve quality.

  1. Service Level Agreement (SLA): It is a contract that establishes the services that will be provided and the quality standards that must be met, setting clear expectations between the service provider and the customer.
  2. Service Level Indicator (SLI): It is a metric used to measure the compliance of a service with respect to the standards defined in the SLA, measuring the performance of a service in terms of quality and service levels. Generally, real-time monitoring and tracking tools are used.
  3. Service Level Objective (SLO): It is a target value, or range of values, for a service level measured by an SLI. SLOs are established using SLIs as a reference and are used to ensure that the quality standards defined in the SLA are met.
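The relationship between the three can be made concrete with a small sketch. Here the SLI is request availability, the SLO is a hypothetical 99.9% target, and the error budget expresses how much room remains before the SLO (and potentially the SLA) is breached:

```python
def availability_sli(successful_requests, total_requests):
    """SLI: fraction of requests served successfully."""
    return successful_requests / total_requests

def meets_slo(sli, slo=0.999):
    """True if the measured SLI satisfies the SLO target."""
    return sli >= slo

def error_budget_remaining(sli, slo=0.999):
    """Fraction of the allowed error budget still unspent."""
    allowed = 1 - slo      # total failures the SLO tolerates
    spent = 1 - sli        # failures actually observed
    return max(0.0, 1 - spent / allowed)

# Hypothetical monitoring data: 999,500 of 1,000,000 requests succeeded.
sli = availability_sli(999_500, 1_000_000)
print(meets_slo(sli))                        # True
print(round(error_budget_remaining(sli), 3)) # 0.5
```

In practice the SLI would be fed by real-time monitoring, and a shrinking error budget is the usual trigger for slowing releases and prioritizing reliability work.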

Conclusion

In conclusion, the use of quality indicators enables companies to improve the quality of their software products, increase user satisfaction, and reduce development costs. By carefully monitoring these indicators throughout the software development lifecycle, teams can pinpoint areas for improvement, measure progress toward quality goals, and make decisions based on data. It is critical for software development teams to identify relevant quality indicators tailored to their specific project, product, or quality goals and seamlessly integrate them into their quality management processes. In this way, they can foster a culture of continuous improvement and ensure the delivery of high-quality software solutions.


I'm Damian from Argentina. I work at Globant as a Tech Director.