Quality Engineering Paradigm
The goal of this document is to define an approach to delivering quality software, with the following outcomes:
- High confidence that a software deployment will not result in a regression.
- High confidence that monitoring will detect issues in production before clients.
- High test coverage, where tests consist of high-value unit tests, a large volume of integration tests, and a smaller number of integrated and end-to-end tests.
This document proposes a holistic approach to improving software quality.
In order to achieve the desired state it must be accepted that quality is not just the responsibility of quality analysts — it is the responsibility of everyone involved in the software development cycle.
We also need a clear distinction between testing and quality, as the two are often confused. Software testing is an integral component of achieving quality, in combination with other practices (such as monitoring), tools, techniques and methodologies. Testing can be focused on areas of change to ensure that software stability and quality are maintained after the change is applied and deployed.
Credits: Cindy Sridharan 🙌
High test coverage increases the potential for high software quality. However, the test coverage needs to be applied at the correct layers to be economical, i.e. targeted at the unit, integration and integrated test layers.
Software tests broadly fall into two categories: traditional and production. Traditional tests are more common in software development to evaluate the correctness of software offline, during development. Production tests are performed on a live service to evaluate whether a deployed software system is working correctly.
- Use lean testing models such as the Trophy Model (Credits: Kent C. Dodds). The trophy model attempts to maximize effective test coverage by distributing tests across the different test layers while minimizing the cost of test implementation. For example, the trophy model suggests more integration tests (which execute quickly) in place of more expensive integrated tests. It also includes static code analysis such as source code linting.
- Development approach focusing on testability. When architecting and coding a software solution, the testability of that solution must be considered. Architectures and code constructs that do not facilitate testing must be avoided.
- Monitoring and alerting must be considered during implementation. The software infrastructure will have bounds within which it is expected to operate. These bounds must be identified and monitored, and alerts raised when they are exceeded. For example, a RabbitMQ queue will be expected never to grow above a certain threshold; if it does, an alert should be raised to indicate that there may be something wrong.
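The queue-depth example above can be sketched as a small scheduled check. This is a minimal sketch, not a prescribed implementation: the threshold value and function names are illustrative, and it assumes the RabbitMQ management plugin's `/api/queues/{vhost}/{queue}` endpoint, which reports the message count as `"messages"` in its JSON response.

```python
import json
from urllib.request import Request, urlopen

# Illustrative bound; tune it to the queue's expected workload.
QUEUE_DEPTH_THRESHOLD = 10_000

def breaches_threshold(queue_stats: dict, threshold: int = QUEUE_DEPTH_THRESHOLD) -> bool:
    """Return True when the queue's message count exceeds the expected bound."""
    return queue_stats.get("messages", 0) > threshold

def fetch_queue_stats(base_url: str, vhost: str, queue: str, auth_header: str) -> dict:
    """Fetch queue stats from the RabbitMQ management API (plugin must be enabled)."""
    req = Request(f"{base_url}/api/queues/{vhost}/{queue}",
                  headers={"Authorization": auth_header})
    with urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # In a real deployment this check runs on a schedule and pages on breach;
    # here we feed it a simulated API response.
    sample = {"messages": 12_500}
    if breaches_threshold(sample):
        print("ALERT: queue depth above threshold")
```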
- Technology choices should be made to reduce quality risks. For example, consider two microservices that communicate via Kafka. A technology choice that uses Kafka without a schema registry means that contract testing is required to ensure that these two microservices can communicate with each other without incident. A technology choice such as Kafka with an Avro or Protobuf schema eliminates this need for contract testing and reduces the quality risk.
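To illustrate why an enforced schema removes the need for contract tests, here is a toy stand-in for registry-enforced validation: the producer rejects any message that does not conform to the agreed schema, so a consumer built against the same schema cannot receive a structurally incompatible payload. The schema, field names and functions are hypothetical, not from this document; a real setup would use Avro/Protobuf serializers wired to a schema registry.

```python
# Hypothetical agreed contract between producer and consumer.
ORDER_SCHEMA = {
    "order_id": str,
    "amount_cents": int,
    "currency": str,
}

def conforms(record: dict, schema: dict) -> bool:
    """True when the record has exactly the agreed fields with the agreed types."""
    return (set(record) == set(schema)
            and all(isinstance(record[k], t) for k, t in schema.items()))

def publish(record: dict) -> None:
    """Reject non-conforming messages at the producer boundary."""
    if not conforms(record, ORDER_SCHEMA):
        raise ValueError(f"record violates schema: {record!r}")
    # ... serialize and send to Kafka here ...

publish({"order_id": "o-1", "amount_cents": 499, "currency": "EUR"})  # accepted
```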
- The proposed approach focuses on traditional test practices. Production tests should be considered, supported by test tenancies in the production environment, and added to the test strategy to ensure quality in production and early detection of errors.
- Ensure that SonarQube is enabled on all Git repositories. SonarQube coverage should be visible to the entire team.
- Static analysis should be configured to its highest level.
- Configure the SonarQube quality gate to a high quality bar.
- Builds must fail if the SonarQube quality gate fails.
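Failing the build on a red quality gate can be done with a small CI step. The sketch below assumes SonarQube's Web API endpoint `/api/qualitygates/project_status?projectKey=...`, which returns a `projectStatus.status` of `OK` or `ERROR`; the function and parameter names are our own illustration, not a prescribed script.

```python
import json
import sys
from urllib.request import Request, urlopen

def gate_passed(payload: dict) -> bool:
    """Interpret a /api/qualitygates/project_status response body."""
    return payload.get("projectStatus", {}).get("status") == "OK"

def check(base_url: str, project_key: str, auth_header: str) -> None:
    """Exit non-zero (failing the CI job) when the quality gate is not green."""
    url = f"{base_url}/api/qualitygates/project_status?projectKey={project_key}"
    req = Request(url, headers={"Authorization": auth_header})
    with urlopen(req) as resp:
        payload = json.load(resp)
    if not gate_passed(payload):
        sys.exit("SonarQube quality gate failed; failing the build")

if __name__ == "__main__":
    # Simulated responses; a CI step would call check() against the real server.
    print(gate_passed({"projectStatus": {"status": "OK"}}))     # True
    print(gate_passed({"projectStatus": {"status": "ERROR"}}))  # False
```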
- Ensure that unit tests are of high value and do not ossify the code implementation, i.e. unit tests should not lock in how a solution is implemented, but should verify that the expected output is produced for a given input.
- Change development focus to implementing more integration tests. These tests must be executed as part of the CI pipeline.
- For integrated (end-to-end) tests, a small set of user journeys should be identified for testing, prioritising the journeys most critical to users. It is also advantageous to limit the number of microservices from different domains that must be deployed to satisfy these tests.
- Testability must be considered during solution design, with a focus on high cohesion and low coupling. Tests should be designed during the solution design phase. The testability of the solution design should be evaluated by the entire team, with a sign-off process before the design is accepted.
- Monitoring and alerting must be considered during solution design. Points of failure should be identified and monitoring designed in. Unexpected inputs, such as multiple failed login attempts, unusual login times, and unknown login devices, should also be monitored.
- Technology choices must be evaluated for quality during solution design.
What are a QA’s roles & responsibilities in this model?
QAs would play the role of quality coaches/advocates and would focus on:
- Providing a test and monitoring coverage plan. For example: during the design phase QA would lay out a plan specifying test coverage and monitoring required at different layers of system architecture introduced through a code change.
- Working with developers and product owners to list test scenarios, which are then distributed across the different categories of tests (i.e. unit, integration and e2e), allowing QAs, developers and product owners to be on the same page.
- Automated tests. This involves writing & maintaining a minimal set of e2e tests, i.e. at least one test each for the happy & failure scenarios. QA would work with developers to reassess and ensure that edge cases and domain-specific scenarios are covered at the unit or integration test level by the developer.
- Monitoring and alerts. QA would work with the team to set up the monitoring and alerts decided during the design/planning phase.
- Approving that adequate testing has been performed on any code modification. Note that it is not the role of the QA to perform the testing but to sign off that adequate testing has been done. It is the combined responsibility of the developer & QA to implement the automated tests or carry out the manual testing. For example: once a PR is raised, it is the developer’s responsibility to demonstrate that sufficient tests have been written across the different layers (i.e. unit, integration); QA then reassesses the tests and approves or rejects the PR.
- Prototyping new testing approaches and frameworks. As part of continuous improvement, QA would explore and introduce the latest testing mechanisms/approaches to solidify the overall testing framework.
- Analysing failure patterns, such as repetitive issues found during release and exploratory testing, production issues, and issues found through monitoring, to suggest ways of improving test coverage.
- Creating a quality checklist to assess the quality maturity of domain teams. Assessing that this quality strategy is being followed and evaluating its performance. This may involve:
- Analysing metrics such as MTTR (Mean time to repair) and MTBF (Mean time between failures).
- Identifying tests with false positives or long execution times and working with the team to fix them.
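The MTTR and MTBF metrics mentioned above can be computed directly from an incident log. This is a minimal sketch under stated assumptions: the incident data is invented, and MTBF is simplified here to the mean gap between successive failure starts (formal definitions often use uptime between a repair and the next failure).

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (failure_start, service_restored) pairs.
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 11, 0)),
    (datetime(2024, 1, 10, 10, 0), datetime(2024, 1, 10, 13, 0)),
]

def mttr(incidents) -> timedelta:
    """Mean time to repair: average of (restored - failed) over all incidents."""
    repairs = [restored - failed for failed, restored in incidents]
    return sum(repairs, timedelta()) / len(repairs)

def mtbf(incidents) -> timedelta:
    """Mean time between failures: average gap between successive failure starts
    (a simplification; formal MTBF measures uptime between failures)."""
    starts = sorted(failed for failed, _ in incidents)
    gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
    return sum(gaps, timedelta()) / len(gaps)

print(mttr(incidents))  # 2:00:00
print(mtbf(incidents))  # 7 days, 0:00:00
```

Tracking these two numbers over time shows whether the monitoring investment (faster detection lowers MTTR) and the testing investment (fewer escaped defects raises MTBF) are actually paying off.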