How We Deliver Happiness to Our Users: A Look Inside Our Quality Process

Sanyasi Naidu Annepu
Harness Engineering
Apr 22, 2023

At Harness, our goal is to deliver the best possible experience to our users. We believe that quality software and an unwavering commitment to excellence are the keys to achieving this goal. In this blog post, we will take you behind the scenes and explain how our development cycles, pull request checks, and testing & release procedures ensure that we maintain the highest standards of software quality when moving code to production.

Let us give you a brief overview of our development cycle before we delve deep into the quality assurance process we follow.

Development Cycles

In order to provide the best possible user experience, our development process goes beyond the Agile methodology and two-week sprints.

Plan

Each quarter, our team sets strategic goals and objectives that align with the overall company vision and priorities. This helps ensure that our development efforts are focused on what matters most to our users and stakeholders. Customer requests are an invaluable source of information for our product team. We collect feedback through various channels, such as support tickets, customer interactions and user surveys. Our product team carefully reviews and categorizes these requests, using them to inform our product roadmap and prioritization decisions.

Develop

Our engineers follow a structured approach to ensure that each feature or improvement is designed, validated, and tested thoroughly before being released to production. This process involves several steps:

  1. Design: Engineers collaborate with product managers and designers to create detailed design documents that outline the feature’s requirements, user stories, and technical specifications. This ensures that all stakeholders have a clear understanding of the feature’s goals and objectives.
  2. Validate: Our engineers validate use cases by collaborating with product managers and customer success teams to ensure that the feature addresses real-world user needs and fits within the context of our product’s overall user experience.
  3. Code: Engineers write clean, maintainable, and efficient code that adheres to our coding standards and best practices. They also conduct regular code reviews to ensure that the code is free of errors and adheres to the agreed-upon design.

Quality

At Harness, quality is not an afterthought; it is deeply embedded throughout the entire journey. We are dedicated to adopting a shift-left testing approach at every stage of our development process. By implementing comprehensive PR checks, we endeavor to identify and address potential issues in their nascent stages, ensuring that our codebase remains dependable, secure, and maintainable.

PR Checks

Our PR process includes a variety of checks. If any of these checks fails, the developer cannot merge the code to the master branch. The checks fall into the following categories:

Quality checks

  1. Unit Tests: At Harness, we maintain a comprehensive suite of over 30,000 unit tests, which are executed against each PR. By leveraging Harness Test Intelligence, we significantly reduce test execution times. As illustrated below, executing approximately 2,500 of the 30,000+ available tests resulted in substantial time savings, thereby enhancing developer productivity. To learn more, see the Harness Test Intelligence documentation.
  2. API backward compatibility checks: Harness APIs conform to OpenAPI Specification 3.0. Many customers rely on our APIs and have built extensive automation around them, and the Harness UI is itself a consumer of those APIs. A backward-incompatible change could disrupt our customers' operations, so we are committed to not introducing one. These checks help us meet that commitment and keep the API experience consistent and reliable.
  3. Code Coverage: We use Sonar to measure code coverage for every pull request (PR) and compare the result against the baseline code coverage. If coverage falls below the threshold configured in Sonar, the developer cannot merge the PR until the code meets the required coverage. This keeps code quality high and reduces the likelihood of introducing issues or technical debt into our codebase. Coverage reports for some of our modules are publicly available.
  4. Code Reviews: At least two team members conduct a thorough review of each PR, checking for logic errors, compliance with coding standards, maintainability, and performance. This review helps preserve high-quality code while minimizing the probability of introducing bugs or accruing technical debt.
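
To make the API backward compatibility check more concrete, here is a minimal sketch of the kind of comparison such a check can perform. It is an illustration only, not the tooling Harness runs: the function name `find_breaking_changes` and the sample `/pipelines` path are hypothetical, the rules cover just three kinds of breaking change, and `$ref` entries are assumed to be already resolved.

```python
from typing import Any

# Operation keys that an OpenAPI 3.0 path item may contain.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def find_breaking_changes(old_spec: dict[str, Any], new_spec: dict[str, Any]) -> list[str]:
    """Compare two parsed OpenAPI documents and list changes that can break clients."""
    problems: list[str] = []
    old_paths = old_spec.get("paths", {})
    new_paths = new_spec.get("paths", {})
    for path, old_item in old_paths.items():
        if path not in new_paths:
            problems.append(f"removed path: {path}")
            continue
        new_item = new_paths[path]
        for method, old_op in old_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "summary"
            if method not in new_item:
                problems.append(f"removed operation: {method.upper()} {path}")
                continue
            new_op = new_item[method]
            # A response code that disappears can break clients that branch on it.
            for status in old_op.get("responses", {}):
                if status not in new_op.get("responses", {}):
                    problems.append(f"removed response {status}: {method.upper()} {path}")
            # A newly required parameter breaks callers that do not send it.
            old_required = {
                (p["name"], p["in"])
                for p in old_op.get("parameters", [])
                if p.get("required")
            }
            for p in new_op.get("parameters", []):
                if p.get("required") and (p["name"], p["in"]) not in old_required:
                    problems.append(
                        f"new required parameter '{p['name']}': {method.upper()} {path}"
                    )
    return problems


# Hypothetical specs: the new version drops DELETE and adds a required query parameter.
old = {"paths": {"/pipelines": {
    "get": {"responses": {"200": {}}},
    "delete": {"responses": {"204": {}}},
}}}
new = {"paths": {"/pipelines": {
    "get": {
        "parameters": [{"name": "orgId", "in": "query", "required": True}],
        "responses": {"200": {}},
    },
}}}

for problem in find_breaking_changes(old, new):
    print(problem)
```

A PR check built along these lines would fail whenever the returned list is non-empty. Purely additive changes, such as a new path or a new optional parameter, pass through untouched.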

Security checks

  1. Vulnerability scans: Harness, an open-source platform, relies heavily on numerous open-source libraries. We provide both SaaS and Self-Managed Platform versions to our customers, and customer security is our utmost priority, so it is crucial to have a vulnerability management process in place. Our STO (Security Testing Orchestration) product lets us shift this process to the left. Security testing tools, such as Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST), can be integrated into the PR process to identify potential security vulnerabilities and ensure adherence to security best practices within the application. The following is a representation of our actual pipeline.

Any Critical or High vulnerabilities identified in this step are automatically filed by STO as Jira issues. At that point, deploying the code to QA requires escalated approval.

  2. Git leaks: We mitigate the risk of data breaches and security incidents by proactively identifying and securely removing sensitive information from the code.
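
The triage rule for vulnerability scans described above (Critical and High findings become Jira issues, and QA deployment then needs escalated approval) can be sketched in a few lines. This is a simplified illustration, not how STO is implemented: the `Finding` and `TriageResult` types, the `triage` function, and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass

# Severities that trigger a ticket and an escalated approval in this sketch.
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


@dataclass(frozen=True)
class Finding:
    identifier: str  # scanner-specific id (made up in the example below)
    severity: str    # e.g. "CRITICAL", "HIGH", "MEDIUM", "LOW"
    component: str   # library or module the finding belongs to


@dataclass
class TriageResult:
    tickets_to_file: list[Finding]
    needs_escalated_approval: bool


def triage(findings: list[Finding]) -> TriageResult:
    """Pick the findings that need a ticket and decide whether QA deployment is gated."""
    blocking = [f for f in findings if f.severity.upper() in BLOCKING_SEVERITIES]
    return TriageResult(
        tickets_to_file=blocking,
        needs_escalated_approval=bool(blocking),
    )


# Hypothetical scan output.
scan = [
    Finding("FINDING-1", "critical", "example-json-lib"),
    Finding("FINDING-2", "medium", "example-http-lib"),
]
result = triage(scan)
```

Here `result.needs_escalated_approval` is true because one finding is critical, and only that finding would be turned into a ticket.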

Hygiene checks

  1. Code format, PMD & Checkstyle
  2. Automated linting
  3. Static code analysis checks

Tools like ESLint, Prettier, Clang, and SonarQube are integrated into the PR process to check the code for syntax errors, formatting inconsistencies, and potential issues, such as unused variables, vulnerabilities, or code smells.

This robust set of PR checks enables us to maintain the highest level of quality while minimizing the risk of introducing bugs or vulnerabilities to our software. By embracing shift-left testing in every process, we aim not only to deliver a superior user experience but also to foster a culture of continuous improvement and excellence throughout our organization.

Release

At Harness, we have developed a robust system release process to accommodate the diverse release cadences of our 50+ microservices, ensuring that each service is thoroughly tested and validated before deployment to production. Our process is designed to be flexible and efficient, catering to both independent release cycle services and those with a weekly cadence.

Daily Deployment to Pre-QA Environment

Daily, we deploy the most recent code updates for all microservices to a pre-QA environment, initiating a comprehensive execution of our service automation suite. This routine deployment facilitates the early detection of potential issues, enabling us to uphold stringent quality standards across all services. Upon the completion of automation testing, the results are disseminated via Slack, providing the development team with an opportunity to review and address any identified concerns.

Deployment to QA for E2E Validation

Once pre-QA sign-off is obtained, we can move our branch to the QA environment at any time, depending on the release cycle of the respective service. The development team has the authority to deploy the build to the QA environment, where an extensive set of automation tests is run to confirm that the service meets our quality benchmarks. Depending on the service type, the relevant teams also trigger automation for all dependent services to ensure that new changes do not disrupt any existing service flows. Like the pre-QA stage, this process is entirely automated: test failures trigger Slack notifications to the teams and generate Jira tickets that require urgent attention and resolution. If the tests succeed, the team can choose to deploy the service to production, delivering the latest features and enhancements to our users.
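
The outcome of a QA run, as described above, boils down to a small decision: any failure produces notifications and tickets, and a clean run allows promotion. The sketch below shows that logic under stated assumptions. The types, the `evaluate_qa_run` function, the message formats, and the service names are hypothetical, and the sketch only builds the message text rather than calling Slack or Jira.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AutomationResult:
    service: str
    test_name: str
    passed: bool


@dataclass
class GateDecision:
    can_promote: bool
    slack_messages: list[str] = field(default_factory=list)
    jira_summaries: list[str] = field(default_factory=list)


def evaluate_qa_run(results: list[AutomationResult]) -> GateDecision:
    """Turn raw test results into a promote/block decision plus follow-up messages."""
    failures = [r for r in results if not r.passed]
    # An empty run proves nothing, so it does not allow promotion in this sketch.
    decision = GateDecision(can_promote=bool(results) and not failures)

    # Group failing tests by service so each team gets one message and one ticket.
    failed_by_service: dict[str, list[str]] = {}
    for r in failures:
        failed_by_service.setdefault(r.service, []).append(r.test_name)

    for service, names in failed_by_service.items():
        decision.slack_messages.append(
            f"QA automation failed for {service}: {len(names)} test(s)"
        )
        decision.jira_summaries.append(
            f"[QA] {service}: fix failing tests: {', '.join(sorted(names))}"
        )
    return decision


# Hypothetical run with one failing test in each of two services.
run = [
    AutomationResult("service-a", "login", True),
    AutomationResult("service-a", "build", False),
    AutomationResult("service-b", "scan", False),
]
decision = evaluate_qa_run(run)
```

Grouping by service keeps the noise down: a team with ten failing tests still receives a single notification and a single ticket.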

Pre-Prod Environment for Core Platform Services

For core platform services, which have a more significant impact on the overall system, we employ a more rigorous release process. We create a pre-prod environment that mimics the production environment, with all production service versions and the latest version of the core platform service. In this pre-prod environment, we run all automation tests to validate the service’s stability, performance, and compatibility with other services.

Once the core platform service has passed all tests in the pre-prod environment, it is deemed ready for deployment to production. This thorough testing and validation process ensures that our core platform services are reliable and provide a solid foundation for the rest of our microservices.
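
The composition of the pre-prod environment (every service at its production version, except the core platform service under test) can be expressed as a simple version manifest. The sketch below is illustrative only: `build_preprod_manifest`, the service names, and the version numbers are assumptions, not Harness internals.

```python
def build_preprod_manifest(
    production_versions: dict[str, str],
    candidate_service: str,
    candidate_version: str,
) -> dict[str, str]:
    """Return the versions to deploy in pre-prod: production as-is, plus one candidate."""
    manifest = dict(production_versions)  # copy, so the production record is untouched
    manifest[candidate_service] = candidate_version
    return manifest


# Hypothetical services and versions.
production = {"platform-core": "1.4.0", "billing": "2.1.3", "notifications": "0.9.8"}
preprod = build_preprod_manifest(production, "platform-core", "1.5.0")
```

Because only one entry differs from production, a failure in the pre-prod automation run points directly at the candidate version.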

Deployment to Production & verification

At Harness, we place a great emphasis on templates to promote reusability and standardization. To this end, we leverage a single, highly refined template, which we refer to as our “golden template,” to deploy all of our services. The golden template is equipped with various prerequisite checks, mandatory security checks, manager approvals, and canary and rollout deployment strategies, all of which are designed to ensure the reliability and security of the service. In addition, we perform post-deployment checks to verify that the service is functioning as expected. A quick peek into the template:

Our quality assurance process is continuous, extending beyond deployment to production. As services are deployed, we leverage our CV (Continuous Verification) and SRM (Service Reliability Management) products to monitor the stability of new deployments and promptly notify the relevant teams. Simultaneously, we initiate sanity suites for all services in production following deployment. Test failures result in the creation of high-priority Jira tickets, prompting the respective teams to take immediate action. This approach enables us to swiftly roll back a service if any issues are detected, mitigating customer-facing degradation and ensuring a seamless user experience.
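
As a rough illustration of the post-deployment decision described above, the sketch below combines two signals, a stability metric and the sanity suite result, into a rollback decision. It does not reflect how CV or SRM work internally: the `PostDeployReport` type, the choice of error rate as the stability metric, and the 1% threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PostDeployReport:
    service: str
    baseline_error_rate: float  # error rate before the deployment, as a fraction
    new_error_rate: float       # error rate observed after the deployment
    sanity_failures: list[str] = field(default_factory=list)


def rollback_reasons(report: PostDeployReport, max_increase: float = 0.01) -> list[str]:
    """List the reasons, if any, for rolling the deployment back."""
    reasons: list[str] = []
    increase = report.new_error_rate - report.baseline_error_rate
    if increase > max_increase:
        reasons.append(f"error rate rose by {increase:.2%}")
    if report.sanity_failures:
        reasons.append(f"{len(report.sanity_failures)} sanity test(s) failed")
    return reasons


def should_roll_back(report: PostDeployReport, max_increase: float = 0.01) -> bool:
    """Roll back when at least one signal indicates a regression."""
    return bool(rollback_reasons(report, max_increase))


# Hypothetical reports: one healthy deployment and one degraded deployment.
healthy = PostDeployReport("service-a", baseline_error_rate=0.010, new_error_rate=0.012)
degraded = PostDeployReport(
    "service-b",
    baseline_error_rate=0.01,
    new_error_rate=0.05,
    sanity_failures=["checkout-flow"],
)
```

Returning the reasons rather than a bare boolean makes it straightforward to include them in the high-priority ticket that accompanies a rollback.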

SLO Monitoring

By following this flexible and thorough system release process, Harness is striving to maintain the highest standards of quality and performance across our diverse range of microservices, ensuring that we consistently deliver exceptional products to our users.
