How Pipedrive Supports Quality Releases while Deploying 50+ Times Per Day

Valeriia Iuzhakova
Published in Pipedrive R&D Blog · 8 min read · Oct 15, 2020

“I look at each day as a chance to move one notch above yesterday — whether it’s in service quality, delivery, speed, or any other aspect of the business.” — Daniel Snyder

Intro — Speed vs. Quality

Customers are always on the lookout for solutions that fulfill the need they have identified while also saving them money. As companies work to build something efficient and cost-saving, they also want their product to be considered high quality, and Pipedrive is no exception. When you develop a product that not only offers top-notch features but also stands the test of time, you have a product customers can enjoy and you can feel proud of.

However, we shouldn’t lose sight of another indicator of value, one that can also influence quality: the speed of delivery. Delivery speed matters because it secures a competitive advantage. On average, Pipedrive makes about 500 deployments to production per week, with more than 250 developers and without a dedicated testing department.

The problem is that speed can come at the price of quality, and keeping the two in balance requires effort. How have we maintained both speed and quality for 10 years? Before we get into that, let’s first look at the processes we follow at Pipedrive.

Pipedrive’s “under the hood” processes

To bridge the communication gap among the various teams working simultaneously, we follow DevOps principles in software development. This also enables faster delivery and feedback.

To adopt DevOps without hindering the teams' release processes, we have nearly eliminated the need for dedicated testing specialists. In our dev-centric environment, developers are responsible for testing and deploying their own changes, and we bridge the gap by relying on Continuous Testing that is focused on test automation.

Rather than providing a safety net to catch failures, we help teams adapt. For this purpose, we have introduced several specialized teams, such as DevOps, SRE, QA Analysts, Support Engineers, Infrastructure Engineers, and Agile/Personal coaches, who support all the tribes: product development units aligned around certain product areas.
For example, the Site Reliability Engineering team collaborates with product and development teams to create scalable architecture while improving the performance, stability, and reliability of our services.

Now that you have a better understanding of Pipedrive’s setup, let’s move on to how we ensure fast delivery and built-in quality.

How we’re maintaining delivery speed

A fully automated release process

On average, we automatically test and perform up to 500 deploys per week (you can find out more about how we achieve this in the “Fueling the Rocket for 500 deploys per week” article).

Feature flags/toggles

In our Continuous Delivery system, we want new features to be part of the daily releases. If a feature still needs time to be completed, its code is disabled by a feature flag during these daily releases. This allows us to push code to production incrementally, with each release being small and easy to manage.
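To make the idea concrete, here is a minimal sketch of what flag-gated code can look like. The `FlagClient` interface and the `new-deal-timeline` flag name are hypothetical illustrations, not our actual flag service API:

```typescript
// Minimal sketch of feature-flag gating (hypothetical API).
interface FlagClient {
  isEnabled(flag: string, context: { companyId: number }): boolean;
}

function renderDealView(flags: FlagClient, companyId: number): string {
  // Unfinished code ships to production with every daily release,
  // but stays dark until the flag is switched on.
  if (flags.isEnabled("new-deal-timeline", { companyId })) {
    return renderNewTimeline();
  }
  return renderLegacyTimeline();
}

function renderNewTimeline(): string {
  return "new timeline";
}

function renderLegacyTimeline(): string {
  return "legacy timeline";
}
```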

Mission framework

We’ve adopted our own Pipedrive Agile Framework, where development is focused on product areas aligned with engineering tribes. One benefit of the tribes is their dedicated focus on a specific feature for a set time, which ultimately delivers faster results. (You can learn more about the framework from the “Scaling Pipedrive Engineering — From Teams to Tribes” article.)

How we maintain software quality

Ensuring quality in DevOps remains an intimidating problem, and one that requires conscious, ongoing effort to keep in balance with speed.

Here’s one of the possible paths to take:

Controlled rollouts

To phase in and control the continuous delivery of new features, we use (as previously mentioned) feature flags. We adopted a phased approach in which only a percentage of the user base can access a feature at first. If our support tickets and monitoring indicate that the initial rollout was successful, we increase the percentage gradually until 100 percent of the user base has the feature. This controlled approach ensures that new features reach customers with minimal risk.
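As an illustration of percentage-based gating, the sketch below deterministically buckets each company into one of 100 buckets, so the rollout percentage can be raised (say, 10% → 50% → 100%) without users flickering in and out of a feature. The function names and flag are hypothetical, not our production code:

```typescript
import { createHash } from "crypto";

// Hypothetical sketch: derive a stable bucket (0-99) per flag + company,
// so the same company always lands in the same bucket for a given flag.
function bucketFor(flag: string, companyId: number): number {
  const digest = createHash("sha256").update(`${flag}:${companyId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

function isInRollout(flag: string, companyId: number, percentage: number): boolean {
  return bucketFor(flag, companyId) < percentage;
}

// Example: roll "new-deal-timeline" out to 10% of companies first.
console.log(isInRollout("new-deal-timeline", 42, 10));
```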

Testing automation

As mentioned, we rely on Continuous Testing focused on test automation. Test automation lets us test quickly and efficiently with fewer people involved. It runs in the CI/CD pipeline and provides fast feedback to the teams, which in turn enables frequent releases.
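As a trivial example of the kind of automated check that runs in such a pipeline, here is a hypothetical Jest-style unit test; both the function under test and the test itself are illustrative, not taken from our codebase:

```typescript
// Hypothetical unit under test: discount calculation for a deal.
export function applyDiscount(value: number, percent: number): number {
  if (percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return value - (value * percent) / 100;
}

// Jest-style tests executed automatically on every pipeline run,
// failing the build (and blocking the deploy) on regression.
describe("applyDiscount", () => {
  it("reduces the value by the given percentage", () => {
    expect(applyDiscount(200, 25)).toBe(150);
  });

  it("rejects out-of-range percentages", () => {
    expect(() => applyDiscount(200, 150)).toThrow(RangeError);
  });
});
```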

Actionable data

We use solid metrics tied directly to customer satisfaction. We also collect customer feedback to learn more about customer experiences, and we track changes in customer NPS as an indicator of the current state of satisfaction and the potential risk of churn.

In addition to measuring how Pipedrive is used, we measure how effectively we run our projects, how our missions/launchpads are run, and how many bugs are being created vs. resolved, taking action when necessary.

DevOps launchpad metrics dashboard in Grafana
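As a toy illustration of the “created vs. resolved” trend such a dashboard tracks, here is a minimal sketch; the `Bug` shape and the `netBugGrowth` helper are hypothetical, while our real dashboards live in Grafana and Tableau:

```typescript
// Toy sketch of one operational metric: weekly bugs created vs. resolved.
interface Bug {
  createdAt: Date;
  resolvedAt?: Date;
}

function netBugGrowth(bugs: Bug[], weekStart: Date, weekEnd: Date): number {
  const within = (d: Date) =>
    d.getTime() >= weekStart.getTime() && d.getTime() < weekEnd.getTime();
  const created = bugs.filter((b) => within(b.createdAt)).length;
  const resolved = bugs.filter(
    (b) => b.resolvedAt !== undefined && within(b.resolvedAt)
  ).length;
  // A positive number means the bug backlog is growing: a signal to act.
  return created - resolved;
}
```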

Last but not least, we also track quality-related metrics. Currently, our main measures of quality are as follows:

  • product stability — indicates the overall stability of the product as a whole, as experienced by the customer. The figure is impacted by incident count and duration (a toy stand-in is sketched after this list).
  • critical incidents — count, duration, and the features affected by these incidents.
  • open production bugs — the total number of such bugs, and the number of them that breach their SLA.
  • new reported cases in Support Engineering — cases reported by customers about the issues they are having.
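We don’t publish the exact stability formula here, only that incident count and duration feed into it, so the snippet below is a purely illustrative stand-in: a toy score that penalizes minutes of incident downtime over a reporting period.

```typescript
// Purely illustrative: a toy stability score, NOT Pipedrive's real formula.
interface Incident {
  durationMinutes: number;
}

function stabilityScore(incidents: Incident[], periodMinutes: number): number {
  const downtime = incidents.reduce((sum, i) => sum + i.durationMinutes, 0);
  // 100 = no incidents at all in the period; lower = less stable.
  return Math.max(0, 100 * (1 - downtime / periodMinutes));
}

// Example: two incidents totalling 90 minutes in a 7-day window.
console.log(stabilityScore([{ durationMinutes: 60 }, { durationMinutes: 30 }], 7 * 24 * 60));
```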

Importantly, the metrics we gather show visible trends, indicating where a particular area is heading and making data-driven software quality decisions possible. The metrics are collected in Tableau and shared in a dedicated channel on a weekly basis.

The main panel of our Quality key metrics board in Tableau

All these metrics are a crucial part of getting buy-in across the organization and bringing credibility to quality concerns.

Aligned stakeholders

We make extensive use of actionable data. For instance, we send weekly engineering operational metrics to relevant stakeholders, such as Engineering Managers. This improves visibility and enables data-driven quality decisions across the organization.

QA Analysts

We adopted the role of the QA Analyst to support the product organization and the various tribes: ensuring that the application receives sufficient testing in the areas that matter most, identifying opportunities for automated tests, and supporting manual testing needs by organizing bug bash sessions when needed. Their focus is on organization-wide quality-related initiatives, metrics visualization, analysis, and mission planning.

QA Ambassadors

There is also a dedicated rotating role that brings more focus on software quality to the everyday development process. The role focuses on raising awareness of and interest in quality, driving quality-related initiatives, and helping the tribe make better decisions about quality.

Zero tolerance for deferment

QA Analysts and Ambassadors advocate for the customer and help teams resist the temptation to defer quality-impacting issues. Sometimes deferral is unavoidable due to time-to-market pressures, but whenever possible they encourage the teams to fix issues immediately, to avoid accumulating them along the way.

Dogfooding

For some important areas, our employees might be asked to use the pre-release version of the software we make to discover problems and provide feedback.

Definition of "done"

We have a set of criteria defining “done”, described in our Release/Pre-landing checklist. We create such checklists alongside the mission/project documentation as a reminder of what should be done during the mission and after landing to the launchpad, in order to meet a certain level of technological maturity.

Excerpt from the Pre-landing / Release checklist

To create a suitable checklist from a template, we have added the Create From Template macro to each Mission page, which creates a child Confluence page.

Clear policies and agreements that advocate for quality

To reach and sustain the desired level of process and software quality, certain policies and guidelines need to be adhered to.

One example of such an agreement is the production bug management process. Its purpose is to identify and prioritize bugs affecting production and assign them ETAs, making sure that bugs are handled in a timely manner to minimize the impact on the business.
It is important to remember that even non-severe issues can have a compounding negative impact on the customer’s perception of quality and degrade product usability.

Conclusion

So what can be done to achieve a balance between speed and quality? There is no simple answer or magic solution, but you can take some ideas from what we are trying to create and from what we haven’t perfected yet:

Testing in production

Testing in production is one way to tilt the balance toward speed. By releasing software earlier and adopting practices that let users test it, the development team can mitigate risk once the software is in operation. Decisions can then be driven by data from actual users interacting with production software.

DevOps

Another straightforward step is to adopt DevOps and SRE practices. They have helped us reduce time to market while bringing more transparency and stability.

Quality culture

A quality culture does matter! According to the Harvard Business Review, companies with a strong culture of quality make half as many mistakes as those without. This translates into real business value: improving your quality culture can save a lot of money and, as a result, lead to higher user satisfaction.

Confidence in the problem

Above all, you need good common sense. If you’ve validated the problem and proven it to be really important to customers, you shouldn’t take shortcuts; instead, implement the solution with the best quality possible.

Next steps

That’s not all: we are about to introduce mini-live, or region-based, canary deployments, which will be used to validate new deployments before propagating them to a wider base of our customers. We have also started working on quality- and testing-related training. There is a lot more to come, but that will be for a separate future article.
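To illustrate the idea (not our actual implementation, which is still in the works), a region-based canary can be as simple as routing requests from a pilot region to the new version; the region names here are hypothetical:

```typescript
// Hypothetical sketch of region-based canary routing: serve the new
// version only to requests from a pilot region before a global rollout.
type Version = "stable" | "canary";

const CANARY_REGIONS = new Set(["eu-north"]); // pilot region, illustrative only

function versionFor(region: string): Version {
  return CANARY_REGIONS.has(region) ? "canary" : "stable";
}

console.log(versionFor("eu-north")); // "canary"
console.log(versionFor("us-east")); // "stable"
```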

Thanks to all the reviewers for their thoughtful comments, and especially Aleksei Suvorov for being a real editor.

Interested in working at Pipedrive?

We’re currently hiring for several different positions in several different countries/cities.

Take a look and see if something suits you

Positions include:

— iOS Developer
— Database Engineer
— Full Stack Developer
— DevOps Engineer
— Software Developer
— And several more
