Collecting CI/CD Insights: Essential KPIs for QA

João Coelho
5 min read · Jul 2, 2024

In your career as someone who influences Quality Assurance at your company, you may sometimes feel overwhelmed by the numerous areas that need attention.

One initiative that is often underappreciated is understanding where your company stands with Continuous Integration and Delivery/Deployment (CI/CD).

Have you ever considered essential KPIs (Key Performance Indicators) such as build success rate, build duration, which stage causes the most failures, or even deployment frequency?

These metrics are vital for establishing a performance baseline, monitoring growth over time, and showcasing progress and results.

In this article, I’ll show you how to retrieve some of these basic metrics, using Jenkins as the example CI/CD server.

Keep in mind that depending on your company, you might benefit from integrating Jenkins with your existing monitoring tools. The steps outlined here are monitoring-tool agnostic and involve building a script from scratch.

How can I fetch CI metrics?

The first step is to explore the Jenkins API documentation. However, to save you time, I’ll highlight what’s truly useful. 😄

First and most importantly, ensure you have a valid API token:

  1. Log into Jenkins and click on your profile;
  2. Go to the ‘Configuration’ section;
  3. Click on ‘Add new Token’;
  4. Name the new token and click ‘Generate’.

Next, identify useful endpoints. Use the following endpoint to get information on builds executed in a specific branch:

`${jenkinsUrl}/job/${repoName}/job/${branchName}/api/json?tree=builds[number,duration,result,url,timestamp]`

You will get information regarding the several builds that were executed in a specific branch, namely:

  • Duration (in ms);
  • Number of the build;
  • Result (for example, SUCCESS or FAILURE);
  • Full URL for build details;
  • An Epoch/Unix Timestamp (to know when it was executed).

Making a request to this endpoint programmatically is quite simple using axios and JavaScript:

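import axios from 'axios';

// jobUrl is the endpoint shown above; USERNAME and API_TOKEN are your Jenkins credentials.
const response = await axios.get(jobUrl, {
  auth: { username: USERNAME, password: API_TOKEN },
});
// The builds requested through the `tree` parameter are in the response body.
const { builds } = response.data;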

Once you have the details of the different builds, you can measure things like:

  • Mean Duration;
  • Mean Success Rate;
  • Distribution of duration and success rate variations over time.

For this, you would just need to play around with the metrics and perform some calculations.

To keep this guide from getting tedious, I will leave this to you as homework. 😜
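As a starting point for that homework, here is a minimal sketch of the two simplest calculations. It assumes `builds` is the array returned by the branch endpoint, that `duration` is in milliseconds, and that builds still running report a `result` of `null`. The function name `summarizeBuilds` and the sample data are hypothetical.

```javascript
// Summarize an array of builds shaped like { number, duration, result }.
// Builds without a result (assumed to be still running) are skipped.
function summarizeBuilds(builds) {
  const finished = builds.filter((b) => b.result !== null && b.result !== undefined);
  if (finished.length === 0) {
    return { count: 0, meanDurationMs: 0, successRate: 0 };
  }
  const totalDuration = finished.reduce((sum, b) => sum + b.duration, 0);
  const successes = finished.filter((b) => b.result === 'SUCCESS').length;
  return {
    count: finished.length,
    meanDurationMs: totalDuration / finished.length,
    successRate: successes / finished.length,
  };
}

// Hypothetical sample data, for illustration only.
const sampleBuilds = [
  { number: 4, duration: 0, result: null },
  { number: 3, duration: 120000, result: 'SUCCESS' },
  { number: 2, duration: 60000, result: 'FAILURE' },
  { number: 1, duration: 90000, result: 'SUCCESS' },
];
console.log(summarizeBuilds(sampleBuilds));
```

Grouping the builds by week or month (using `timestamp`) before calling the same function would give the variation over time.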

Now, if you want to know, for example, which stage in your build is causing the most failures or taking the most time to complete, you will need to get more details.

Each build returned by the previous request includes its full URL for build details. You can append a path to that URL and make another request:

`${retrievedUrl}/wfapi`

And with this request, you will get the following information:

  • Overall status, start and end time, total duration, and even pause duration;
  • An array of stages, each one containing its name, status (SUCCESS/FAILURE), duration, start and end time, and even pause duration.

With this, you can measure additional aspects, such as:

  • Which stage is responsible for the most failures;
  • Which stage is responsible for taking the most time.
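A minimal sketch of both measurements follows. The field names `stages`, `name`, `status` and `durationMillis` are assumptions about the response shape, so check them against a real response from your Jenkins instance. The failure label is also an assumption, which is why the sketch accepts both `FAILED` and `FAILURE`. The helper names and sample data are hypothetical.

```javascript
// Aggregate per-stage counts across several stage-level responses.
// Assumed shape of each run: { stages: [{ name, status, durationMillis }] }.
function aggregateStages(runs) {
  const stats = {};
  for (const run of runs) {
    for (const stage of run.stages || []) {
      const entry = stats[stage.name] || { runs: 0, failures: 0, totalDurationMs: 0 };
      entry.runs += 1;
      // Accept both spellings, since the exact failure label is an assumption.
      if (stage.status === 'FAILED' || stage.status === 'FAILURE') entry.failures += 1;
      entry.totalDurationMs += stage.durationMillis || 0;
      stats[stage.name] = entry;
    }
  }
  return stats;
}

// Return the stage name with the highest value for `field`, or null if there are no stages.
function topStageBy(stats, field) {
  let best = null;
  for (const [name, entry] of Object.entries(stats)) {
    if (best === null || entry[field] > stats[best][field]) best = name;
  }
  return best;
}

// Hypothetical sample data, for illustration only.
const sampleRuns = [
  { stages: [
    { name: 'Build', status: 'SUCCESS', durationMillis: 40000 },
    { name: 'Test', status: 'FAILED', durationMillis: 90000 },
  ] },
  { stages: [
    { name: 'Build', status: 'SUCCESS', durationMillis: 42000 },
    { name: 'Test', status: 'SUCCESS', durationMillis: 95000 },
  ] },
];
const stageStats = aggregateStages(sampleRuns);
console.log('Most failures:', topStageBy(stageStats, 'failures'));
console.log('Most time:', topStageBy(stageStats, 'totalDurationMs'));
```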

With many teams in your company, each potentially using custom names for their pipeline stages, you might find it challenging to draw useful conclusions from your data due to naming inconsistencies.

Go further: create a list of general stage name aliases, and map each team-specific stage name to its alias.

With this, you can group the several stages from different teams, and get unified metrics!
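One way to sketch the alias idea is a lookup table from a general alias to the stage names teams actually use. The aliases, stage names and the `toAlias` helper below are all hypothetical; replace them with the names found in your own pipelines.

```javascript
// Hypothetical alias table: general alias -> team-specific stage names (lowercase).
const STAGE_ALIASES = {
  build: ['build', 'compile', 'npm build'],
  test: ['test', 'unit tests', 'run tests'],
  deploy: ['deploy', 'release'],
};

// Map a raw stage name to its alias; unknown names fall back to 'other'.
function toAlias(stageName, aliases = STAGE_ALIASES) {
  const normalized = stageName.trim().toLowerCase();
  for (const [alias, names] of Object.entries(aliases)) {
    if (names.includes(normalized)) return alias;
  }
  return 'other';
}

console.log(toAlias(' Unit Tests '));
console.log(toAlias('Lint'));
```

Applying `toAlias` to each stage name before aggregating lets stages from different teams land in the same bucket. Keeping an explicit 'other' bucket makes unmapped names easy to spot.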

Now, at this point, you are equipped with the knowledge to get some basic KPIs regarding CI.

How can I fetch CD metrics?

Some CD metrics are useful as well.

All the metrics retrieved earlier for CI can also be collected here.

Keep in mind that you may not be able to use the same URL as before, since you now want to access the builds associated with release tags.

Examining these tagged builds is crucial for collecting CD metrics, as typically, the creation of a tag triggers the corresponding pipeline for deploying to production.

The endpoint goes as follows:

`${jenkinsUrl}/job/${repoName}/view/tags/api/json`

If you perform a request to this endpoint, you will retrieve information, such as:

  • Name of the tag;
  • Full URL for build details;
  • Result, represented as a color: ‘notbuilt’, ‘blue’ (success), ‘red’ (failure), and many others.

With this alone, you can only get the overall success rate.
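As a rough sketch, the overall success rate can be derived from the colors. It assumes the tag entries have already been extracted from the response as objects with `name` and `color`, and it counts only ‘blue’ and ‘red’ as finished, ignoring every other color. The function name and sample tags are hypothetical.

```javascript
// Compute the overall success rate from tag entries shaped like { name, color }.
// Only 'blue' (success) and 'red' (failure) count as finished; other colors are ignored.
function tagSuccessRate(tags) {
  const finished = tags.filter((t) => t.color === 'blue' || t.color === 'red');
  const succeeded = finished.filter((t) => t.color === 'blue').length;
  return {
    finished: finished.length,
    succeeded,
    successRate: finished.length === 0 ? 0 : succeeded / finished.length,
  };
}

// Hypothetical sample data, for illustration only.
const sampleTags = [
  { name: 'v1.0.0', color: 'blue' },
  { name: 'v1.0.1', color: 'red' },
  { name: 'v1.1.0', color: 'blue' },
  { name: 'v1.2.0', color: 'notbuilt' },
  { name: 'v1.3.0', color: 'blue' },
];
console.log(tagSuccessRate(sampleTags));
```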

If you want to dig deeper, and get details regarding the builds in each tag, re-use the previously fetched ‘Full URL for build details’:

`${retrievedUrl}/${buildNumber}/api/json`

Normally, the value of that ‘buildNumber’ is 1, since only one build is used for each tag. But in case you are not sure of that, you’ll have to iterate until you find the latest one that does not return a 404.
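The iteration can be sketched as below. `fetchBuild` is a hypothetical callback that resolves to the build JSON for a given build number, or to `null` when the server answers 404. The sketch also assumes build numbers are contiguous starting at 1 and caps the search with a hypothetical `maxBuilds` limit.

```javascript
// Walk build numbers upward and keep the last one that exists.
// fetchBuild(n) is a hypothetical callback: build JSON, or null on a 404.
async function findLatestBuild(fetchBuild, maxBuilds = 50) {
  let latest = null;
  for (let n = 1; n <= maxBuilds; n += 1) {
    const build = await fetchBuild(n);
    if (build === null) break; // first missing build number: stop searching
    latest = build;
  }
  return latest;
}

// Stubbed callback standing in for the real HTTP request: builds 1 to 3 exist.
const fakeFetchBuild = async (n) => (n <= 3 ? { number: n, result: 'SUCCESS' } : null);
findLatestBuild(fakeFetchBuild).then((build) => console.log(build));
```

With axios, such a callback could wrap the request in a `try`/`catch` and return `null` when `error.response?.status === 404`, rethrowing any other error.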

Using this endpoint, you’ll get access to information such as:

  • Artifacts;
  • Duration (ms);
  • Result (for example, SUCCESS or FAILURE);
  • An Epoch/Unix Timestamp (to know when it was executed);
  • And many others…

With this, and iterating through each tag, you can get a sense of:

  • Mean Duration;
  • Mean Success Rate;
  • Distribution of duration and success rate variations over time;
  • Deployment Frequency.

The ‘Deployment Frequency’ can be calculated once you fix the date range you are considering.

If, for example, you are fetching data from the last 2 months, divide the total number of deployments by the number of months, weeks, or days in that range to get the monthly, weekly, or daily deployment rate.
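A minimal sketch of that division, assuming each deployment is represented by its epoch timestamp in milliseconds and approximating a month as 30 days. The function name and the sample range are hypothetical.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Count deployments inside [rangeStartMs, rangeEndMs) and convert to rates.
function deploymentFrequency(timestamps, rangeStartMs, rangeEndMs) {
  const inRange = timestamps.filter((t) => t >= rangeStartMs && t < rangeEndMs);
  const days = (rangeEndMs - rangeStartMs) / MS_PER_DAY;
  return {
    deployments: inRange.length,
    perDay: inRange.length / days,
    perWeek: inRange.length / (days / 7),
    perMonth: inRange.length / (days / 30), // a month approximated as 30 days
  };
}

// Hypothetical example: a 60-day range with one deployment every 5 days.
const rangeStart = Date.UTC(2024, 4, 1); // May 1, 2024
const rangeEnd = Date.UTC(2024, 5, 30); // June 30, 2024
const deployTimestamps = Array.from({ length: 12 }, (_, i) => rangeStart + i * 5 * MS_PER_DAY);
console.log(deploymentFrequency(deployTimestamps, rangeStart, rangeEnd));
```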

Also, if you want more details about the build stages, you can use the same endpoint described before, this time on the build URL that already includes the build number, and read the same fields:

`${buildUrl}/wfapi`

Other valuable KPIs, such as Mean Time to Recovery or Lead Time, can also be captured and calculated, though they won’t be addressed in this article.

Additionally, there are various methods to present this collected information, from Excel to Looker and more.

Keep an eye out for future posts related to these two topics! 👀

Conclusion

Understanding and leveraging Continuous Integration and Continuous Delivery (CI/CD) metrics is crucial for improving the quality assurance processes within your company.

By effectively monitoring key performance indicators such as build success rates, duration, failure stages, and deployment frequencies, you can establish a baseline for performance, track growth, and demonstrate tangible progress over time.

Using tools like Jenkins and making use of its API, you can efficiently fetch and analyze these metrics to gain valuable insights.

This approach not only enhances the reliability and efficiency of your CI/CD pipelines but also contributes to delivering high-quality software products that meet user expectations and business goals.

I hope you enjoyed reading this article!

My name is João Coelho, and I am currently a QA Automation Engineer at Talkdesk. Lately, I have been writing articles about automation, QA, and software engineering topics that might not be well known in the community.

If you want to follow my work, check my LinkedIn and my author profile on Medium!

