How We Built Our Own CI/CD Framework From Scratch (Pt. I)

Jorge Claro
Published in Pipedrive R&D Blog · 5 min read · May 6, 2020

Even though Pipedrive has a large group of experienced professionals, we’ve still felt the pain of technical debt in a few areas and our CI/CD pipelines (Continuous Integration & Continuous Delivery) are no exception.

Problems with the CI/CD Framework:

  • Complex, long and hard to maintain
  • Only maintained by our DevOps team
  • Implemented in Groovy and highly dependent on both Jenkins and its plugins
  • No strict code-style conventions — no linting tools or even error-handling guidelines
  • Lacks both unit and functional tests, and isn't properly versioned

The current state of the framework doesn't make it particularly appealing for developers to contribute, even the more experienced ones.

Should we refactor it?

At first we thought about refactoring the pipelines into several Jenkins libraries that could be tested, shared and reused as needed. To do so, we would need to configure an IDE with all the tools and dependencies to address the needs referred to above, but after some trials, getting it to work properly turned out to be more difficult than expected. As it turned out, refactoring would make us even more tightly dependent on Jenkins specifics, and thus make it even harder to move away from it in the future.

Are there any solutions on the market that could help us?

We took a look at the market to see how other companies were dealing with this issue. Most of the choices were quite expensive and/or limited, with no option to run them on-premises, which narrowed our range of options even further. Making a decision based on all of these factors was becoming more and more difficult — that is, until some of us raised an interesting question…

How hard would it be to implement our own CI/CD framework from scratch, built to fit Pipedrive's needs directly?

When you first think of doing this from scratch, it sounds kinda crazy, right? Why would Pipedrive spend resources on both research and development just to reinvent the wheel when there are already so many alternatives on the market that could fit these needs? Would the benefit ever outweigh the effort?

To answer this, let’s pinpoint some decisive factors:

  • Infrastructure would be 100% containerized (excluding critical resources)
  • Pipedrive has the people with the required skills to pull this off
  • We can tailor the framework according to our specific needs instead of adapting it to the limitations of external tools
  • We can run it on-premises without paying any kind of service subscriptions

After some weeks of preparation and discussion, we decided to start the mission (as we call it), beginning with research and a proof of concept. We came up with a few base requirements that would define the core of this project:

  • Fully containerized: every command runs as an independent block
  • Support execution of sequential and parallel commands
  • Configured as code
  • Concise logging with only the relevant information, with the option to drill into the details when necessary
  • Testable at both unit and functional levels
  • Support templating, but also be customizable as needed
  • Automatic and secure credential handling for each individual command execution

We know that containers allow us to execute sequential and parallel commands, isolate the processing scope, and even control the I/O and performance of each of them. To accomplish this, however, we would need to provide a convenient way for developers to set those configurations and pass them on to the Docker daemon API.
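As a rough illustration, such a translation layer might map a declarative step specification onto the options accepted by docker-py's `containers.run()`. The step-spec keys and limits below are illustrative assumptions, not Pipedrive's actual schema:

```python
# Hypothetical sketch: translating a declarative step spec into
# keyword arguments for docker-py's client.containers.run().
# The step dictionary shape is an assumption for illustration.

def step_to_run_kwargs(step: dict) -> dict:
    """Map a step specification to Docker daemon run options."""
    kwargs = {
        "image": step["image"],
        "command": step.get("command"),
        "environment": step.get("env", {}),
        "detach": True,  # run the step container in the background
    }
    if "memory" in step:
        kwargs["mem_limit"] = step["memory"]           # e.g. "512m"
    if "cpus" in step:
        kwargs["nano_cpus"] = int(step["cpus"] * 1e9)  # fractional CPUs
    return kwargs

# The engine would then start the step with something like:
#   import docker
#   client = docker.from_env()
#   container = client.containers.run(**step_to_run_kwargs(step))
```

Keeping the mapping in one pure function like this also makes the configuration logic testable without a running Docker daemon.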

Together with container specifications, we also need to configure the flow of execution. After doing some research into pipeline configuration structures, we decided to keep things simple and limit the complexity of our engine to these three main concepts:

step — the smallest unit of the pipelines, mapping directly to a container. Steps should not be tied to a specific instance of the flow, but instead be considered configurable, shared elements with the specifications referred to above.

stage — a group of steps that should be run in parallel as part of the flow definition.

pipeline — a group of stages that should be executed in sequence, considered as the complete flow.
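A pipeline built from these three concepts might be declared as a config-as-code fragment along these lines. The syntax here is hypothetical — the article doesn't show Fregatt's actual configuration format — but it illustrates the relationship between the concepts:

```yaml
# Hypothetical pipeline definition (illustrative syntax only)
pipeline: build-and-test
stages:                 # stages run in sequence
  - name: build
    steps:              # steps within a stage run in parallel
      - npm-install
      - docker-lint
  - name: test
    steps:
      - unit-tests
      - functional-tests
steps:                  # shared, reusable step definitions
  npm-install:
    image: node:12
    command: npm ci
```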

Expected results

The result of a pipeline should depend on the results of its stages and steps, following a fail-fast approach: if any of them fails, the flow should stop and be considered unsuccessful.
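The fail-fast semantics described above can be sketched in a few lines: stages run in sequence, steps within a stage run in parallel, and the first failing stage aborts the whole flow. The step and stage shapes here are simplifications, not the engine's real types:

```python
# Minimal fail-fast sketch: stages in sequence, steps in parallel.
# `stages` is a list of stages; each stage is a list of steps.
# `run_step` is any callable returning True on success (illustrative).
from concurrent.futures import ThreadPoolExecutor


class PipelineError(Exception):
    """Raised when a stage fails, aborting the whole pipeline."""


def run_pipeline(stages, run_step):
    for i, stage in enumerate(stages):
        # Run every step of this stage in parallel, wait for all.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run_step, stage))
        if not all(results):
            # Fail fast: stop the flow at the first failing stage.
            raise PipelineError(f"stage {i} failed")
    return True
```

Waiting for the whole stage before checking results keeps the model simple; a more aggressive variant could cancel sibling steps as soon as one fails.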

Logs should only provide relevant information as immediate feedback. In most cases it's enough to know in which step or stage of the pipeline the error occurred, and where to find the corresponding log section. Of course, fully detailed logs should still be available, just not presented as the first level of debugging.

The pipeline engine should be testable not only at the unit level, based on the code implementation, but also at the functional level: a predefined set of known test flows should complete successfully before we consider the pipeline engine healthy.

Credential handling was one of the most difficult requirements to plan out. Credentials shouldn't be logged anywhere in any form, nor be included in any Docker image layers. To achieve this, we decided that credentials should only be fetched and injected into containers during the start process of each step.
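A minimal sketch of that idea: secrets are resolved just before a step's container starts and passed only as environment variables, so they never land in image layers, while anything written to logs is masked. `fetch_secret()` and the secret names are illustrative stand-ins, not the actual implementation:

```python
# Hypothetical per-step credential injection (illustrative names).

def fetch_secret(name: str) -> str:
    # Stand-in for a call to a real secret store; values here are fake.
    return {"NPM_TOKEN": "s3cr3t"}[name]


def inject_credentials(step: dict) -> dict:
    """Build the step's container environment, resolving its secrets
    only at start time (never baked into an image)."""
    env = dict(step.get("env", {}))
    for secret_name in step.get("secrets", []):
        env[secret_name] = fetch_secret(secret_name)
    return env


def redacted(env: dict, secret_names) -> dict:
    """The view that may be logged: secret values are masked."""
    return {k: ("***" if k in secret_names else v) for k, v in env.items()}
```

Separating the real environment from its redacted view makes it harder for a logging call to leak a secret by accident.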

At this point we considered the core of our engine well defined and could start implementing a proof of concept — an engine that fires up multiple Docker containers in sequence or in parallel, which led us to name it "Fregatt" (frigate in English).

NRP Vasco da Gama in Tallinn, March 2008

Read Part II, where we cover in more detail how it works and how we develop, version, and test engine updates without breaking Pipedrive's production pipelines.


DevOps Engineer @ Critical Techworks | BMW Group, Microcontroller Hobbyist and Scuba Diver