Definition of CI
Martin Fowler defines Continuous Integration (CI) as follows:
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily — leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.
Why does it matter to DevOps? Again, it comes down to velocity: being able to merge frequently is the foundation of deploying, gathering feedback, and improving frequently. CI is the foundation of DevOps. Without CI tools, an important part of the software development lifecycle couldn't run smoothly, error-free, and in an automated fashion.
What Does CI Do Exactly?
A CI tool is just a tool for executing automated tasks. It integrates with source code management systems, triggers builds and tests automatically based on certain criteria, executes configurable commands, and can report the build/test status back to the source code management system.
From this short description, you can already see two major benefits of CI:
- Increased productivity. It allows your team to integrate their work frequently.
- Quality. CI tools also help you keep your quality standard high because they run builds and tests every time you push a commit. (To be precise, this is not 100% true: you can configure exactly when they run instead of running on every commit.)
In DevOps, we want both productivity and quality.
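To make the "you can configure when exactly it runs" point more concrete, here is a minimal Python sketch of how trigger criteria might be evaluated. The `Trigger` class, its fields, and the event names are hypothetical and only illustrate the idea; they are not any particular CI tool's configuration schema.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Tuple


@dataclass(frozen=True)
class Trigger:
    """Hypothetical trigger rule: which events and branches start a pipeline."""
    events: Tuple[str, ...]                # e.g. ("push", "pull_request")
    branches: Tuple[str, ...] = ("*",)     # glob patterns for the branch name


def should_run(trigger: Trigger, event: str, branch: str) -> bool:
    """Return True when the incoming event satisfies the trigger rule."""
    if event not in trigger.events:
        return False
    return any(fnmatchcase(branch, pattern) for pattern in trigger.branches)


# Assumed example rule: run only for pushes to master or to release branches.
release_trigger = Trigger(events=("push",), branches=("master", "release/*"))

if __name__ == "__main__":
    for event, branch in [("push", "master"), ("push", "feature/login")]:
        print(event, branch, should_run(release_trigger, event, branch))
```

With a rule like this, a push to a feature branch would not start the pipeline, which is exactly why "runs on every commit" is only an approximation.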
There isn't anything fancy about CI, though. It's just a highly configurable tool that lets you define different pipelines and steps. In each step, you can run commands to achieve almost anything.
For example, as one of the last steps of your pipeline, you can run kubectl apply or helm install so that your code is deployed to a Kubernetes cluster. There is a tendency, however, to do deployments with a separate kind of tool: Continuous Deployment (CD) tools.
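The pipeline-and-steps idea can be sketched in a few lines of Python. This is only an illustration of "run some commands in order and stop when one fails", not how any specific CI tool is implemented. The step names and commands are placeholders I made up so the sketch runs anywhere Python is installed.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (step name, command and its arguments)


def run_pipeline(steps: Sequence[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first one that exits non-zero.

    Returns (succeeded, names of the steps that were actually executed).
    """
    executed: List[str] = []
    for name, command in steps:
        executed.append(name)
        result = subprocess.run(command)
        if result.returncode != 0:
            return False, executed
    return True, executed


# Placeholder commands. In a real pipeline these would be your build, test,
# and deploy commands; the last step might be something like
# ["kubectl", "apply", "-f", "k8s/"], where "k8s/" is a hypothetical path.
PY = sys.executable
example_steps: List[Step] = [
    ("build", [PY, "-c", "print('building')"]),
    ("test", [PY, "-c", "print('testing')"]),
    ("deploy", [PY, "-c", "print('deploying')"]),
]

if __name__ == "__main__":
    ok, ran = run_pipeline(example_steps)
    print("pipeline succeeded" if ok else "pipeline failed", ran)
```

Because a failing step stops the run, the deploy step is only reached when the build and the tests pass.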
Using CI in combination with CD tools, you can test your code in multiple environments with different configurations and trigger additional tests (like a performance test in one specific environment and an end-to-end test in another). Every step can be automated all the way to production.
Most Used CI Tools on GitHub
An analysis done by GitHub, based on the most used commit status contexts, shows which CI tools are most used with GitHub. The top three are Travis CI, Circle CI, and Jenkins.
Although this analysis was done back in 2017, I would still consider the result relatively meaningful. If you search for Travis CI, Circle CI, and Jenkins on GitHub, you will find tens of millions of results in code, while other choices are used far less.
At the end of 2019, GitHub Actions (the CI created by GitHub) became generally available, and it is a rising star. If you search for GitHub Actions configuration files on GitHub, you can already find more than one million results.
These top four options can probably cover anything you want to do with a CI tool.
Deploy Your Own CI On-Premise?
As a rule of thumb, you probably don't want to do this, especially in an agile, DevOps setup. The major reason is that maintaining the CI tool itself brings operational overhead.
You might think running your own CI is cheaper because, for example, if you run Jenkins yourself, you don't need to pay anything. While there is no bill for Jenkins itself, there are other operational costs: maintaining it, upgrading it, backing it up, making it highly available, and so on. These operational costs increase the total cost of ownership.
The time and effort your team spends maintaining the CI could instead be used to invent and implement something else, something critical for the business, something that could generate income.
Some businesses are subject to regulations and restrictions that force them to run their own CI; this is one exception. Another exception is a huge corporation, where investing in a dedicated team to maintain the CI for teams around the globe might be well worth it. Whether you want to use Travis, Circle, Jenkins, or GitHub Actions, an on-premise deployment is possible.
No matter which CI tool you choose, you are locked in. This is a fact.
Modern CI tools are not terribly difficult to learn, though; if you really decide to move to another CI tool later, the cost isn’t that huge. So the lock-in issue isn’t significant.
Nowadays, public cloud providers also offer CI tools. If you do not operate in a multi-cloud setup, they are probably worth a try too. But if such a tool is harder to use (for example, because of its graphical user interface) or has a smaller user base (for example, you run into an issue and can't find much help from the community), I would think twice before using it.
Choosing the Best CI
CI is just a tool. The goal is to increase your productivity. If using the tool is too complicated, it reduces your productivity. A good CI tool should be easy to use, first and foremost.
Luckily, most modern CI tools use YAML configuration, which is relatively easy to set up and understand. However, there are still some crucial aspects that might affect your productivity with a certain CI tool:
Ease of Integration
A good CI tool should integrate well with the source code management system of your choice, with minimal configuration.
If, every time you create a new repo, you need to set up webhooks so that the CI tool can push build status back to your pull request, that is too much operational overhead, especially with a microservice architecture where you have maybe 50 services and the number is still growing. Although you can mitigate this with automation that creates the repo-CI integration, it is still operational overhead.
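For context, pushing a build status back usually means the CI tool calls the source code management system's API. Here is a rough Python sketch that builds such a request in the shape of GitHub's commit status REST endpoint (`POST /repos/{owner}/{repo}/statuses/{sha}`). It only constructs the request and does not send it. The function name is my own, and the owner, repo, commit SHA, context, and token in the demo are placeholders.

```python
import json
import urllib.request

# States accepted for a commit status.
VALID_STATES = {"pending", "success", "failure", "error"}


def build_status_request(owner, repo, sha, state, context, token, description=""):
    """Build (but do not send) a request that sets a commit status."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    payload = {"state": state, "context": context, "description": description}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders, not a real repository or token.
    request = build_status_request(
        "example-org", "example-repo", "abc123", "success", "ci/build", "dummy-token"
    )
    print(request.get_method(), request.full_url)
```

A well-integrated CI tool does the equivalent of this for you, for every repo, without any per-repo setup.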
CI tools interact with source code management systems by definition, because the code they integrate lives in those systems. The most common task in a CI tool is probably checkout: checking out a repo, checking out another repo, checking out a specific branch, and so on. Sometimes the CI tool also needs to push something back to the repo; generating a new tag and pushing that git tag is the most common scenario. If it takes a lot of effort to configure your source code management system credentials for this checkout/push to work, the tool is too complicated.
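As an illustration of the tag-and-push scenario, here is a small Python sketch that works out the next version tag and lists the git commands a CI step would then run. The `vMAJOR.MINOR.PATCH` scheme, the `v0.1.0` starting point, and the function names are assumptions for this example; your project may version differently.

```python
import re
from typing import Iterable, List

TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_patch_tag(existing_tags: Iterable[str]) -> str:
    """Return the tag that follows the highest vMAJOR.MINOR.PATCH tag."""
    versions = []
    for tag in existing_tags:
        match = TAG_PATTERN.match(tag)
        if match:  # ignore tags that do not follow the assumed scheme
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        return "v0.1.0"  # assumed starting point when no release tag exists
    major, minor, patch = max(versions)
    return f"v{major}.{minor}.{patch + 1}"


def tag_and_push_commands(tag: str, remote: str = "origin") -> List[List[str]]:
    """Git commands a CI step would run to create and publish the tag."""
    return [
        ["git", "tag", tag],
        ["git", "push", remote, tag],
    ]


if __name__ == "__main__":
    new_tag = next_patch_tag(["v1.2.3", "v1.10.0", "nightly"])
    for command in tag_and_push_commands(new_tag):
        print(" ".join(command))
```

The push at the end is the part that needs write credentials, which is why credential setup matters so much for ease of integration.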
Ease of Maintenance
Normally you don't create just one pipeline for your code. You create many. For example, one pipeline runs only certain tests for a pull request. Another runs only on the master branch and runs more tests, so that the master branch's status is always green and it is guaranteed to be usable no matter when you check it out. You might create yet another pipeline that builds only in a certain environment and runs certain tests there, like end-to-end tests or performance tests. You get the idea: you need, and you will have, multiple pipelines.
If all the pipeline definitions must be stored in one configuration file, this file will grow to hundreds of lines. And because each pipeline needs its own criteria, like where and when to run, you will have a lot of if/else in your pipeline definitions. In the end, this single configuration file becomes a monolith: very long, hard to read or maintain, and hard to understand because of the complicated logic.
Splitting different pipelines into separate files is crucial for big, complicated projects. Otherwise, the CI configuration itself will give you a headache.
From this standpoint, neither Travis CI nor Circle CI supports splitting pipelines into separate files yet. Jenkins and GitHub Actions do.
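To show why one file per pipeline stays readable, here is a minimal Python sketch that loads each pipeline from its own file and lets every file declare its own trigger, so no shared if/else is needed. I use JSON only to keep the sketch within the standard library; real CI tools typically use YAML. The file names, the "on" key, and the step names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_pipelines(directory: Path) -> Dict[str, dict]:
    """Load one pipeline definition per *.json file, keyed by file name stem."""
    pipelines = {}
    for path in sorted(directory.glob("*.json")):
        pipelines[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return pipelines


def pipelines_for(pipelines: Dict[str, dict], event: str) -> List[str]:
    """Names of the pipelines whose own "on" list contains the event."""
    return [name for name, spec in pipelines.items() if event in spec.get("on", [])]


if __name__ == "__main__":
    # Write two small, independent pipeline files into a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pull-request.json").write_text(
            json.dumps({"on": ["pull_request"], "steps": ["unit-tests"]}),
            encoding="utf-8",
        )
        (root / "master.json").write_text(
            json.dumps({"on": ["push"], "steps": ["unit-tests", "e2e-tests"]}),
            encoding="utf-8",
        )
        loaded = load_pipelines(root)
        print(pipelines_for(loaded, "pull_request"))
```

Adding another pipeline then means adding another small file, not another branch in a growing monolith.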
There are probably more than 20 popular CI tools on the market, and personally, I have tried and investigated quite a few of them. But there is no need to try everything before deciding which one to use.
A general rule of thumb is, stick with popular choices. When you are using a tool, you will have some moments when you want to use it to do something, but you don’t know exactly how. In moments like this, a large user base helps a lot because the chances are, you are not the first person who doesn’t know how to do that. So with a simple search, you can find some existing help and solutions from the community. Community support is crucial to keep your productivity high.
As of 2021, I have already switched to GitHub Actions, and I'm happy about it. See some of my examples here. Give it a try too!