DevOps (a clipped compound of “development” and “operations”) is a software development methodology that combines software development (Dev) with information technology operations (Ops). The goal of DevOps is to shorten the systems development life cycle while also delivering features, fixes, and updates frequently in close alignment with business objectives.
One way to achieve this is by implementing a CI/CD pipeline, so let me take you through my first interaction with a real job in the software engineering industry, as part of a placement year (also known as ‘had enough uni, need a break’). Coming from two years of studying Computer Science, these words didn’t really mean anything to me at the time; however, I was soon going to find out more than I expected about DevOps, infrastructure pipelines and all of these fancy words.
Maybe it’s worth mentioning that I did not have any preference at all about what to work on; I was really there to try out anything and everything I could. So, not long after I joined my team for a new project, another placement student and I were asked if we wanted to work on the infrastructure side (whatever that is), as the team needed more help, having only one WebOps engineer. Having made it very clear that I had zero knowledge of this, I started this journey. :)
All of our infrastructure work is done using AWS EC2 instances and subnets, Terraform and Puppet, and to integrate all of this and create our pipeline we, of course, used Jenkins.
Why do we need a CI/CD pipeline?
Now, why do we need a pipeline and all of these tools in the first place? Imagine you’re a developer: you write your code and, because you’re most probably using a version control system such as git, you push it to some branch. What happens next? How does your code get tested and integrated with the rest of the team’s code, so that everyone can be confident that it won’t break anything? Well, it’s not elves that take everyone’s code, package it nicely with a ribbon on top and make it ready for the whole world to use. Instead, this is what people call Continuous Integration and Continuous Delivery, i.e. a CI/CD pipeline, and it is done by DevOps/WebOps people. :)
But where are all of these tests being done, and how do people make their software available to the world? This is where infrastructure environments come into play, and the most used and known environments are Dev, Int, Preprod and Prod.
These environments are nothing more than a bunch of AWS EC2 instances that are created using Terraform and then configured using Puppet, i.e. your application’s dependencies and code get installed on them; this is also known as deploying your code.
Dev comes from development, and this is actually the developer’s laptop, so within a team you’ll usually have multiple development environments where developers write their code and run tests locally, such as linting and unit tests. Then they merge their code to master. However, as I said before, we need to be confident that this won’t break the existing master code, which is why we have the Int environment, i.e. integration. This is where we run tests on the feature branch, such as integration, acceptance and security tests, or anything else your team needs to do before merging to master.
Next on the list is Preprod, the preproduction environment. As the name suggests, this is an additional step before going live with your code, and it usually contains a point-in-time snapshot of the master branch (done via git tagging/versioning). Here you can run performance tests or even rerun some of your previous tests to make sure that the new code doesn’t cause any issues. Lastly, we have the production environment, which is the actual service open to the public.
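To make the flow through the environments a bit more concrete, here is a toy Python model of a change being promoted from one environment to the next. It is only a sketch: the check names attached to each environment and the `promote` function are made up for illustration, and in the real setup the environments are EC2 instances and the checks are run by Jenkins rather than by a Python loop.

```python
# Toy model: a change moves dev -> int -> preprod -> prod and stops at the
# first environment whose checks fail. All names here are illustrative.
ENVIRONMENTS = [
    ("dev", ["lint", "unit"]),
    ("int", ["integration", "acceptance", "security"]),
    ("preprod", ["performance"]),
    ("prod", []),  # live service: nothing left to gate on in this sketch
]


def promote(failing_checks=frozenset()):
    """Return the environments a change reached, in order.

    `failing_checks` is the set of check names that fail for this change.
    """
    reached = []
    for env, checks in ENVIRONMENTS:
        reached.append(env)
        if any(check in failing_checks for check in checks):
            break  # the change is not promoted any further
    return reached


print(promote())              # a healthy change goes all the way to prod
print(promote({"security"}))  # a failing security test stops it at int
```

The point of the model is simply that each environment acts as a gate: a change only reaches production if nothing failed on the way there.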
Here is another nice visual representation, where QA stands for quality assurance, and this relates to the Int environment, whereas UAT stands for User Acceptance Tests.
So far, everything seems fine: we’ve got different environments to perform different tests and we’ve managed to get the code from the developer’s machine into production. But what happens if we discover a bug in production and we need to fix it? This means we need to revert to a previous stable version of the code while developers work on fixing the bug, after which the whole process starts again, going through each environment. Thus, we now have a cycle, the CI/CD cycle :)
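The revert-then-fix cycle can be sketched in a few lines of Python. This is an illustration under assumptions, not how my team tracked releases: the `ReleaseHistory` class and the version tags are hypothetical, standing in for whatever git tags and deployment records you actually keep.

```python
# Toy release history for one environment. Version tags are made up.
class ReleaseHistory:
    def __init__(self):
        self._deployed = []  # oldest first; the last entry is live

    def deploy(self, version):
        self._deployed.append(version)

    @property
    def live(self):
        return self._deployed[-1] if self._deployed else None

    def rollback(self):
        """Drop the live version and go back to the one before it."""
        if len(self._deployed) < 2:
            raise RuntimeError("no earlier version to roll back to")
        return self._deployed.pop()  # the version that was taken out


prod = ReleaseHistory()
prod.deploy("v1.0.0")
prod.deploy("v1.1.0")        # this release turns out to contain a bug
removed = prod.rollback()    # production serves v1.0.0 again
live_after_rollback = prod.live
prod.deploy("v1.1.1")        # the fix arrives after passing every environment
print(removed, live_after_rollback, prod.live)
```

Keeping every released version around (here as a list, in practice as git tags) is what makes the rollback step cheap.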
Now that you know more about the different environments, you should know that CI officially refers to the development practice in which developers share a repository where they constantly merge to one master branch, integrating their code to receive feedback and detect bugs as early as possible. CD, on the other hand, refers to code changes being picked up and released to production. The aim here is to automate the process of building, testing and releasing the code as often as possible, with minimal manual intervention. How can we achieve this? Well, when it comes to infrastructure, you need something efficient and verifiable that lets you backtrack and find the source of your errors, but at the same time integrates nicely with all of your other tools. The tool my team used for all of this, and one of the most widely used at the moment, is Jenkins. Jenkins has a plugin for pretty much anything you want to do: you just need to tell it and Jenkins will do it.
And if you’re still wondering why we should automate this process, imagine you don’t use Jenkins. Then, you’ll need to:
- Manually ssh into the EC2 instance of an environment;
- Run the puppet command or whatever tool you’re using for configuration management;
- Run all of the tests (e.g. unit, integration, acceptance, security etc.) manually for each repository.
This can easily be done for one server, but imagine having to repeat this process for multiple servers. It is a pain even if things are going well and there are no errors, and that’s usually not the case. So this can get very messy even for a small project, as you’ll need to figure out what is causing the issue. Long story short, everything about this is wrong, so why not make use of Jenkins and automate all of these processes, so that you can easily monitor each step, receive feedback quickly and debug more easily?
My first thoughts on Jenkins were pretty much “oh, this is just a UI-based tool used to run tests”, and I couldn’t have been more wrong. First of all, I’ll explain a bit about how Jenkins works, showing why my initial view of Jenkins was wrong, followed by some tips and lessons I’ve learned while using it. :)
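To get a feel for how quickly the manual approach grows, the sketch below just prints the commands a person would have to type; it does not connect to anything. The host names, the `deploy` user and the `run_tests.sh` script are hypothetical, and `puppet agent --test` is only one way of triggering a Puppet run, so treat every value as a placeholder.

```python
import shlex

# Hypothetical hosts and commands, for illustration only.
HOSTS = ["int-web-1", "int-web-2", "preprod-web-1"]
STEPS = [
    ["sudo", "puppet", "agent", "--test"],  # apply the configuration
    ["./run_tests.sh"],                      # placeholder test script
]


def manual_plan(hosts, steps, user="deploy"):
    """List every command a person would have to type by hand."""
    return [
        shlex.join(["ssh", f"{user}@{host}", shlex.join(step)])
        for host in hosts
        for step in steps
    ]


plan = manual_plan(HOSTS, STEPS)
for command in plan:
    print(command)
print(len(plan), "commands to run and watch by hand")
```

Three servers and two steps already mean six commands, each of which can fail in its own way, and that is before adding more repositories or test suites.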
You can have a look at the official Jenkins website for installation and documentation. There are also loads of good tutorials on the internet to get started, so I won’t bother you with those (this is the one I used and you can get it for free as part of the 10-day trial). What is probably good to know is that all of the Jenkins configuration lives in the Jenkins home directory (~/.jenkins in my case); if you delete it, the next time you open Jenkins at localhost:8080 it will go through the whole installation process again.
So, using the Jenkins UI, you can configure your job and give Jenkins instructions to run every time you build that job. Also, the more plugins you install via the Manage Jenkins option, the more tools and features you’ll be able to integrate, depending on the needs of your project.
However, instead of using only the UI to provide instructions, there is the concept of a Jenkinsfile, which lives alongside your application code in the repository and is written in Groovy syntax. Thus, you can have version control for your infrastructure code! :) Now, where does Jenkins get this source code and where does it go? This is a great example of a plugin which should probably be the first one you install, as you can’t do much without it: the git plugin. Once it is enabled, you can configure your job with a git repository URL, which Jenkins will clone inside its workspace (usually found under the Jenkins home directory, e.g. /var/lib/jenkins/workspace on a typical Linux package install). This is where Jenkins can compile the code, install packages, run tests and further deploy it to another environment. However, you should know that Jenkins can be set up to wipe the workspace every time you build the job, as we want each build to start with a clean, fresh workspace.
By now you should have a clear understanding of the difference between a job and a build; if not, here it is:
- A job is a project configured by .xml files created automatically by Jenkins.
- A build is the process of running a job.
- (because life is full of surprises) An artifact is the result of a job execution. Since Jenkins wipes the workspace every time, we want to save the code in an artifact. :)
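The relationship between a build, the workspace and an artifact can be imitated with a couple of temporary directories. This is a sketch under assumptions, not Jenkins code: the `run_build` function, the directory layout and the `app.tar.gz` file name are invented for the example. In a Jenkinsfile, saving the result is typically what the `archiveArtifacts` step is for.

```python
import shutil
import tempfile
from pathlib import Path


def run_build(workspace: Path, archive: Path, build_number: int) -> Path:
    """One build: wipe the workspace, produce a file, archive it."""
    shutil.rmtree(workspace, ignore_errors=True)  # start from a clean workspace
    workspace.mkdir(parents=True)

    # Stand-in for "compile and package the code".
    package = workspace / "app.tar.gz"
    package.write_text(f"output of build #{build_number}\n")

    # Copy the result out of the workspace so the next wipe cannot delete it.
    target = archive / str(build_number) / package.name
    target.parent.mkdir(parents=True)
    shutil.copy2(package, target)
    return target


root = Path(tempfile.mkdtemp())
workspace, archive = root / "workspace", root / "archive"
for number in (1, 2):  # two builds of the same job
    run_build(workspace, archive, number)

print(sorted(p.name for p in workspace.iterdir()))  # files of the latest build
print(sorted(p.name for p in archive.iterdir()))    # one folder per build
```

After the second build the workspace only holds the latest output, while the archive still has the result of every build, which is exactly why artifacts exist.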
In order to run jobs, Jenkins uses executors. If you look more closely at the pipeline job and the two ways of writing a Jenkinsfile for it, declarative and scripted, you’ll see there are different ways of allocating a node, i.e. an executor, for your pipeline job. The pipeline plugin also comes with nice features such as parallelism, stashing, locking, etc. There are two types of executors, heavyweight and flyweight, the latter being used only on the Jenkins master. Of course, this only matters if you’re using a master-slave Jenkins infrastructure (newer Jenkins documentation calls these the controller and agents).
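If it helps, you can think of an executor as a slot in which one build runs at a time. The Python thread pool below is only an analogy for that idea, not a description of how Jenkins schedules work, and it does not model the heavyweight/flyweight distinction. The job names and the number of executors are made up.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

NUM_EXECUTORS = 2                   # build slots available on the node
stats = {"running": 0, "peak": 0}   # how many builds run at the same time
lock = threading.Lock()


def run_queued_build(job_name):
    """Pretend to build a job while occupying one executor slot."""
    with lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.05)                # stand-in for the real work
    with lock:
        stats["running"] -= 1
    return f"{job_name}: finished"


queue = ["lint", "unit-tests", "integration-tests", "deploy-int"]
with ThreadPoolExecutor(max_workers=NUM_EXECUTORS) as executors:
    results = list(executors.map(run_queued_build, queue))

print(results)
print("most builds running at once:", stats["peak"])
```

With two slots and four queued jobs, at most two builds are ever in progress together and the others wait their turn, which is the behaviour you see in the Jenkins build queue.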
Jenkins Job Builder
Remember how I said that Jenkins knows how to run jobs by using .xml files created automatically from the configuration you set up in the UI? Well, we can move even further away from the UI and automate this process as well, by using Jenkins Job Builder (JJB) YAML files to describe our jobs; these are automatically turned into the .xml files that Jenkins needs. JJB is written in Python, so you can just pip install it and you’re good to go! To help you start, I’d say this tutorial is pretty good. I didn’t have the chance to do anything too complex with JJB, as we still wanted to use the Jenkinsfile, but the combination of these two gives you great advantages:
- Version control everything;
- Automate as much as possible;
- Templates for jobs that are almost identical, which comes in very handy in an organisation with several projects each using their own environments.
Maybe you can tell by now that, in my project, I was tasked with creating the Jenkins pipeline, and I was surprised by how many new things I learned in not even a month. One thing that I enjoyed a lot is the fact that being a DevOps person implies knowing and understanding everything that’s going on with the code, from the way it’s developed and tested (and there’s a ton of tests), to deploying and going live, as well as maintaining and monitoring.
This was a very quick introduction to DevOps, and there are many other useful features and topics to talk about; however, I will stop here for now and leave you with some tips and a picture to show how nice Jenkins is. :)
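The template idea from the last bullet can be illustrated in plain Python: one job description is written once and then rendered for every environment. This is a sketch of the concept only. The job name pattern, the description text and the `deploy.sh` script are hypothetical, and the output is just a string that looks like a minimal JJB job definition.

```python
from string import Template

# One job description, parameterised by environment. The job name pattern and
# the deploy script are hypothetical.
JOB_TEMPLATE = Template("""\
- job:
    name: myapp-deploy-$env
    description: Deploy myapp to the $env environment
    builders:
      - shell: ./deploy.sh $env
""")


def render_jobs(environments):
    """Render one job definition per environment from the shared template."""
    return "".join(JOB_TEMPLATE.substitute(env=env) for env in environments)


rendered = render_jobs(["int", "preprod", "prod"])
print(rendered)
```

JJB itself offers `job-template` and `project` definitions for this kind of expansion, so in practice you would not need a Python wrapper; the sketch is only meant to show why almost identical jobs are cheap to maintain once they come from a single template.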
- Start with basic skeletons for both JJB and the Jenkinsfile, and add more functionality and stages to your pipeline as you go.
- Make use of other Groovy files to define functions which are then called in the Jenkinsfile.
- Make sure you’re not mixing the declarative and scripted syntax of the Jenkinsfile pipeline.
- You can find a syntax snippet generator by going to the job configuration under the pipeline tab.
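As a starting point for the first tip, here is a small Python helper that prints a bare-bones declarative Jenkinsfile from a list of stages. It is a sketch under assumptions: the stage names and the `build.sh` / `run_tests.sh` scripts are placeholders, and the helper does not escape quotes in its inputs, so it is only meant for generating a first skeleton that you then edit by hand.

```python
def jenkinsfile_skeleton(stages):
    """Render a declarative Jenkinsfile with one stage per (name, command)."""
    lines = ["pipeline {", "    agent any", "    stages {"]
    for name, command in stages:
        lines += [
            f"        stage('{name}') {{",
            "            steps {",
            f"                sh '{command}'",
            "            }",
            "        }",
        ]
    lines += ["    }", "}"]
    return "\n".join(lines) + "\n"


# Start small, then append stages as the pipeline grows.
stages = [("Build", "./build.sh"), ("Test", "./run_tests.sh")]
skeleton = jenkinsfile_skeleton(stages)
print(skeleton)
```

Everything in the generated file sits inside a single `pipeline { ... }` block, which is the declarative style; sticking to it from the start makes it easier to follow the tip about not mixing declarative and scripted syntax.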