Enabling an effective CI/CD pipeline for micro-services

Tran Quoc Viet · Published in Altitude · Feb 2, 2022
CI/CD flow

Altitude is a platform that simplifies the process of building smarter accommodation; our slogan is "Making things simple". Well, to be honest, I always think it should be "Making things simple for our customers", because the technology behind our system is anything but simple. Altitude offers a wide range of smart apps and cloud-based platforms, from plug-and-play to an all-encompassing guest platform. So yeah, the slogan probably should include our development platform.

Joking aside, the system which fuels Altitude is built on a micro-service architecture, and at present the total number of services is in the hundreds, with the counter showing no sign of stopping anytime soon.

A snapshot of an early micro-services map. Each connection is service-to-service communication, or a link via async messaging such as Kafka or a message queue. Each node is clickable, opening up service details, versions, releases and more.

As you might already know, a system such as Altitude needs multiple stages (production, development, user testing, demo, etc.), and we are no exception. We actually have more stages than most, as we need to make absolutely sure our customers never experience technical issues or downtime that could impact their business. But this leads to one issue: money.

Cloud-based systems are amazing, I wholeheartedly agree, but not so much when you get the bill at the end of each month. Of course we spare no expense when it comes to any environment that affects our customers, such as PROD, PRE-PROD,… But what about development? The development environment is only used by our developers, so we don't really need the high availability and 100% uptime that our PROD environment has. The most important thing is a CI/CD model that provides a seamless deployment process. Is there any way we can keep it as cheap as possible while still achieving the CI/CD standard?

The answer is yes, and how we achieved it is what this article is about. I will show you how to automate the deployment process of services in a development environment while keeping the cost as low as possible.

The goals

To make it easier to follow, let us go over the results which we wish to achieve for our development environment:

  • Automatic deployments for each commit — developers across our environments work closely, so each deploy needs to be considered across apps and web
  • Alerts to the team — let them know something has happened: a new deploy and what changes it contains
  • Simple to set up — single configuration and metadata files
  • Automatic testing triggers — trigger tests, from unit tests to Postman collections and more, to ensure we're still operational
  • Security and dependency reviews — ensure we're not releasing anything with known vulnerabilities
  • Data-driven analytics — metadata files which list dependent services and environment variables, and power our custom service map and service management service
  • Save on cloud costs as much as possible

Let's dive in.

Hundreds of micro-services, one poor operator

Sounds pretty bad, right? But only if you try to do everything manually. Even if we want to save some money, that is never an option, as in the long run it creates far more problems than it saves money.

But for this part, to make a point of how much work is required to deploy a single micro-service, let us assume that our Operator, for some reason, wants to manually deploy a service in our Development stage.

So Anthony — Altitude's most promising developer — has finished his new service, called the "SMS Sending" service. Of course, good developer that he is, the service is stateless, Docker-ready, and accompanied by unit test scripts. Now, with the service ready to be deployed and the Operator wanting to deploy it manually to the Development environment, here is what he will do:

AltitudeHQ Deployment Process Diagram

First, pull the source code of "SMS Sending" to his work machine.

Second, make sure the service is working as intended:

  • Deploy the service inside Docker.
  • Run the unit tests. All tests must pass.

Third, check source code quality:

  • We make use of SonarQube to rate our source code, and unless the result reaches a certain threshold, it is time for Anthony to check his code again.
  • Check any dependencies for known vulnerabilities.

After passing all the steps above, it is time to deploy the "SMS Sending" service:

  • Build a Docker image for the service (let's call it "SMS-Sending-Service"), then push it to a Docker registry.
  • Access the server, pull the "SMS-Sending-Service" image, then start a container from that image and name it "SMS-Sending-Service-Container".
  • Configure the reverse proxy or load balancer to work with "SMS-Sending-Service-Container". In the Development environment, we use the ever-popular Nginx.

As the service is deployed and ready to be used, there are a few more tasks he needs to do in our CI/CD flow:

  • Inform the developers via Slack that their service is ready
  • Note down the technical details of the service in our custom services management service.

And with that, our Operator has successfully deployed the service to the Development environment. What a joyful moment, except there are 99 or more services to go. Time to automate this process.

Altitude’s CI/CD solution for our Development Environment

Why do I keep emphasizing "Development environment"? Because, as I already mentioned above, we have multiple stages, and the Development stage is the only one different from the rest.

The Production environments are much more polished: they make use of the latest AWS Cloud technologies, with top-quality architecture, security, high availability (100% uptime) and responsiveness. The Development environment, on the other hand, is only used by our development team, so we keep the infrastructure as simple as possible for easy debugging, and to save money.

But just because we want to save money doesn't mean we trade away automation for some money back; far from it. What we did is save money by using free solutions instead of AWS Cloud services, which, as good as they are, cost a lot. The Development environment still has everything an automated development process should have, and it takes care of everything after a developer commits their code.

So what solution(s) did we pick? Well, there are lots of options, but the one that covers most of our needs, and best of all is open source and completely free, is Jenkins.

The best butler

Before I go into the details of how Jenkins helps us achieve our goal, let's go over a few terms we will encounter in the next sections:

  • Bitbucket: Git-based source code repository hosting service. This is where we store our source code.
  • AWS: Our system runs on AWS Cloud.
  • AWS ECR: aka Amazon Elastic Container Registry. This is where we store our docker images.
  • Jenkins: free and open source automation server. We will deploy Jenkins as our automation control center.
  • Jenkinsfile: Jenkins task script written in Groovy.
  • AWS EC2: Amazon Elastic Compute Cloud. In the development environment, our micro-services are deployed on Ubuntu-based EC2 instances.
  • NGINX: a web server that can also be used as a reverse proxy or load balancer. We use Nginx instead of an AWS load balancer in development because it is free.
  • Ansible Playbook: a blueprint of automation tasks — complex IT actions executed with limited or no human involvement. Most of our manual tasks are converted to Ansible scripts and triggered by Jenkins.
  • Services management service: Altitude's number of services continues to grow, and with more than a hundred services we need an effective way to manage the information about each service: its technology, documents, tests, API schemas and the relations between services. At first we tried Google Docs, as well as some map-drawing tools, but to no avail. So we decided to build our own solution, and this service is the result. The services management service is integrated into our CI/CD flow: it automatically pulls service information and relations from the metadata included in each service, displays the data in a human-friendly UI and provides various relation maps.

Continuous Integration with Jenkins

Simply put, Jenkins allows us to automate all the manual tasks mentioned above, except for the part where the developer pushes their commit to the repository.

With Jenkins acting as our main automation center, our CI/CD model will look like this:

CI/CD Model — High level view

In reality there can be a lot more steps in our pipeline, but for the sake of keeping this article simple, I left them out and only kept SonarQube and the services management service as examples.

Now let us go over each section.

Repository: where everything starts

This is the only step which requires human interaction in our CI/CD model. Whenever a developer pushes a commit to a certain branch, the whole CI/CD flow starts.

  • The developer pushes a commit to the build branch of the service repository. Let's call it dev-deploy.
  • Bitbucket sends a request to the webhook set in the service repository configuration.
  • The Jenkins pipeline job catches the webhook request and checks whether it matches the branch in the job configuration. If it does, Jenkins pulls the source code into its local storage, then triggers the Jenkinsfile to perform all the remaining steps. A minimal skeleton of such a Jenkinsfile is sketched below.
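
To make the flow concrete, here is a minimal sketch of what such a Jenkinsfile could look like. The stage names mirror the flow in this article; the dev-deploy branch check matches the job configuration above, and everything else is hypothetical and depends on your own Jenkins setup.

```groovy
// Jenkinsfile (declarative pipeline): a minimal skeleton of the flow described
// in this article. Each stage body is filled in throughout the next sections.
pipeline {
    agent any

    stages {
        stage('Checkout') {
            steps {
                // Triggered by the Bitbucket webhook; pulls the commit that started the build.
                checkout scm
            }
        }
        stage('Inspection') {
            steps {
                echo 'SonarQube scan + dependency vulnerability check'
            }
        }
        stage('Build & Test') {
            steps {
                echo 'Build the Docker image, run unit tests in a throwaway container'
            }
        }
        stage('Release') {
            when { branch 'dev-deploy' } // only the build branch deploys
            steps {
                echo 'Point Nginx at the new container, update metadata, notify Slack'
            }
        }
    }
}
```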

Inspection: (bad) code shall not pass

The inspection step verifies two things:

  • The source code quality: check for memory leaks, bad coding conventions, etc.
  • The dependent libraries: check for high and critical security vulnerabilities in all dependencies.

If the quality passes a certain threshold, the inspection is considered a success.

As mentioned above, we make use of SonarQube for this task. I won't go into the details of how we set up SonarQube, as that is out of this article's scope, but you can simply install it on an EC2 instance using Docker.

It is no surprise that Jenkins already has plugins to support SonarQube, given how popular it is. The details of setting up said plugin can be found here: SonarQube Jenkins Integration
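
For reference, the inspection stage of the Jenkinsfile could look roughly like the sketch below. It assumes the SonarQube Scanner plugin is installed, with a server named "sonarqube" and a scanner tool named "sonar-scanner" configured in Jenkins; those names, and the project key, are placeholders for whatever your instance uses.

```groovy
// Inspection stage: run the SonarQube scanner, then fail the build
// if the project does not pass the quality gate.
stage('Inspection') {
    steps {
        script {
            def scannerHome = tool 'sonar-scanner'  // scanner tool name in Jenkins
            withSonarQubeEnv('sonarqube') {         // server name in Jenkins config
                sh "${scannerHome}/bin/sonar-scanner -Dsonar.projectKey=sms-sending-service"
            }
            timeout(time: 10, unit: 'MINUTES') {
                // Waits for SonarQube's webhook to report the quality-gate result.
                def qg = waitForQualityGate()
                if (qg.status != 'OK') {
                    error "Quality gate failed: ${qg.status}"
                }
            }
        }
    }
}
```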

Fun story: the number of "code smells" SonarQube reports also earns an equivalent number of 💩 emojis in Slack!

Build: deploy and test

This step spins up a version of the service inside a Docker container (at Altitude, all services must be dockerized) and runs all the unit tests against it. Note that during testing, Jenkins keeps the previous version of the service running; the new version is deployed into a different container.

Once all the tests pass with a "success" status, the new version is released to the Development environment.
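
A build-and-test stage along these lines could look like the following sketch. The image name follows the "SMS-Sending-Service" example from earlier; the test command and the ECR_REPO environment variable are placeholders for your service's test entry point and registry address.

```groovy
// Build & Test stage: build the image, then run the unit tests in a
// throwaway container. The previous version of the service keeps
// running untouched while this stage executes.
stage('Build & Test') {
    steps {
        sh '''
            docker build -t sms-sending-service:${BUILD_NUMBER} .

            # Run the unit tests in a disposable container; a non-zero
            # exit code fails the stage and stops the pipeline here.
            # "npm test" stands in for whatever the service's test command is.
            docker run --rm sms-sending-service:${BUILD_NUMBER} npm test

            # Tests passed: tag the image and push it to ECR.
            docker tag sms-sending-service:${BUILD_NUMBER} ${ECR_REPO}/sms-sending-service:${BUILD_NUMBER}
            docker push ${ECR_REPO}/sms-sending-service:${BUILD_NUMBER}
        '''
    }
}
```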

Release

Continuing from the previous step, Jenkins points Nginx to the new version and tears down the previous one.
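
How exactly Nginx gets repointed depends on how its config is templated. One simple approach, sketched below, is to rewrite the upstream that the service's server block proxies to, then reload Nginx. In our flow these commands would run on the service host (for example via an Ansible playbook triggered by Jenkins); the ports, paths and names here are illustrative, not our exact setup.

```groovy
// Release stage: swap traffic to the new container, then remove the old one.
stage('Release') {
    steps {
        sh '''
            # Alternate between two ports so the new container can start
            # while the old one is still serving traffic (blue/green style).
            NEW_PORT=$((9000 + BUILD_NUMBER % 2))
            OLD_PORT=$((9001 - BUILD_NUMBER % 2))

            # Start the new version (memory-capped, as all our services are).
            docker run -d --name sms-sending-${BUILD_NUMBER} \
                --memory=128m -p ${NEW_PORT}:8080 \
                ${ECR_REPO}/sms-sending-service:${BUILD_NUMBER}

            # Repoint the Nginx upstream at the new port and reload gracefully.
            echo "upstream sms_sending { server 127.0.0.1:${NEW_PORT}; }" \
                > /etc/nginx/conf.d/sms-sending-upstream.conf
            nginx -s reload

            # Tear down whichever container is still bound to the old port.
            docker ps -q --filter publish=${OLD_PORT} | xargs -r docker rm -f
        '''
    }
}
```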

As mentioned in the previous section, each commit of a service is accompanied by a metadata file, written by our developers in JSON format. This file is included in the source code and provides extra information about the service, such as which technologies it uses, the list of dependent services, the list of 3rd-party services it requires, and so on. Jenkins picks up this metadata, parses it and calls our services management service to update the current service's record. This helps us keep track of all the services in our system and alert the required parties or systems about any major changes.
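
The exact schema of our metadata file is internal, but a hypothetical version, together with the Jenkinsfile step that forwards it, could look like this. The field names, file name and endpoint URL are made up for illustration; readJSON comes from the Pipeline Utility Steps plugin.

```groovy
// Parse the metadata file bundled with the source code and push it to the
// services management service. Hypothetical contents of service-metadata.json:
// {
//   "name": "sms-sending-service",
//   "technologies": ["node", "docker"],
//   "dependsOn": ["user-service", "notification-service"],
//   "thirdParty": ["twilio"]
// }
script {
    def meta = readJSON file: 'service-metadata.json' // Pipeline Utility Steps plugin
    echo "Registering ${meta.name}, depends on: ${meta.dependsOn.join(', ')}"
    sh 'curl -s -X POST -H "Content-Type: application/json" ' +
       '-d @service-metadata.json http://services-management.internal/api/services'
}
```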

After that, depending on your configuration, Jenkins informs the developer that the whole process has finished. There are multiple ways to do this, such as email or SMS, but our favorite method is to send a message to Slack.
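
With the Slack Notification plugin installed and a workspace configured in Jenkins, that notification can be as small as the snippet below; the channel name is a placeholder.

```groovy
// Post-build notification via the Slack Notification plugin.
post {
    success {
        slackSend channel: '#deployments',
                  color: 'good',
                  message: "SMS-Sending-Service build ${env.BUILD_NUMBER} deployed to Development"
    }
    failure {
        slackSend channel: '#deployments',
                  color: 'danger',
                  message: "SMS-Sending-Service build ${env.BUILD_NUMBER} failed: ${env.BUILD_URL}"
    }
}
```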

The chart below shows the full deployment flow.

Deployment process with Jenkins at its core

Verdict

Jenkins makes the whole deployment process seamless, and the Operator has a much easier time now, as each service can be deployed by the developer in charge of it. He/she only needs to work on a service the first time it is deployed, to write its Jenkinsfile configuration.

BONUS: Altitude's infrastructure for the development stage

In case you are wondering, here is an overview of the infrastructure for our development environment:

  • A VPC: Amazon Virtual Private Cloud.
  • A single Availability Zone: for other stages we recommend having two or more.
  • 2 subnets: 1 private, 1 public.
  • An EC2 instance hosting Nginx inside the public subnet: instead of using an AWS load balancer, which costs money, we use Nginx (free) to act as the load balancer. Since an AWS micro instance is pretty cheap (even cheaper if you use a reserved instance), this saves you a bit of money. This instance could be accessed from anywhere since it is in the public subnet, but we only allow access from our office.
  • An EC2 instance hosting Jenkins: in our case we categorize Jenkins tasks into two groups:
    - Normal tasks: tasks which don't need much processing power, for example a task to push, pull and deploy a service.
    - Heavy tasks: tasks which require a lot of CPU or RAM, for example building our CMS web UI.
    Normally you would need a powerful instance type to handle both, but that isn't cost-effective, as heavy tasks are rarely run. So here is our solution (see the sketch after this list):
    - Use a t2.medium to host Jenkins; all normal tasks are performed here.
    - Use EC2 Spot instances to handle heavy tasks: this allows us to quickly boot up an instance to handle a heavy task, then turn it off when the task finishes.
  • Multiple EC2 instances to host our services: the number and size of each instance depend on how many services you have and how much RAM they use. Make sure to always limit the max RAM of each service's Docker container, then do the math to decide which EC2 instances you need. For example, 100 services, each using at most 128 MB of RAM, will require at least 12 GB of RAM, so a t2.large and a t2.medium will do the trick.
  • The Jenkins EC2 and the micro-service EC2s are placed in the private subnet: they can't be accessed from anywhere except inside the VPC. As you can see in the diagram, there is absolutely no reason to access them directly in the first place.
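
On the Jenkins side, the split between normal and heavy tasks maps naturally onto agent labels. A hypothetical sketch: stages default to the t2.medium node, while heavy stages request a label served by Spot-backed agents (for example via a plugin such as EC2 Fleet, which scales agents up and down on demand). The "spot-heavy" label is a placeholder.

```groovy
// Route heavy work to Spot-backed agents; everything else stays on the
// regular t2.medium Jenkins node.
pipeline {
    agent none

    stages {
        stage('Deploy service') {
            agent any // normal task: light build-and-deploy work
            steps {
                echo 'Runs on the t2.medium Jenkins node'
            }
        }
        stage('Build CMS web UI') {
            agent { label 'spot-heavy' } // heavy task: a Spot instance boots,
                                         // runs the job, then shuts down
            steps {
                echo 'CPU/RAM-hungry build work'
            }
        }
    }
}
```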

I hope you now have a good understanding of how our infrastructure works. If you have any questions or notice any part which needs correction, please leave a comment. Thank you for reading our article.
