Configuring .gitlab-ci.yml with AWS EC2 for Continuous Integration (CI) or Continuous Deployment (CD)

Jose Javi Asilis
HackerNoon.com
15 min read · Apr 23, 2018

This was originally a single long post, but it got so big that I had to split it into 2! This continues from post #2, Configure GitLab CI on AWS EC2 Using Docker.

Posts:

  1. [Tutorial — Guide] Installing GitLab, GitLab CI on AWS EC2 from Zero.
  2. Configure GitLab CI on AWS EC2 Using Docker
  3. Configuring .gitlab-ci.yml (This Post)
  4. Troubleshooting GitLab and GitLab CI

#1- Understanding the .gitlab-ci.yml file

The .gitlab-ci.yml file is a YAML file that you create in your project’s root. GitLab reads it automatically whenever you push a commit to the server. The push notifies the runner you specified in #3, which then processes the series of tasks you defined. So if you push 3 times, it’s going to run 3 times! That’s why, if you push often, you want either a faster runner or separate runners, each on its own machine.

Note: since we’re using Docker, the tasks always start from a clean state of the image. This means that all the files and modifications you create from within the .gitlab-ci.yml are discarded before the next run. You can avoid this by specifying caches.

The content of the file is composed of the keys that you can find in this page. There’s no order you need to follow, but be very careful about indentation: it can make or break your pipeline. You can check the file with an online YAML linter before pushing.

I’ll be working with an example NodeJS application, with Karma as the test runner. I’ve posted a .gitlab-ci.yml I used in one of my past projects in this GitHub Gist. (Please check it out!)

This page can give you another perspective.

Breaking it down:

The image key grabs an image from the Docker Hub and uses it as a base image. GitLab runs all the jobs on top of this image. If you’re doing a project in Ruby, Java, Go, PHP, etc., specify the matching image from the Docker Hub.

The cache key creates a temporary cache folder that prevents node_modules and .yarn from being recreated on each CI run (each time you make a commit).

The before_script tells GitLab to run whatever you’ve specified before anything else. You can consider this as a preparation script.
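As a rough sketch of how these three top-level keys can sit together (this is not the Gist mentioned above; the node:8 image tag and the before_script command are assumptions chosen for illustration):

```yaml
# Sketch of the top-level keys of a .gitlab-ci.yml for a Node project.
# The image tag and the before_script command are assumptions.

# Base image pulled from the Docker Hub; every job runs inside a container of it.
image: node:8

# Keep these folders between runs so dependencies are not downloaded again each time.
cache:
  paths:
    - node_modules/
    - .yarn

# Runs before the script of every job that does not define its own before_script.
before_script:
  - node --version
```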

#1.1 Understanding Stages

Stages are a series of steps that your code goes through in order to reach its final destination (Production). GitLab allows you to define any number of stages with any names. You do it by listing them under the stages key, in the order you want them to run.

Then GitLab runs each one of them, step by step. If one of them fails, it prevents the following ones from running.

With the stages listed as build, test, staging, openMr, and production, GitLab runs the build stage first and works its way up to production.
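A minimal sketch of such a list, using the stage names this post mentions in the Pipelines section (the names themselves are free-form):

```yaml
# Stages run top to bottom; a failure in one stops the ones after it.
stages:
  - build
  - test
  - staging
  - openMr
  - production
```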

#1.2 Defining the stages’ actions in the .gitlab-ci.yml file

You define what a stage is going to run by first specifying a parent key with a name of your choosing (GitLab calls this named block a job). This key can be named as you wish and can contain spaces.

For example:

The stage name is Build My App, and it specifies a key called stage that refers to the stage you created earlier, in the stages list.

A before_script inside a job works like the one we specified earlier, only in the context of that job, where it replaces the global one: nothing else in the job runs until those commands have finished.

In this case, we’re using yarn instead of npm, and it creates a cache folder that holds yarn’s cache, so it is not recreated on each run (each time you push to the repo).
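A hedged sketch of what such a job could look like. The job name Build My App comes from the text; the .yarn folder name and the "build" script in package.json are assumptions:

```yaml
# Hypothetical build job, not the author's original file.
Build My App:
  stage: build              # must match an entry in the stages list
  before_script:
    # Point yarn's cache at a folder inside the project so GitLab can cache it.
    - yarn config set cache-folder .yarn
    - yarn install
  script:
    # Assumes package.json defines a "build" script.
    - yarn run build
```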

#1.3 Tags

If you followed what I wrote about “tags” in the previous article (Point #3.1, from article 2), this is where we specify them! If the tags match the ones we specified on the runner, that runner picks up the job. You specify each tag on its own line. In the example, the node tag is commented out; if you remove the # before node, that specific job will only run on runners with the node tag. If you specify no tag (omit the tags key), any available runner can take the job, as long as it isn’t locked to a different project and is allowed to run untagged jobs.
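For illustration, a job restricted to runners tagged node might look like this (the script line is a placeholder assumption):

```yaml
Build My App:
  stage: build
  tags:
    - node        # only runners registered with the "node" tag pick this job up
  script:
    - yarn run build
```

Commenting the `- node` line out again means the whole tags key has to go with it, since an empty tags key is not a valid list.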

Note: the next few sections are only a quick overview of those portions of the file. I explain the why, the how, and the what of their contents in more detail further down.

#1.4 Testing environment

Again, it doesn’t matter how you name your stage. In this case, I just called it “Test” to test.

There’s a lot going on there. The Continuous Integration methodology relies on the tests that you run on your local machine; those same tests are then also run against the actual machine that you’re going to deploy to.

Since this is very specific to Node and JavaScript (what my project is made of), I need to prepare the field so they can run properly. In this case, I use Karma as the test runner to run all of my local tests. It requires a local web browser, in this case, Google Chrome.

Therefore, I need to install Google Chrome (remember that each time we push, everything starts from a clean state), and then run the tests.

If all the tests succeed, GitLab will proceed automatically to the next stage.
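A sketch of such a test job, under several assumptions: the base image is Debian-based on amd64 and ships wget, and package.json has a "test" script that starts Karma in single-run mode. None of these details come from the original file:

```yaml
Test:
  stage: test
  before_script:
    # Every run starts from a clean image, so Chrome is installed each time.
    - wget -q https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
    - apt-get update -qq
    - apt-get install -y ./google-chrome-stable_current_amd64.deb
    - yarn install
  script:
    # Assumes a "test" script in package.json that launches Karma once and exits.
    - yarn run test
```

Chrome started as root inside a container usually also needs the --no-sandbox flag, which is typically set through a custom launcher in the Karma configuration.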

#1.5 Opening a Merge Request

After our tests succeed, we want GitLab to automatically open a merge request that we can then merge into master.

#1.6 Staging and Production Environments

I will talk about the contents of this further down.

#2 Pipelines

When you push the git repo to GitLab with the .gitlab-ci.yml file in it, it will automatically trigger a pipeline. A pipeline is made up of the stages you defined in your .gitlab-ci.yml. In our case, we have build, test, staging, openMr, and production. Each of the marks that you see in the screenshot above represents one of the stages. A red cross represents a failed stage. A green checkmark means the stage passed. The diagonal bar means the stage was canceled.

You can see a console view that shows the progress of each stage by clicking on its icon, and then clicking on the pop-up:

This is the screen that it shows you after the build stage has been successfully completed.

#3- Integration to AWS. How to connect the GitLab instance to the EC2 instance with your project

One of the biggest challenges is to integrate the CI pipeline with your project. As far as I know, GitLab doesn’t offer a native way to do this. You could push your code to AWS CodeDeploy and then do the deployment from there.

There’s a fantastic step-by-step guide by Stack Overflow user autronix that walks you through that process:

I recommend the approach above over the one that I’m going to show you. Do the following if by any chance the approach from autronix is not working.

This integration involves using git: we pull the merged repository from GitLab and update it in place on our EC2 instance, then we execute the reload script from npm (we’re assuming the project uses Node) and release the changes. You may see by now that this looks more like a hack than an actual solution. It may not work in highly distributed environments where you need to replicate the codebase across multiple EC2 instances. But again, it’s about having options, right?

#4 — Prepare the EC2 machine that hosts the deployed code.

This approach is inspired by this post.

We treat the EC2 machine, the one that hosts the code in production, as a GitLab client. We create an SSH key that connects to GitLab, and pull the code from there.

If you remember, from the first tutorial:

If you had problems with adding the key, try executing this first (Source) :

One thing about this approach is that I haven’t found a way to make it work with a passphrase, so when it asks you about it, leave it blank!

When you create the key, it’s located under:

Note: this time you won’t be able to copy its content to the clipboard unless you install “clip”

Note: This will eat 300+ MB of disk space. Only do it if you’re not constrained by disk space.

Another option is just to execute cat, and copy the output from the command.

To limit potential security risks, I recommend creating a separate user in GitLab that only pulls from the repo, and nothing else. You attach the public key to that account.

I created a ghost user in GitLab that handles the pull from GitLab.

Go to /admin on your GitLab address. (You can also click the tool icon at the navbar)

Create a new user:

Uncheck “can create group”. Access level “Regular”, External “checked”.

Go to the project you have your repo with:

Search for the new member you created and set it as a Reporter.

Navigate to the Users tab in the Admin area, and click on the name of the recently created user:

Click “Impersonate”, and go to the ssh keys page.

#5- The Staging environment, configuration:

This is where I start explaining what the heck I put above.

#5.1- Communicating with the EC2 instance.

We need a way to communicate with the EC2 instance. We do this by grabbing the private key (careful, this is sensitive information!) of the id_rsa pair we generated inside our EC2 instance and making it available to our predefined shell script (which I’ll talk more about in a moment).

The code in the before_script creates a file called id_rsa (which matches the convention for the private key) and populates it with a custom variable (more on that now) each time the pipeline runs.

GitLab CI allows you to store variables in the project’s settings:

Go to Your Project -> Settings -> CI/CD -> Secret Variables

We’re going to grab the content of id_rsa (the private key, the one without .pub) and copy and paste it there.

We do the same procedure as we did with the public file (Note this is the extension-less id_rsa):

Or, if you didn’t install clip, copy and paste it from the console:

We’re going to paste that value into the “Secret Variables” form and give it the key “SSH_PRIVATE_KEY” (this matches the name used in .gitlab-ci.yml, and you can see it in the image above).

Once you have it, click “Save variables”.
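A sketch of how a job’s before_script can rebuild the key file from that variable. The job name and the ~/.ssh location are assumptions, and the image is assumed to ship an SSH client:

```yaml
Deploy to Staging:
  stage: staging
  before_script:
    # Recreate the private key from the SSH_PRIVATE_KEY secret variable on every run.
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY" > ~/.ssh/id_rsa
    # ssh refuses private keys that other users can read.
    - chmod 600 ~/.ssh/id_rsa
  script:
    - bash ./gitlab-deploy/.gitlab-deploy.staging.sh
```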

#5.2- Create a Shell Script for the Staging environment

We still need to tell GitLab to run the git pull on our EC2 instance.

For separation of concerns and maintainability, we can specify an external shell file that will execute the pull from the master branch. We call this file .gitlab-deploy.staging.sh. You can call this file anything you want. Just remember to specify it in the .gitlab-ci.yml file.

This is how I have my project structured.

.gitlab-ci.yml is in the root. While the shell files are under a folder called gitlab-deploy. Therefore we reference them as ./gitlab-deploy/.gitlab-deploy.staging.sh.

The content of the file is as follows:

As you can see, what we’re doing here is executing a git pull and installing the packages on the staging server.

I was cheap, and I was running both production and testing on the same server (I exposed different ports). I recommend having different machines for that.

The $DEPLOY_SERVER variable is another custom variable we created in the secret variables page with the IPv4 address of our EC2 instance:

Go to Your Project -> Settings -> CI/CD -> Secret Variables

The environment key, which specifies name and url, is “just for show”. It adds a button to the stage in the console that points to the URL you specify there. This is optional and can be omitted.
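Putting the staging pieces together, here is a sketch with the deploy steps inlined into the job instead of living in the separate shell file. The ubuntu user, the /var/www/my-app path, and the URL are placeholders, not values from the original project:

```yaml
Deploy to Staging:
  stage: staging
  script:
    # Log in to the EC2 instance, update the code in place, and install packages.
    # "ubuntu" and /var/www/my-app are hypothetical; use your own user and path.
    - ssh -o StrictHostKeyChecking=no ubuntu@"$DEPLOY_SERVER" "cd /var/www/my-app && git pull origin master && yarn install"
  environment:
    name: staging
    url: http://staging.example.com   # placeholder URL shown as a button in GitLab
```

Disabling host key checking keeps the job from stalling on the first-connection prompt, at the cost of not verifying the server’s identity.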

#6 Automatically Open Merge Requests

GitLab won’t automatically open merge requests. That’s why we have to do some work ourselves to get it working.

This is done through a Docker image by tmaier, published on GitHub.

This is the auto-merge-request.sh file

For this to work, we need a PRIVATE_TOKEN. The script uses it to call the GitLab API, so it has to be a token that GitLab itself issues, typically a personal access token with the api scope created from your user settings; a random string from a password generator won’t authenticate.

Put the contents inside the “Secret Variables” as PRIVATE_TOKEN

Go to Your Project -> Settings -> CI/CD -> Secret Variables
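A sketch of the job that uses that image. The image name and the merge-request.sh command follow the image’s README and may differ between versions, so check them against the one you pull; the job name is an assumption:

```yaml
Open Merge Request:
  stage: openMr
  image: tmaier/gitlab-auto-merge-request
  before_script: []        # skip any global before_script; no setup is needed here
  variables:
    GIT_STRATEGY: none     # no clone of the repository is needed to open a merge request
  script:
    # Reads PRIVATE_TOKEN from the secret variables to call the GitLab API.
    - merge-request.sh
```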

#7 Deploy to Production

It’s very similar to the staging process, with one difference. The main difference between Continuous Integration (CI) and Continuous Deployment (CD) is that with the latter, any change you make to the code automatically gets pushed to production.

In GitLab we can specify if we manually deploy it to production or not by specifying a “when” key.

As you can see, by specifying when: manual we tell GitLab not to push the code automatically to production, and to wait for our command instead.
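A minimal sketch of such a job. The job name and script file come from this post; the gitlab-deploy folder mirrors the staging layout, and the URL is a placeholder:

```yaml
Deploy to Production:
  stage: production
  script:
    - bash ./gitlab-deploy/.gitlab-deploy.prod.sh
  environment:
    name: production
    url: http://example.com   # placeholder URL
  when: manual                # wait for someone to trigger the job from the pipelines page
```

Dropping the when key makes the job run automatically once the earlier stages pass, which is the Continuous Deployment behavior.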

On the pipelines page, you click the play button labeled “Deploy to Production”, which is the name you specified in the .gitlab-ci.yml

Last but not least, check the .gitlab-deploy.prod.sh

If you look closely, you’ll see that it’s similar (not to say identical) to the staging one, except that I point it to a different location within my EC2 instance, where the production code lives. You are free to modify this file as well.

#8 The CI/CD pipeline has been configured! Time to push!

Yes!! It’s finally that time!

Commit your file and push to your GitLab instance!

See how your changes start to happen!

That’s it!

WOAH! Thanks for the ride.
