Gitlab and AWS can complement each other. (Image source: Gitlab.com)

Using Gitlab and deploying to AWS

Tech@ProSiebenSat.1

by Philipp M

TL;DR:
Want to deploy to AWS and use Gitlab for your code in an easy and secure way? Read on!

The code to this post is located at https://gitlab.com/PSDD-GitLab/gitlab-aws-runners.

At ProSiebenSat.1, we started off using Amazon Web Services (AWS) about three years ago. Being a cloud-native company, all our products and services, as well as our infrastructure and code collaboration tools, were built using the respective offerings from AWS.
Lately, as we put more emphasis on continuous integration, automated (pre-merge) testing and code reviews, we started to hit the boundaries of the more simplistic tools provided by AWS, specifically CodeCommit, CodeBuild and others. Therefore, we decided to replace the AWS CodeStar suite with Gitlab, enabling a smoother and more comfortable solution for code reviews, automated test gates and repository management.

However, with the migration from CodeStar towards Gitlab, we needed to find ways to integrate Gitlab’s CI pipelines with automated testing and deployment to the AWS ecosystem in a smooth and secure way. The rest of this post presents options for how this can be achieved and showcases ready-made templates to facilitate your integration of Gitlab with AWS.

In order to deploy resources from Gitlab CI pipelines to AWS, in our case Cloudformation templates and Docker images, we’ll need an IAM role or user with permission to do pretty much anything, from erasing data from S3 and deleting entire databases to mining cryptocurrencies. Therefore, this post aims to provide an “as secure as possible” solution that protects the deployment permissions from leaking out.

The simple, but insecure solution

Storing administrative credentials is simple, but insecure.

The first thing that comes to everyone’s mind is to simply store ready-to-use credentials of a technical user as Gitlab CI/CD variables. These are then provided to each step of your CI/CD pipeline (.gitlab-ci.yml), and you can perform deployment operations using the AWS CLI or AWS SAM.
While some restrictions are possible, namely the ability to mark variables as protected or masked, there are still severe security implications:

  • Credentials can leak: even masked variables can still be printed in plain text by mistake or with malicious intent.
  • Depending on your project set-up, many developers or maintainers may have access to those credentials.
  • Once leaked, anyone in the world can operate your AWS account with administrative rights.
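
For illustration, such a job could look roughly like the following sketch. The variable names are the ones the AWS CLI reads from the environment; the image, template and stack names are placeholders:

    # .gitlab-ci.yml -- insecure variant: long-lived credentials stored as CI/CD variables
    deploy:
      image:
        name: amazon/aws-cli
        entrypoint: [""]            # override the image's default entrypoint so the script can run
      variables:
        AWS_DEFAULT_REGION: eu-central-1
      script:
        # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are defined as (masked) CI/CD variables
        # and picked up automatically by the AWS CLI -- whoever obtains them owns the account.
        - aws sts get-caller-identity
        - aws cloudformation deploy --template-file template.yaml --stack-name my-stack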

The better approach

The good news: We can do a lot better and avoid storing credentials altogether.
The solution comprises deploying Gitlab Runners within AWS (on EC2 or Fargate) and registering them as tagged and shared Runners.

But first of all, what is a Gitlab Runner?

It’s nothing more than an application running on a host which you provide and which connects and registers itself with Gitlab.com. For each CI job assigned to it, the Runner spawns a new job run to perform the actual work. Different types of executors exist: Docker, SSH, VirtualBox, and more. Via tags, we can control which Runner will be used for a particular CI job. The connection is initiated from the Runner to Gitlab, therefore the Runners can stay behind a firewall with all inbound ports closed.
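
Registering a Runner is a single command on the host it runs on. A sketch, with the registration token, tag and description as placeholders, could look like this:

    # Register a Docker-executor Runner against Gitlab.com (token and tag are placeholders)
    sudo gitlab-runner register \
      --non-interactive \
      --url "https://gitlab.com/" \
      --registration-token "<YOUR_RUNNER_TOKEN>" \
      --executor "docker" \
      --docker-image "alpine:latest" \
      --tag-list "aws-deploy" \
      --description "aws-runner"

A job whose tags include aws-deploy will then be picked up by exactly this Runner.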

Gitlab Runners on AWS

On most cloud providers, the recommendation would be to use Kubernetes-based Gitlab Runners. However, since Kubernetes isn’t a first-class citizen in the AWS world, it is better to use more tightly integrated services. The options include:

1. A single EC2 instance running Docker executors — useful for tiny projects.

2. Controlling EC2 instance + Fargate workers — the scalable and almost serverless solution proposed by Gitlab.

3. Autoscaling EC2 instances — proposed by Gitlab as well, but out of scope for this post.

Let’s talk about the details of options 1 and 2:

1. Running a single EC2 instance

Template and deployment script:
https://gitlab.com/PSDD-GitLab/gitlab-aws-runners

In this setting, we permanently run a single EC2 virtual machine. On this machine, we install Docker and set up a Gitlab Runner with the Docker executor type. This means that for every new CI job, the executor will spawn a new Docker container on the host machine. Additionally, we enable the EC2 host to perform automated updates through its package manager.

The template (ec2_template.yaml) comprises:

  • The EC2 instance, which can be accessed via AWS Systems Manager Agent (SSM).
  • A security group attached to the EC2 virtual machine, defaulting to not opening any ports.
  • An IAM role assigned to the EC2 machine, granting administrative privileges needed for deployment of AWS resources.
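
The last point can be sketched in Cloudformation roughly as follows; resource names and the attached managed policy are illustrative and may differ from the actual ec2_template.yaml:

    # Illustrative excerpt -- an instance profile granting the Runner host deployment rights
    RunnerInstanceRole:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Principal: { Service: ec2.amazonaws.com }
              Action: sts:AssumeRole
        ManagedPolicyArns:
          - arn:aws:iam::aws:policy/AdministratorAccess   # broad rights needed for deployments

    RunnerInstanceProfile:
      Type: AWS::IAM::InstanceProfile
      Properties:
        Roles:
          - !Ref RunnerInstanceRole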

2. EC2 controller + Fargate workers

Image source: Gitlab.com

Template and deployment script:
https://gitlab.com/PSDD-GitLab/gitlab-aws-runners

This set-up uses one controlling EC2 instance, which spins up an AWS Fargate task for each CI job. This is achieved by a custom Fargate executor developed by Gitlab. You can find more information about the infrastructure on Gitlab’s documentation pages: https://docs.gitlab.com/runner/configuration/runner_autoscale_aws_fargate/

At ProSiebenSat.1, we wanted our CI maintainers to deploy the necessary infrastructure quickly and easily. Therefore, we developed a convenient Cloudformation template to automate the manual set-up process proposed above.

The deployment template (fargate_template.yaml) uses the following resources:

  • One EC2 instance, used to spin up and control Fargate tasks.
  • An IAM role and security group attached to the EC2 machine.
  • An Elastic Container Service (ECS) cluster using Fargate as backend, plus task definitions.
  • A role attached to the Fargate/ECS tasks, granting them administrative privileges needed for deployment of AWS resources.
  • A security group for the Fargate/ECS tasks, allowing inbound traffic from the EC2 instance.
  • One Elastic Container Registry (ECR) used to store the container running on ECS.
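
The ECS part can be sketched in Cloudformation along the following lines; names, sizes and the image URI are placeholders rather than the contents of the actual fargate_template.yaml:

    # Illustrative excerpt -- ECS cluster and Fargate task definition for the CI workers
    RunnerCluster:
      Type: AWS::ECS::Cluster

    RunnerTaskDefinition:
      Type: AWS::ECS::TaskDefinition
      Properties:
        RequiresCompatibilities: [FARGATE]
        NetworkMode: awsvpc
        Cpu: "512"
        Memory: "1024"
        ExecutionRoleArn: !GetAtt TaskExecutionRole.Arn   # pulls the image and writes logs
        TaskRoleArn: !GetAtt TaskRole.Arn                  # administrative role used by the CI job
        ContainerDefinitions:
          - Name: ci-worker
            Image: !Sub "${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/gitlab-runner-worker:latest"
            PortMappings:
              - ContainerPort: 22     # the Fargate driver reaches the worker via SSH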

We added a couple of security considerations:

1. The controlling EC2 instance doesn’t use SSH (it doesn’t have a key pair installed); instead, we connect solely via AWS Systems Manager Agent (SSM).

2. The EC2 instance will update itself automatically via cron jobs.

3. The Fargate workers will accept SSH connections only from the controlling EC2 instance. This is ensured by the corresponding security group rule, sketched below.
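
A rough Cloudformation sketch of that rule, with hypothetical resource names, looks like this:

    # Illustrative excerpt -- workers accept SSH solely from the controller's security group
    WorkerSecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupDescription: SSH access for the Fargate workers, limited to the controlling instance
        VpcId: !Ref VpcId
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: 22
            ToPort: 22
            SourceSecurityGroupId: !Ref ControllerSecurityGroup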

Deploying & using

To set up and use either the EC2- or the Fargate-based infrastructure, you’ll need to perform the following steps:

1. At first, you should inspect and familiarize yourself with the Cloudformation template and adapt the parameters section at the top to your needs. You will need a virtual private cloud (VPC) created beforehand, as well as one subnet within the VPC. Another parameter of interest is the RunnerTag, which has to be used within .gitlab-ci.yml to assign a pipeline job to the particular Gitlab Runner instance.

2. Navigate to your project on gitlab.com, look for Settings > CI/CD > Runners to obtain a token for Runner registration.

3. Deploy to your AWS account using the deploy_*.sh script and provide the Runner token. Since the script deploys AWS resources via the AWS CLI, the environment you’re running it from must be authenticated with AWS, and your IAM permissions need to be sufficient to create those resources.

4. Assign a job to the EC2 Gitlab Runner using tags.

5. ONLY REQUIRED FOR OPTION 1 (Single EC2): Configure the job to fetch temporary credentials from the EC2 instance metadata service. This is needed since the EC2 instance will run Docker containers: the virtual machine itself handles authentication of the AWS CLI automagically, but in order to authenticate inside the Docker container as well, we fetch temporary credentials from the instance metadata service. The code snippet can be copied from the repository’s README:

Initial contents of .gitlab-ci.yml
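
A minimal, illustrative version of such a pipeline is sketched below; the tag, the image (which must provide curl, jq and the AWS CLI) and the deployed template are assumptions, so refer to the repository’s README for the exact snippet:

    # .gitlab-ci.yml -- illustrative sketch for option 1 (single EC2 instance)
    deploy:
      tags:
        - aws-deploy                  # must match the RunnerTag parameter of the template
      image:
        name: amazon/aws-cli
        entrypoint: [""]              # placeholder image; curl and jq must be available too
      before_script:
        # Fetch temporary credentials from the EC2 instance metadata service, so the AWS CLI
        # inside the Docker container can authenticate via the host's IAM instance role.
        - ROLE_NAME=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
        - CREDS=$(curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/${ROLE_NAME}")
        - export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r .AccessKeyId)
        - export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r .SecretAccessKey)
        - export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r .Token)
      script:
        - aws sts get-caller-identity                     # sanity check for the assumed role
        - aws cloudformation deploy --template-file template.yaml --stack-name my-stack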

Final thoughts

We hope that this post simplifies the set-up and maintenance of Gitlab CI pipelines in conjunction with the AWS ecosystem, and that we could help you boost your project’s CI/CD to the next level.

If you liked the article, follow us on Medium for more tech talk!

Thanks to Manuel Jockenhöfer, Manuel Heller, and Sebastian Döring.
