Terraform, mono-repo and compliance as code

Emre Erkunt
8 min read · Feb 26, 2020

--

You must be one of the people who use Terraform for their infrastructure-as-code needs while wondering how to make things faster and more secure. Well, this is a common concern these days. We all produce configuration and code in different tools and languages, and then spend a substantial amount of time making it more readable, extendable and scalable.

Well, maybe we all are the problem.

The code we produce should create value or solve a problem, and it should also be reusable for the sake of deduplication. Usually, this kind of discussion concludes with “Let’s use modules.”. We all use Terraform modules, right? I could write pages of suffering stories about over-modularising everything, which is a whole different problem in the world, but I won’t.

No, I won’t. Don’t insist, no… Ok, maybe later.

It is a well-known practice, when you use modules, to tag the module repository and pin that tag in the root module that sources it, so the root module keeps working the same way even when the module code changes. This should be set as a team principle: suitable modules must be tagged and used accordingly.

… but what about dependency hell? What if I have 120 modules living in 120 different repositories, and a module change touches 20 different modules? Does that mean we need to create 20 + 1 pull requests? If the minimum number of reviewers is set to 2, that means 21 x 2 = 42 peer reviews. Seriously! We just crippled the team with “one module change”, and everybody starts sending Lord of the Rings memes or gifs while the rest of the day is marked as dead already.

One PR to notify them, one PR to bring them all and in the darkness bind them

Is this the way of working? Should we decrease the number of reviewers? … or maybe make an exception for modules where PRs are not required if the change has a wide impact. Really? Are you going to run blind in the deep, dark woods? … or bring them all and in the darkness bind them?

Don’t, just don’t change your way of working on validations. If you think working with PRs is right, then stick with it. If you have smart pipelines or systems around your repositories where everybody can directly push to master, then stick with that.

The problem is not “how you work”; in this case, it is “how you structured your git repositories”.

This resembles how I felt when I first applied the suggestion below

Ok, back to basics. What are the general requirements for a repository in which a Terraform module lives?

  1. It should be tagged, so consumers can pin against breaking changes
  2. Any change should be testable
  3. Changes should go through peer review

Then my suggestion is;

Do NOT use micro-repos for your terraform modules. Use one mono-repo.

I have the power!
  • You can tag the whole repository when there is a change/requirement
  • Any change, PR or push can be tested
  • Any change can go through peer review

Good, but how do we structure that repo? Having had lots of failures around this topic over the last ~4 years, I have found a directory per module very useful.

a directory structure of a sample mono-repo. See the tags_override change? :)
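The screenshot above is a directory listing; a hypothetical layout along those lines (module names are illustrative, matching the aws- prefix convention used later in this post) could look like:

```
terraform-modules/          # the mono-repo
├── aws-s3/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── aws-kms/
│   ├── main.tf
│   └── variables.tf
└── aws-vpc/
    └── main.tf
```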

So a module change that requires 20 different modules to change is just 1 PR! Even if you add 5 reviewers to that PR, it is super fast compared with your micro-repos. If you are using GitHub, even better! You can combine this with CODEOWNERS, where some modules have their own maintainers/owners and any change to those modules MUST be approved by that owner.
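A minimal sketch of such a CODEOWNERS file, assuming hypothetical team names and the aws- module directories used elsewhere in this post:

```
# Any change under a module directory requires the owning team's approval
aws-s3/    @my-org/storage-team
aws-kms/   @my-org/security-team
# Fallback owners for everything else
*          @my-org/platform-team
```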

Great, but how do I source a module that lives in a directory of a mono-repo? Easy, like this:

module "from_mono_repo" {
  source = "git::ssh://.../<org>/<repo>.git//<my_module_dir>"
  ...
}
module "from_mono_repo_with_tags" {
  source = "git::ssh://.../<org>/<repo>.git//<mod_dir>?ref=1.2.4"
  ...
}
module "from_micro_repo" {
  source = "git::ssh://.../<org>/<mod_repo>.git"
  ...
}
module "from_micro_repo_with_tags" {
  source = "git::ssh://.../<org>/<mod_repo>.git?ref=1.2.4"
  ...
}

What are the pushbacks against this kind of structure? Well, if you try to test “every module” on a PR/change, you may end up with 1.5-hour CI pipelines. You need to find the changed modules within your pipeline when a PR is opened. I just do it like this:

changed_modules=$(git diff --name-only $(git rev-parse origin/master) HEAD | cut -d "/" -f1 | grep ^aws- | uniq)
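The core of that one-liner is the cut | grep | uniq pipeline that reduces changed file paths to top-level module directories. A self-contained sketch, with printf standing in for the git diff output (the file names are made up):

```shell
# Simulated `git diff --name-only` output; in CI this comes from git itself.
simulated_diff='aws-s3/main.tf
aws-s3/variables.tf
aws-kms/main.tf
README.md'

# Keep the first path segment, keep only module dirs (aws- prefix), dedupe.
changed_modules=$(printf '%s\n' "$simulated_diff" | cut -d "/" -f1 | grep '^aws-' | uniq)
printf '%s\n' "$changed_modules"
```

Each entry in $changed_modules can then be fed into terraform validate or your module test suite, so only the touched modules run in the pipeline.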

There is also another disadvantage: whenever you run “terraform init”, it will download the whole repo into the .terraform directory. Well, I never had a problem with this, since I run my pipelines in volatile containers using AWS CodeBuild. If you are using Jenkins with persistent Jenkins slaves, then you have a whole different problem.

Don’t make us cry ;(

You simply have the same advantages as with micro-repos, and as a bonus, you reduce the maintenance cost of your modules by using a mono-repo.

Frankly, after working like this for quite a while, insisting on micro-repos for Terraform modules should almost be considered a crime.

Excellent, but what about unit testing? Do you really need it? … and what exactly do you understand by “unit” testing? Are you really going to test whether an AWS resource is created properly? Is that Terraform’s responsibility, or that of the API that handles the resource creation? Maybe we should focus more on negative testing and non-idempotent code.

For checking whether your code is idempotent, terraform provides a great parameter called -detailed-exitcode. Just run:

> terraform plan -detailed-exitcode

after you terraform apply, and voilà. At least you are more confident that your code is idempotent and that it does not create a new resource due to a random string or something similar within the code.
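In a pipeline, the exit code does the talking: 0 means no changes, 2 means the plan still wants to change something, and 1 means the plan itself failed. A minimal sketch of a wrapper around this (the helper name is made up):

```shell
# Hypothetical helper around `terraform plan -detailed-exitcode`.
# Exit codes: 0 = no changes (idempotent), 2 = changes pending, 1 = error.
check_idempotency() {
  terraform plan -detailed-exitcode >/dev/null
  case $? in
    0) echo "idempotent" ;;
    2) echo "drift detected"; return 1 ;;
    *) echo "plan failed"; return 2 ;;
  esac
}
```

Call it right after terraform apply; a non-zero return fails the pipeline stage.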

What about negative testing? What on earth is negative testing? It is not much different from a unit test really, but you generally focus on negative situations. E.g.

Do not let anyone create an unencrypted and public S3 bucket.

So instead of checking whether your S3 bucket is really created, you are actually checking whether your code creates resources that comply with your policy set. How do we achieve that? Terraform Enterprise provides an excellent tool for it: Sentinel.

… but we also have some open-source alternatives. Nowadays, there are many tools that do static code analysis against your HCL. These tools will probably prevent you from doing anything unwanted according to general best practices, but what if one of the checks is not implemented in the tool… or worse, what if your situation is a bit different? For example, what if you would like to allow public S3 buckets under certain conditions, but these tools treat that as a security fault?

Then, enter terraform-compliance.

terraform-compliance logo

This tool will not only let you write tests in which you can define WHATEVER you want as your company policy, but it also helps you segregate Security and Developers while shifting security to the left. Sounds quite contradictory, right? No. How, then?

Well, first of all, terraform-compliance uses Behaviour Driven Development (BDD).

just checking if encryption enabled
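The screenshot above is a BDD feature file. A minimal sketch of such a test, in terraform-compliance's Given/Then step style (the exact step wording is illustrative and may differ between versions):

```gherkin
Feature: S3 buckets must be encrypted

  Scenario: Ensure all S3 buckets have server-side encryption
    Given I have aws_s3_bucket defined
    Then it must contain server_side_encryption_configuration
```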

If you are not satisfied enough, you can go more in-depth with this:

going a bit deeper and ensuring KMS is used for encryption
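A sketch of that deeper check might look like this (again, the step wording is illustrative of terraform-compliance's BDD syntax, not copied from the screenshot):

```gherkin
Feature: S3 bucket encryption must use KMS

  Scenario: Ensure server-side encryption uses the aws:kms algorithm
    Given I have aws_s3_bucket defined
    When it contains server_side_encryption_configuration
    Then it must contain sse_algorithm
    And its value must be aws:kms
```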

The Terraform code that satisfies this test would be:

resource "aws_kms_key" "mykey" {
  description             = "This key is used to encrypt bucket objects"
  deletion_window_in_days = 10
}

resource "aws_s3_bucket" "mybucket" {
  bucket = "mybucket"

  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        kms_master_key_id = "${aws_kms_key.mykey.arn}"
        sse_algorithm     = "aws:kms"
      }
    }
  }
}

So tests and failures are understandable by literally ANYONE within your organisation. This is where you can delegate writing these tests to your Security team, or to developers with enough of a security mindset. The tool also allows you to keep these BDD feature files in a different repository. This helps a lot in segregating responsibilities: the change in the code and the change in the security policies that run against your code are two different entities. They can be owned by different teams with different life cycles. Amazing, right? Well, at least it is for me :)

For more information about terraform-compliance, you can also have a peek at this presentation.

“${this.presentation}”

We solved tons of problems just by using terraform-compliance, especially where Security teams are quite separated from the development teams and may not understand what developers do. You can already guess what happens in these kinds of organisations: usually, the Security team starts to block anything suspicious enough to them and builds the whole security mindset on perimeter security. Oh my…

In many situations, just using Terraform and terraform-compliance helped a lot in getting the teams that build (and/or run) infrastructure and the Security team to the same table. When your Security team starts to develop something with immediate feedback from all the development pipelines, they are usually motivated to do more and more. Well, usually…

So this is how we usually structure git repositories in a typical organisation that uses Terraform:
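The original diagram is not reproduced here; based on the rest of this post (a module mono-repo, root modules, and separately-owned compliance policies), a hypothetical three-repository layout would be (all names are made up):

```
<org>/
├── terraform-modules      # the mono-repo: one directory per module, tagged releases
├── terraform-stacks       # root modules that source the modules by tag
└── compliance-policies    # BDD feature files, owned by the Security team
```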

Of course, this is quite opinionated. I was lucky (or unlucky?) enough to test a more granular structure in several organisations. Unfortunately, it did not end well. The happy ending must be hidden within the number 3.

Well, I was about to lose myself in “let’s not use buzzwords anymore”, but I think that would be better in another post :) Let me know if you have any success stories where no one suffered from micro-repos; I’m really curious about them!
