Pluggable Nginx configuration with S3 and Terraform

Miguel Escribano · Published in adidoescode · Sep 17, 2021

The paradigm of Things as Code (infrastructure, configuration…) has revolutionised the way complex software systems are built and maintained. Being able to explicitly define the relationships between subsystems and share information between them makes development both faster and less error-prone.

Today, we Terraform AWS. Tomorrow, Mars. (Photo by Daniele Colucci on Unsplash)

However, tools like Terraform present some friction when there is not a central source of truth, such as when some resources are managed manually or when a system is divided into subcomponents with individual states.

In this post we’ll showcase the pattern we are applying to configure an Nginx proxy for multiple services while keeping it as a separate project.

Scenario

There is an Nginx cluster that acts as a reverse proxy for a number of backend apps. We do this instead of exposing the apps directly through the Load Balancer due to some internal requirements.

AWS diagram of the scenario

For this we need to ensure that:

  • The DNS record of each app points to Nginx.
  • Nginx is configured to map the DNS name of each app with its internal endpoint.

Extra caveat: we want to run the Nginx cluster on top of an EC2 Autoscaling Group, which rules out using provisioners the way you would with individual instances. We have to make do with the Launch Template’s User Data.

Initial approach

If we manage both Nginx and the apps in the same Terraform project, we can leverage the power of Terraform’s variables to connect the different elements of the system.

Let’s illustrate how it works with this very basic non-functional pseudocode (copypaste at your own risk):

# NGINX RESOURCES ###############################

resource "aws_lb" "nginx" {}

resource "aws_launch_template" "nginx" {
user_data = <<EOT
cat > /etc/nginx/conf.d/backend.conf <<EOF
server {
listen 80;
server_name ${aws_route53_record.backend.name};
location / {
proxy_pass http://${aws_lb.backend.dns_name};
}
}
EOF
EOT
}

# BACKEND RESOURCES #############################

resource "aws_lb" "backend" {}

resource "aws_route53_record" "backend" {
name = "backend.example.net"
type = "A"
records = [aws_lb.nginx.dns_name]
}

Here, we define a Launch Template for the Autoscaling Group of the Nginx cluster and its public-facing Load Balancer. Also, we create the internal Load Balancer of a backend app and its Route53 DNS record.

By passing information around with variables we achieve the original requirements:

  • The Route53 record “backend.example.net” points to the Nginx load balancer by reading aws_lb.nginx.dns_name.
  • The Nginx Launch Template creates the configuration file that maps the backend app’s DNS name (aws_route53_record.backend.name) to its internal endpoint (aws_lb.backend.dns_name).

Problems with the initial approach

This setup works perfectly fine with one or two backends. Maybe we could stretch it to three or four. But in the long run the solution doesn’t scale for two reasons:

  • Leaving aside that what we did in the Launch Template is messy at best, the size limit for the User Data script is 16 KB.
  • It’s not possible to separate the apps into their own project due to the bidirectional dependency with Nginx. It all has to go up at the same time.

What can we do?

Pluggable approach

The situation we are facing is quite similar to the old chicken-and-egg problem in programming, so we may as well borrow the dependency injection pattern and adapt it.

Let’s take a look at a new iteration of our previous pseudo Terraform code:

# NGINX RESOURCES ###############################

resource "aws_lb" "nginx" {}

resource "aws_launch_template" "nginx" {
  # Download every config file from the bucket at instance startup
  user_data = <<EOT
aws s3 cp s3://${aws_s3_bucket.nginx_conf.id}/ /etc/nginx/conf.d/ --recursive
EOT
}

resource "aws_s3_bucket" "nginx_conf" {}

# BACKEND RESOURCES #############################

resource "aws_lb" "backend" {}

resource "aws_route53_record" "backend" {
name = "backend.example.net"
type = "A"
records = [aws_lb.nginx.dns_name]
}

resource "aws_s3_bucket_object" "backend_conf_file" {
bucket = aws_s3_bucket.nginx_conf.id
key = "backend.conf"
content = <<EOT
server {
listen 80;
server_name ${aws_route53_record.backend.name};
location / {
proxy_pass http://${aws_lb.backend.dns_name};
}
}
EOT
}

Now, for the Nginx cluster we declare a new S3 bucket and modify the Launch Template so that the instances download the config files at startup. On the other hand, the resources for the backend app now include the definition of an S3 object that Terraform will upload when we apply the changes.

With these resources we still achieve the original goals about DNS and Nginx configuration, but there is no longer a bidirectional dependency. Nginx doesn’t need to know about the backend apps during the terraform apply; it will update itself when we trigger an Instance Refresh in the Autoscaling Group.
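
For reference, an Instance Refresh can be started by hand with the AWS CLI; this is just a sketch, and the Autoscaling Group name is a placeholder:

aws autoscaling start-instance-refresh --auto-scaling-group-name nginx-cluster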

The backend apps still need to know aws_s3_bucket.nginx_conf.id and aws_lb.nginx.dns_name, but those can be the output of the Nginx project, which can be later retrieved from a different project using terraform_remote_state. Therefore, we are free to move the apps into their own projects with independent state files.
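
As a rough sketch of that wiring, assuming the Nginx project keeps its state in an S3 backend and exposes these values as outputs (the output names and state location below are made up):

# In the Nginx project: expose the values the backend projects need
output "conf_bucket_id" {
  value = aws_s3_bucket.nginx_conf.id
}

output "lb_dns_name" {
  value = aws_lb.nginx.dns_name
}

# In a backend project: read those outputs instead of referencing
# the Nginx resources directly
data "terraform_remote_state" "nginx" {
  backend = "s3"
  config = {
    bucket = "example-terraform-states" # hypothetical bucket holding the Nginx state
    key    = "nginx/terraform.tfstate"
    region = "eu-west-1"
  }
}

# e.g. bucket = data.terraform_remote_state.nginx.outputs.conf_bucket_id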

Finally, we could add a Lambda function that triggers an Instance Refresh when a file in the bucket is modified. That function would also be defined inside the Nginx project to keep it self-contained. This way we won’t even need to directly interact with the Nginx cluster when a configuration change takes place in one of the backend apps.
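
A minimal sketch of that wiring inside the Nginx project, assuming a refresh_nginx Lambda (defined elsewhere) that simply calls the StartInstanceRefresh API:

# Invoke the Lambda whenever a config file is created, updated or deleted
resource "aws_s3_bucket_notification" "nginx_conf" {
  bucket = aws_s3_bucket.nginx_conf.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.refresh_nginx.arn
    events              = ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"]
  }
}

# Allow S3 to invoke the function
resource "aws_lambda_permission" "allow_nginx_conf_bucket" {
  statement_id  = "AllowExecutionFromS3"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.refresh_nginx.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.nginx_conf.arn
}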

An alternative solution for this last step would be implementing local provisioners in the backend projects that trigger the Instance Refresh from your machine when you run terraform apply. However, this is an inferior option because it couples the apps to yet another Nginx attribute (aws_autoscaling_group.nginx.id), and it will not be aware of any changes to the configuration in the S3 bucket made outside Terraform.
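
For completeness, that alternative would look roughly like this in a backend project (asg_name is a made-up output of the Nginx project):

# Re-run the refresh whenever the uploaded config file changes
resource "null_resource" "refresh_nginx" {
  triggers = {
    conf_etag = aws_s3_bucket_object.backend_conf_file.etag
  }

  provisioner "local-exec" {
    command = "aws autoscaling start-instance-refresh --auto-scaling-group-name ${data.terraform_remote_state.nginx.outputs.asg_name}"
  }
}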

Immutable AMI approach

At this point we could go the extra mile and deploy immutable AMI images. All it would take is making the Lambda function trigger a CI pipeline that downloads the files and creates a new AMI, then triggering the Instance Refresh from the pipeline.

Yet, even though immutable deployments are generally a good practice, in this particular case there would be no real difference in the quality of the resulting instances. What’s worse, we’d incur longer deployment times, add the burden of AMI lifecycle management and make our tooling more heterogeneous.

Still, your case may benefit from making the effort.

Closing up

As we have seen, the aws_s3_bucket_object resource can be used to inject configuration in a Terraform native way. By doing this, we can decouple systems that need to be aware of each other and break bidirectional dependencies.

Of course, this is not a universal solution and we still need to address the usual security concerns about putting stuff in S3, but this pattern has allowed us to simplify development.
