Terraform modules in a big project: how we broke it down

Louis Billiet
6 min read · Oct 26, 2021


Before anything, I have to say that I love Terraform. This tool lets you define your application’s infrastructure entirely with code. Working with code means working with great tools like Git and embracing their associated workflows. Until recently, I had always worked on monolithic projects where we had no use for modules: one private network, a bunch of virtual machines all of the same size, and a handful of firewall rules.

In my current assignment, I’m working on a bigger, cloud-based project: a set of micro-services hosted on Kubernetes, consuming managed cloud databases (GCP, Aiven, …). This has been a great occasion for me to delve into Terraform’s greatness and to run into some of its limits.

A bit of context

Before we talk about geek stuff, I have to tell you a few specifics about the application and our team’s organization.

The application is a web application intended for internal use, deployed across multiple environments and used by multiple countries without being multi-tenant, meaning we have to deploy it once per country. That makes a lot of infrastructure to manage in the long run.

Next thing you should know is that we are a single, DevOps-minded application team. That means several things:

  • everyone in the team ought to have the same rights, as there is no point restricting rights based on profiles, and everyone should be able to complete any task on the project.
  • our goal is to be able to deploy our releases continuously, so we don’t use complex multi-tier review and testing processes.
  • we have to make sure infrastructure growth does not impede infrastructure evolution.

Designing the infrastructure code


First, let’s talk about the drivers we set for our Terraform development.

In order to reduce toil related to infrastructure maintenance, we wanted instances to be as identical to one another as possible. That implies that the code should produce consistent results. To minimize the risk of misnaming infrastructure parts, each Terraform workspace uses only two variables: the country that will use it and the environment tier. (We actually ended up adding a third variable because we had multiple development environments in France.) Everything else is computed from those three variables.
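
To give you an idea, the whole workspace interface boils down to something like this sketch (the variable names here are mine, not our actual code):

```hcl
# The only three inputs a workspace ever takes (illustrative names).
variable "country" {
  description = "Country the instance serves, e.g. \"fr\""
  type        = string
}

variable "environment" {
  description = "Environment tier, e.g. \"dev\", \"staging\" or \"prod\""
  type        = string
}

variable "instance_index" {
  description = "Distinguishes multiple environments of the same tier, e.g. several dev environments in France"
  type        = string
  default     = ""
}
```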

Once again, in order to reduce maintenance toil, we wanted to make sure anything created by Terraform could be easily identified and understood without looking at the source code.
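
Deriving names and labels from those three variables alone keeps every resource identifiable at a glance. Here is a hypothetical illustration (the naming convention and resource settings are assumptions on my part):

```hcl
# Everything is computed from the three workspace variables, so names and
# labels stay consistent across instances.
locals {
  suffix = join("-", compact([var.country, var.environment, var.instance_index]))

  common_labels = {
    country     = var.country
    environment = var.environment
    managed-by  = "terraform"
  }
}

resource "google_sql_database_instance" "main" {
  name             = "myapp-${local.suffix}" # e.g. "myapp-fr-dev-2"
  database_version = "POSTGRES_13"
  region           = "europe-west1"

  settings {
    tier        = "db-custom-2-7680"
    user_labels = local.common_labels
  }
}
```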

Now, in order to ease infrastructure growth and evolution, we wanted to build easily reusable infrastructure blocks. And when I say easy, I mean Lego-easy. To that end, we made sure our Terraform modules declare as few input variables as possible. Everything application-specific, from naming conventions to architectural decisions and configuration as code, is entirely embedded in those modules.
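
In practice, a call to one of those modules can be as terse as this (module path and names are made up):

```hcl
# Two inputs; everything else (naming, password, Vault path, grants) is
# baked into the module itself.
module "orders_database" {
  source       = "./modules/cloudsql-database"
  service      = "orders"
  sql_instance = google_sql_database_instance.main.name
}
```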


Breaking down the modules

Now that we have stated our architectural drivers, let’s talk about what brought you here: code.

If you have already worked with Terraform, you have certainly reached the point where you needed to split your monolithic infrastructure into reusable modules. But the question is: “how small is small enough?”

We decided, as a team, that “the smallest possible without having modules come in pairs” was small enough for our reusable modules. That led us to create one module to manage a Cloud SQL instance, another to manage a Cloud SQL database with its dedicated user, and so on.

We call those modules technical modules, since each of them manages one technical aspect of the infrastructure. These modules embed the application’s architectural decisions and its configuration-as-code aspects. For example, our Cloud SQL database module manages:

  • the database (with its name computed according to the naming convention)
  • its dedicated user (and its generated random password, of course)
  • ACLs for the user to be allowed to access the database
  • the Vault secret holding everything the micro-service will need in order to use that database

That way, when a developer uses that module, they don’t have to worry about the details; they just need to know where to find the secret to connect to the database.
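
Here is a rough sketch of what such a module can contain; the resource choices, the Vault path and the naming details are assumptions on my part rather than our exact code:

```hcl
variable "service" {
  description = "Micro-service owning the database"
  type        = string
}

variable "sql_instance" {
  description = "Name of the parent Cloud SQL instance"
  type        = string
}

# Generated password: nobody ever needs to know it
resource "random_password" "db" {
  length  = 32
  special = false
}

resource "google_sql_database" "db" {
  name     = var.service # naming convention baked into the module
  instance = var.sql_instance
}

resource "google_sql_user" "db" {
  name     = var.service
  instance = var.sql_instance
  password = random_password.db.result
}

# Grants so the dedicated user can actually use its database
resource "postgresql_grant" "db" {
  database    = google_sql_database.db.name
  role        = google_sql_user.db.name
  object_type = "database"
  privileges  = ["CONNECT", "CREATE", "TEMPORARY"]
}

# Everything the micro-service needs, published in Vault
resource "vault_generic_secret" "db" {
  path = "secret/${var.service}/database"
  data_json = jsonencode({
    database = google_sql_database.db.name
    username = google_sql_user.db.name
    password = random_password.db.result
  })
}
```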

Now that we have a sea of modules, we need a way to assemble everything in a smart way. We have created a new layer of modules hodling resources dedicated to a micro-service. We call those modules application modules. Those modules live with the source code, in the same repository. That way, developpers are able to make the infrastructure grow as the new features need. One example we came upon was a micro-service that now needed to read messages from a Kafka topic. The developer in charge, even though they didn’t know anything about Terraform, just had to copy a snippet from another micro-service already using Kafka, adapt the ACLs parts and voilà !
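
The copied snippet looked roughly like this (module, service and topic names are invented):

```hcl
# Grant the micro-service read access to an existing Kafka topic; the
# technical module hides the Aiven-specific details.
module "billing_kafka_access" {
  source     = "../technical/kafka-topic-access"
  service    = "billing"
  topic      = "orders-events"
  permission = "read" # the original snippet granted "write"
}
```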


Now that we have a myriad of modules, we have aggregated everything in one big old monolith that manages everything from the bottom up. It is all orchestrated through Terraform Cloud, and we manage every instance’s infrastructure in a snap! However, you will see how that big old monolith becomes an issue in the long run.
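
The root project thus boils down to a remote backend and a list of module calls, roughly like this (organization, sources and versions are illustrative):

```hcl
terraform {
  backend "remote" {
    organization = "example-org"
    workspaces {
      prefix = "myapp-" # one Terraform Cloud workspace per country/environment
    }
  }
}

# Shared technical groundwork (network, Cloud SQL instance, Kafka service, ...)
module "base" {
  source      = "./technical/base"
  country     = var.country
  environment = var.environment
}

# One application module per micro-service, pinned to a released version
module "orders" {
  source      = "git::https://example.com/myapp/orders.git//terraform?ref=v1.8.0"
  country     = var.country
  environment = var.environment
}

module "billing" {
  source      = "git::https://example.com/myapp/billing.git//terraform?ref=v2.1.3"
  country     = var.country
  environment = var.environment
}
```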

Final hierarchy of such a project

Conclusion

We are now a year into managing several instances of our application with this system and, so far, so good. The best parts are being able to quickly onboard a new country and easing infrastructure work for the developers. Even better, as our infrastructure grows, we don’t need new people on the team to maintain everything. Drawbacks exist, though, and I’ll tell you about them.

First, a big Terraform project mechanically comes with long run times. In our case, we are managing more than 400 resources per run, and some runs can take as long as 10 minutes. One way around this is to rework your deployment pipeline so that each application module becomes an autonomous Terraform project, run only when its micro-service is released.
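
In practice, that would mean giving each micro-service repository its own root module and its own state, along these lines (names are illustrative):

```hcl
# infra/main.tf inside the orders micro-service repository: its own state,
# so a release only plans this service's resources.
terraform {
  backend "remote" {
    organization = "example-org"
    workspaces {
      prefix = "orders-" # one workspace per country/environment for this service
    }
  }
}

# Read the shared Cloud SQL instance name from the base project's state
data "terraform_remote_state" "base" {
  backend = "remote"
  config = {
    organization = "example-org"
    workspaces = {
      name = "myapp-base-fr-prod"
    }
  }
}

# Technical modules are still shared, pulled from the base repository
module "database" {
  source       = "git::https://example.com/myapp/infra-base.git//technical/cloudsql-database?ref=v3.2.0"
  service      = "orders"
  sql_instance = data.terraform_remote_state.base.outputs.sql_instance_name
}
```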

Those long run times led us to split our infrastructure and application deployment processes: infrastructure deployment and application deployment are two distinct, uncorrelated tasks. This falls short of our wish to bind infrastructure and application code tightly, but asking our developers to wait 10 minutes for every change wasn’t bearable, especially since the micro-services evolve far faster than the infrastructure. Once more, splitting the infrastructure code to make application modules autonomous means you can deploy only the infrastructure related to the modified micro-service.

Last but not least, module versions can’t be set dynamically in Terraform. That means your modules’ versions have to be hard-coded and committed in your repository, so you have to commit to the base infrastructure repository each time you release a micro-service. This is quite some toil, but thankfully, Dependabot is here to help! And once again, I’ll give you the same advice: split your Terraform runs so that you won’t have to reference your application modules in your base infrastructure project.
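
For reference, this is the kind of hard-coded pin I mean (the registry address is made up):

```hcl
module "orders" {
  # The version must be a literal string: Terraform won't accept a variable
  # here, so every micro-service release means a commit bumping this line.
  source  = "app.terraform.io/example-org/orders/application"
  version = "1.8.0"
}
```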

tl;dr: our application modules should be autonomous Terraform projects, loosely linked to a base Terraform project holding the technical modules common to every micro-service.



Louis Billiet

French SRE@SFEIR, I play, I cook and I tinker. Open-source and cloud enthusiast. You can also read me on the fediverse: louis@blog.louis.mushland.xyz