Creating Kubernetes Cluster on Azure with Terraform and Python using CDK for Terraform

Guray Yildirim
7 min read · Aug 15, 2020


Intro & Motivation

Defining infrastructure by writing code has been widely adopted in recent years. Terraform is one of the leading infrastructure-as-code tools, known for its intuitive language, low barrier to entry, and large number of providers (integrations to the platforms we are trying to automate).

Terraform files are mostly written in HCL, a language used for configuring many HashiCorp products besides Terraform. Terraform’s HCL options widen day by day and handle many detailed/corner cases. It is also really easy to write and read; even non-programmers can easily define their infrastructure in it. Still, in some cases developers need to be dynamic about the resources they are defining, or derive infrastructure definitions from the responses of custom integrations written in their programming languages. Note that writing custom integrations is possible with Terraform itself, and there are a lot of examples covering that.

Bonus: it is also possible to employ linters/code completion/etc. for Python or TypeScript, and they should work!

How to Write Terraform Files in Python?

As we said before, Terraform takes our definitions in HCL. In addition to HCL, Terraform also accepts JSON-formatted files. This means it is possible to produce JSON files in the proper format, and Terraform will recognize these files without converting them to HCL.
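For instance, a resource that would be written in HCL as a `resource "azurerm_resource_group" "example" { … }` block can equally be expressed in Terraform’s JSON syntax (the names and values here are only illustrative):

```json
{
  "resource": {
    "azurerm_resource_group": {
      "example": {
        "name": "example-rg",
        "location": "East US"
      }
    }
  }
}
```

Terraform treats a `.tf.json` file with this structure exactly like its HCL counterpart.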

Python and TypeScript have also become very common in recent years. So if a feature does not exist in HCL (yet), or you have code that should generate infrastructure definitions automatically, integrate with your projects, build a SaaS that creates computing resources, … there is a chance to define Terraform resources in Python and TypeScript as well.

Instead of learning Terraform’s JSON format, manually scanning module docs, and writing classes/methods to create proper definitions, HashiCorp released CDK for Terraform (details in the link), which does the heavy lifting for you by generating classes for the providers you use, in your preferred language (Python or TypeScript).

CDK for Terraform also generates the JSON file in the format that Terraform understands.

Let’s see how to create an Azure Kubernetes Cluster using Terraform CDK.

Installing CDK for Terraform

In order to use CDK for Terraform, we should install it via npm (most of it is written in TypeScript). It requires Terraform, Node.js, and Yarn installed on your computer. For up-to-date install instructions, HashiCorp Learn is a great resource. Just follow the instructions in that link.

If you use Docker and don’t want to install a package with npm, I prepared a Dockerfile that contains all of them. It is also available as an image on Docker Hub, so you may use it directly to try the contents of this post without building the image. I tagged my image for CDK for Terraform v0.0.14 as guray/tfcdkc:0.0.14. In that case there is no prerequisite other than Docker. Just follow the steps and you should be ready.

Run this if you are using Docker:

docker pull guray/tfcdkc:0.0.14

Starting a New Project with CDK for Terraform

A project in Terraform CDK needs to include a couple of files to be useful. The cdktf command helps us generate these files automatically.

If you have installed it with npm, go to an empty directory (or create one) and run the command:

$ mkdir cdktest && cd $_
$ cdktf init --template python

OR if you are using Docker:

$ mkdir cdktest && cd $_
$ docker run -it --rm -v $PWD:/code guray/tfcdkc:0.0.14
(in container)/code # cdktf init --template python

After running the command, it will ask you a couple of questions:

CDK for Terraform init steps asking whether to use Terraform Cloud for storing state

If you are using Terraform Cloud and want to store Terraform state there, you can answer “yes” and log in. In this post we will store the state locally, so we will say “no” here.

It will ask for a project name and a project description; write answers or just press Enter to proceed with the default values. It should look like this:

CDK for Terraform asking project name and description

Afterwards, it will create an empty project for you:

$ ls
Pipfile       Pipfile.lock  cdktf.json    cdktf.out     help          imports       main.py

What Happened? What Are These Files and Why Are They Here?

The cdktf command-line interface has created a Python virtual environment with pipenv and installed the cdktf module in it using pip.

By default, it bootstraps the project for AWS (Terraform’s AWS provider). This is defined in cdktf.json, and we will change that file soon. The Python modules generated for Terraform’s AWS provider reside in the imports directory.

Pipfile and Pipfile.lock are for pipenv to record & lock installed module versions & requirements.

The help file includes basic documentation showing useful subcommands of cdktf. Taking a look at that file will be useful.

main.py is the vital file for us here. We will write our definitions of Azure resources (AKS in this case) in it. It includes a fundamental template to work on, so we will add our resources/providers there.

cdktf.out is a directory that contains our JSON output file.

Coding Azure Kubernetes Service With CDK for Terraform

Now we have main.py and a working project.

At this point, you can use your IDE and its features to define resources. Many IDEs will automatically recognize the virtual environment created by pipenv and use it. This way they will autocomplete your code using the cdktf module and the other modules generated automatically for defining resources.

Defining Azure as a Provider to CDK

By default, CDK for Terraform creates a config file that includes AWS as a provider. We can use any number of Terraform providers (there are a lot!) there. Let’s add Azure to the list. The cdktf.json file will look like this (you can also remove the AWS provider if you will not use it):

{
  "language": "python",
  "app": "pipenv run ./main.py",
  "terraformProviders": [
    "aws@~> 2.0",
    "azurerm@~> 2.22.0"
  ],
  "codeMakerOutput": "imports"
}

After adding the provider, we should ask cdktf to generate Python modules for these providers:

$ cdktf get   
⠴ downloading and generating providers...
... after a while
Generated python constructs in the output directory: imports

It may take some time, but you will not need to run it again unless you change the config file (which will probably be rare). Now the imports directory is updated. Just make sure your code completion has picked up the new modules in this directory so you can write code easily.

Note that if you want to use Terraform modules, that is also possible. Just define them in this file and run the ‘cdktf get’ command again.
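In the cdktf version used here, modules are listed in a terraformModules array alongside the providers. A sketch of what that could look like (the module source below is only a placeholder, not one used in this post):

```json
{
  "language": "python",
  "app": "pipenv run ./main.py",
  "terraformProviders": [
    "azurerm@~> 2.22.0"
  ],
  "terraformModules": [
    "terraform-aws-modules/vpc/aws@~> 2.0"
  ],
  "codeMakerOutput": "imports"
}
```

After ‘cdktf get’, classes for the module appear under imports just like provider classes do.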

Importing Terraform Resources in CDK

The complete source code is at the end.

Now we have the modules we need. Start with importing them:

from imports.azurerm import \
    AzurermProvider, \
    KubernetesCluster, \
    KubernetesClusterDefaultNodePool, \
    KubernetesClusterIdentity, ResourceGroupConfig, AzurermProviderFeatures

If you need to decide which Python classes to use, they are named after the Terraform resource names. Also, typing a couple of characters from a resource name will trigger autocompletion to suggest related classes.
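As a rough mental model (a simplified illustration, not the actual code generator, which handles more edge cases): drop the provider prefix from the Terraform resource name and CamelCase the rest.

```python
def resource_to_class(resource_name: str, provider: str = "azurerm") -> str:
    """Illustrative mapping from a Terraform resource name to a generated class name."""
    parts = resource_name.split("_")
    # Drop the provider prefix if present, then CamelCase the remaining words.
    if parts and parts[0] == provider:
        parts = parts[1:]
    return "".join(part.capitalize() for part in parts)

print(resource_to_class("azurerm_kubernetes_cluster"))  # KubernetesCluster
```

So the azurerm_kubernetes_cluster resource corresponds to the KubernetesCluster class in imports.azurerm.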

We have imported required modules and are ready to define our resources.

Defining Terraform Resources in Python

There is a comment line saying “# define resources here” in our code. We will add the following lines after it, in the same method (which is __init__).

Firstly, we should define our provider (azurerm in our case):

        features = AzurermProviderFeatures()
        provider = AzurermProvider(self, 'azure', features=[features])

features is empty in our case (it is required by the provider even when empty; see the docs). For details like KeyVault, the Terraform docs will help clarify what to write.

Afterwards, create a default node pool for our cluster:

        node_pool = KubernetesClusterDefaultNodePool(
            name='default', node_count=1, vm_size='Standard_D2_v2')

In this example, the node count is set to 1 and the VM size is Standard_D2_v2. You may need to change them according to your needs.

We will use a resource group which already exists in our Azure subscription. Either select an existing one or create a new one and provide its name:

        resource_group = ResourceGroupConfig(name='OUR_RESOURCE_GROUP', location='East US')

Replace OUR_RESOURCE_GROUP and the location. This tells Terraform to create the AKS cluster in that resource group & region.

Kubernetes cluster identity is needed to create the cluster, so let’s define it:

        identity = KubernetesClusterIdentity(type='SystemAssigned')

There is currently nothing to change (the only supported type is ‘SystemAssigned’ right now), so we will define it and continue.

Finally, define our cluster:

        cluster = KubernetesCluster(
            self, 'our-kube-cluster',
            name='our-kube-cluster',
            default_node_pool=[node_pool],
            dns_prefix='test',
            location=resource_group.location,
            resource_group_name=resource_group.name,
            identity=[identity],
            tags={"foo": "bar"}
        )

Now we can arrange the parameters like cluster name, DNS prefix, and tags to fit our requirements.

At this point we are ready to synthesize the JSON file for Terraform. Before continuing, here is the complete source (main.py):
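Assembled from the snippets above, main.py looks roughly like this (the stack class name and surrounding boilerplate come from the cdktf init template, so yours may differ slightly):

```python
#!/usr/bin/env python
from constructs import Construct
from cdktf import App, TerraformStack

from imports.azurerm import \
    AzurermProvider, \
    KubernetesCluster, \
    KubernetesClusterDefaultNodePool, \
    KubernetesClusterIdentity, ResourceGroupConfig, AzurermProviderFeatures


class MyStack(TerraformStack):
    def __init__(self, scope: Construct, ns: str):
        super().__init__(scope, ns)

        # define resources here
        features = AzurermProviderFeatures()
        provider = AzurermProvider(self, 'azure', features=[features])

        node_pool = KubernetesClusterDefaultNodePool(
            name='default', node_count=1, vm_size='Standard_D2_v2')

        resource_group = ResourceGroupConfig(
            name='OUR_RESOURCE_GROUP', location='East US')

        identity = KubernetesClusterIdentity(type='SystemAssigned')

        cluster = KubernetesCluster(
            self, 'our-kube-cluster',
            name='our-kube-cluster',
            default_node_pool=[node_pool],
            dns_prefix='test',
            location=resource_group.location,
            resource_group_name=resource_group.name,
            identity=[identity],
            tags={"foo": "bar"}
        )


app = App()
MyStack(app, "cdktest")
app.synth()
```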

Deploy the Cluster

Firstly we will generate the JSON file:

$ cdktf synth
⠼ synthesizing ...

After the command completes, there will be a file named “cdk.tf.json” in cdktf.out.
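The generated file is plain Terraform JSON. Abridged and illustrative (the exact shape depends on the cdktf version), it looks roughly like:

```json
{
  "provider": {
    "azurerm": [
      {
        "features": [{}]
      }
    ]
  },
  "resource": {
    "azurerm_kubernetes_cluster": {
      "our-kube-cluster": {
        "name": "our-kube-cluster",
        "dns_prefix": "test",
        "location": "East US",
        "resource_group_name": "OUR_RESOURCE_GROUP",
        "default_node_pool": [
          {
            "name": "default",
            "node_count": 1,
            "vm_size": "Standard_D2_v2"
          }
        ],
        "identity": [
          {
            "type": "SystemAssigned"
          }
        ],
        "tags": {
          "foo": "bar"
        }
      }
    }
  }
}
```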

We can hand this file to Terraform and use Terraform subcommands like plan/apply/destroy/… :

$ cd cdktf.out
$ terraform init
...
$ terraform apply
...

OR, we can use cdktf’s subcommands to apply the same procedure:

$ cdktf deploy

This command will automatically run cdktf synth and terraform apply.

Note that you need to have az (the Azure CLI) installed and be logged in to apply this JSON file; otherwise it will throw an error.

