A case study of contributing to a Terraform Provider

Jakub Warczarek
Jul 10, 2019 · 13 min read

Have you ever worked with Terraform and found that some resources were missing? We had this exact situation while working with Alibaba Cloud for one of our customers. Here is the story of contributing to the official Terraform Provider for Alicloud.

Advantages of Alicloud

Alibaba Cloud (or Alicloud) is gaining traction, especially when you need to bring a cloud native workflow to the Chinese region, and more and more enterprises are interested in it. Beyond the Chinese market, Alicloud has data centers all over the world. Its Chinese regions are first-class citizens: every new feature lands in China first, whereas the Chinese regions of AWS and Azure seem a bit forgotten. What is more, sending data between Chinese regions and the rest of the world is smooth. Because its naming conventions resemble those of AWS, Alicloud has a gentle learning curve for anyone who already knows AWS. Elastic Compute Service sounds pretty familiar, right?

IaC in Alicloud

IaC (Infrastructure as Code) makes a solution repeatable, scalable and maintainable, and eliminates the toil of manual work. For Alicloud, the officially recommended way of doing IaC is Terraform, a popular open source tool. Terraform is multiplatform: we can use the same syntax and workflow with Alicloud as with every other platform Terraform supports, e.g. AWS, Azure or GCP. Only the particular Resources differ. This is achieved by Terraform Providers, which are specific to each platform and independent of one another (official list here). They take full responsibility for delivering features, so Terraform covers a platform's features only as well as that platform's Provider does. Unfortunately, you will sometimes stumble on missing features. Working with Alicloud and Terraform is generally a smooth experience, but…

What to do if the resource is missing?

The second missing resource was an enforced password policy for all users in an account. I wanted a resource such as alicloud_ram_account_password_policy for configuring it. It was the same story: perfectly feasible in the GUI, but not supported in Terraform. So I opened a second issue, #1072. And once you have opened an issue, you can at least try to resolve it...

Let’s do the job

This article assumes that the reader knows the basics of Terraform and Go (the programming language in which Terraform is written).

I refer only to commits which I have made. Of course, later someone can improve my solution to provide better implementation, update SDK, refactor, etc. This kind of collaboration is the main benefit and true beauty of open source software.

How does Terraform work?

Terraform has a plugin-based architecture. There are two types of plugins: Provisioner Plugins (responsible for executing scripts after the creation or on the destruction of a resource, not covered by this article) and Provider Plugins (covered in detail below). Developers can easily extend Terraform by writing new plugins or compiling modified versions of existing ones. Providers that are part of the Terraform Provider Development Program are officially approved and tested by HashiCorp.

Terraform architecture:

Terraform Core:

  • main binary which provides CLI command terraform
  • reading and interpolating configuration files and modules
  • construction of the Resource Graph and plan execution
  • using remote procedure calls (RPC) for communication with Terraform Plugins

Terraform Provider Plugins:

  • invoked by Terraform Core over RPC
  • all Providers used in Terraform configurations are plugins
  • authenticate with the Infrastructure Provider and define Resources that map to specific Services, wrapping a Go SDK for a specific cloud provider

Plugins are discovered by Terraform in strictly defined order:

  • . - very useful during plugin development
  • the directory where the terraform binary is located, e.g. /usr/local/bin - for airgapped installations
  • terraform.d/plugins/<OS>_<ARCH> - sometimes used in Terraform Enterprise
  • .terraform/plugins/<OS>_<ARCH> - probably the most popular way — automatically downloaded Providers
  • ~/.terraform.d/plugins or %APPDATA%\terraform.d\plugins — user plugins directory when you want to cache a specific version of Provider locally
  • ~/.terraform.d/plugins/<OS>_<ARCH> or
    %APPDATA%\terraform.d\plugins\<OS>_<ARCH> — user plugins directory, with explicit OS and architecture.
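
The lookup order above can be sketched in Go as a first-match search over a list of directories. This is a simplified illustration, not Terraform's actual discovery code: findPlugin, the exists callback and the example directory list are hypothetical names chosen for this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// findPlugin walks the candidate directories in order and returns the path
// of the first one that contains the plugin binary. The exists callback is
// injected so the lookup can be exercised without touching the file system.
func findPlugin(dirs []string, binary string, exists func(string) bool) (string, bool) {
	for _, dir := range dirs {
		candidate := filepath.Join(dir, binary)
		if exists(candidate) {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical search path that mirrors the order of the list above.
	dirs := []string{
		".",
		"/usr/local/bin",
		"terraform.d/plugins/linux_amd64",
		".terraform/plugins/linux_amd64",
	}
	// Pretend the binary exists in two places; the earlier directory wins.
	present := map[string]bool{
		filepath.Join(".", "terraform-provider-alicloud"):                              true,
		filepath.Join(".terraform/plugins/linux_amd64", "terraform-provider-alicloud"): true,
	}
	path, ok := findPlugin(dirs, "terraform-provider-alicloud", func(p string) bool {
		return present[p]
	})
	fmt.Println(path, ok)
}
```

Because the current directory comes first, a freshly compiled binary placed there shadows any automatically downloaded copy, which is what makes it convenient during development.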

Repository structure (Alicloud Provider as an example):

Every Provider follows the same structure of the repository, coding standards and conventions. Only some specific details are different. After you have learned how to contribute to one Provider you will find it easy to contribute to others.

The alicloud directory contains files that follow this naming convention:

  • TYPE_NAME.go contains implementation
  • TYPE_NAME_test.go contains acceptance tests

Where TYPE can be:

  • data_source
  • import
  • resource
  • service

and NAME e.g.:

  • alicloud_account
  • alicloud_ram_roles
  • etc.
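
As a small illustration of the convention, the helper below builds both file names from a TYPE and a NAME. The function sourceFiles is hypothetical and exists only for this example.

```go
package main

import "fmt"

// sourceFiles returns the implementation and acceptance-test file names
// for a given TYPE and NAME, following the TYPE_NAME.go convention.
func sourceFiles(kind, name string) (impl, test string) {
	base := kind + "_" + name
	return base + ".go", base + "_test.go"
}

func main() {
	impl, test := sourceFiles("data_source", "alicloud_ram_roles")
	fmt.Println(impl)
	fmt.Println(test)
}
```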

Anatomy of a resource:

To add a new resource, we need to implement a function that returns *schema.Resource. It contains a definition of all fields of a resource (e.g. name) and implementation of four supported actions:

  • Create
  • Read
  • Update
  • Delete

A new resource also has to be registered in the function Provider(), which returns terraform.ResourceProvider. This is the entry point for the whole Terraform provider. More about this is described in the later example.

To maintain information about managed infrastructure, configuration and metadata, Terraform introduces the concept of State. By default it is stored in a local file named terraform.tfstate, but it can also be stored remotely in a Backend. Everything related to resources configured with Terraform is saved in it.
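
The shape of such a resource can be sketched with a few local stand-in types. Schema, ResourceData and Resource below are simplified assumptions, loosely modeled on the SDK's schema package so that the example runs on its own; resourceExample and its "name" field are hypothetical.

```go
package main

import "fmt"

// Schema, ResourceData and Resource are simplified local stand-ins, loosely
// modeled on the provider SDK's schema package; they are not the real types.
type Schema struct {
	Type     string
	Optional bool
	Default  interface{}
}

type ResourceData struct {
	id     string
	values map[string]interface{}
}

func (d *ResourceData) SetId(id string)            { d.id = id }
func (d *ResourceData) Id() string                 { return d.id }
func (d *ResourceData) Get(key string) interface{} { return d.values[key] }

// CRUDFunc is the shape shared by the four lifecycle actions.
type CRUDFunc func(d *ResourceData, meta interface{}) error

type Resource struct {
	Create, Read, Update, Delete CRUDFunc
	Schema                       map[string]*Schema
}

// resourceExample is a hypothetical resource with a single "name" field.
func resourceExample() *Resource {
	return &Resource{
		Create: func(d *ResourceData, meta interface{}) error {
			d.SetId(fmt.Sprintf("example-%v", d.Get("name")))
			return nil
		},
		Read:   func(d *ResourceData, meta interface{}) error { return nil },
		Update: func(d *ResourceData, meta interface{}) error { return nil },
		Delete: func(d *ResourceData, meta interface{}) error {
			d.SetId("") // in this sketch an empty ID marks the resource as gone
			return nil
		},
		Schema: map[string]*Schema{
			"name": {Type: "string", Optional: true, Default: "demo"},
		},
	}
}

func main() {
	r := resourceExample()
	d := &ResourceData{values: map[string]interface{}{"name": "demo"}}
	if err := r.Create(d, nil); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Println("created with ID", d.Id())
}
```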

Check if it is doable

RAM Account Password Policy — issue #1072
Setting a password policy belongs to the Resource Access Management (RAM) service in Alicloud. It is similar to AWS Identity and Access Management (IAM) and implements basically the same logic. In the OpenAPI Explorer I found the two calls required to implement this resource, SetPasswordPolicy and GetPasswordPolicy, together with ready-to-use boilerplate code in Go.

Assume role — issue #1068
Providing the ability to assume a role is more challenging, because it does not add a completely new resource but extends the authentication method of the whole provider. First I examined the available ways of authentication in the SDK docs. The Go SDK supports the RamRoleArn authentication method. As you can see, the documentation is laconic, so I decided to read the source, which revealed an even better method: NewRamRoleArnWithPolicyCredential. Besides authentication, it also offers the ability to constrain privileges with an additional policy.

Provide implementation

Now inside test directory, we can invoke these commands for real-world testing.

Every code change and recompilation of a Provider requires invoking terraform init again, because Terraform checks the version of the binary by calculating its hash and storing it in the file test/.terraform/plugins/<OS>_<ARCH>/lock.json.

Setting the variable TF_LOG=DEBUG is also useful for debugging, as it prints detailed information about the executed command. Several levels are available: TRACE, DEBUG, INFO, WARN and ERROR (if you want to read more, click here). Use it as well when something is not working as expected, to provide more information about the bug you encountered.

This kind of ad hoc print debugging can be convenient, because the plugin architecture of Terraform makes it hard to attach a standard Go debugger to the running Provider. Detailed output can also be used while running acceptance tests.
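
A print-debugging line in a Provider is typically a plain log call tagged with a level, which is then shown or hidden depending on TF_LOG. The sketch below only models that idea; shouldPrint and rank are hypothetical helpers rather than Terraform's own filtering code, and the logged value is a placeholder.

```go
package main

import (
	"log"
	"os"
)

// levels lists the TF_LOG values from most to least verbose.
var levels = []string{"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}

// rank returns the position of a level in the list, or -1 if it is unknown.
func rank(level string) int {
	for i, l := range levels {
		if l == level {
			return i
		}
	}
	return -1
}

// shouldPrint reports whether a line tagged lineLevel is shown when the
// log level is set to configured. An unset or unknown level prints nothing.
func shouldPrint(configured, lineLevel string) bool {
	c, l := rank(configured), rank(lineLevel)
	return c >= 0 && l >= c
}

func main() {
	configured := os.Getenv("TF_LOG")
	if shouldPrint(configured, "DEBUG") {
		// Placeholder payload; a real Provider would log the actual request.
		log.Printf("[DEBUG] password policy request: %+v",
			map[string]int{"MinimumPasswordLength": 12})
	}
}
```

With TF_LOG=TRACE every tagged line passes the filter, while TF_LOG=ERROR keeps only the error lines.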

RAM Account Password Policy — issue #1072
To create the new alicloud_ram_account_password_policy resource, I added a new file terraform-provider-alicloud/alicloud/resource_alicloud_ram_account_password_policy.go to the repository and defined the resource there. Everything related to resources is placed in a common alicloud package.

Every parameter available in the API call for setting the account password policy is exposed as a parameter in the resource schema, e.g. minimum_password_length, require_lowercase_characters etc. Type checking and other validation are provided by Terraform out of the box. Every parameter has a default value (the same as specified in the API docs), set through the field Default.

RAM Account Password Policy is not a typical resource. You cannot create or delete it in an Alicloud account; it only supports modification. To keep the behavior consistent, whatever policy was configured before Terraform took over is overwritten. Naturally, I mentioned this in the docs. When the resource is destroyed, the settings simply return to their default values.

This property of RAM Account Password Policy means that the Create operation is the same as Update, so both can be served by one function, resourceAlicloudRamAccountPasswordPolicyUpdate(). For the Read and Delete operations I needed resourceAlicloudRamAccountPasswordPolicyRead() and resourceAlicloudRamAccountPasswordPolicyDelete() respectively.

The function resourceAlicloudRamAccountPasswordPolicyUpdate() takes the parameters from the schema, passes them to the SDK call, and saves the Terraform state under the fixed ID "ram-account-password-policy" (fixed rather than random, because there is only one password policy per Alicloud account). The structure *schema.ResourceData provides the methods for interacting with Terraform State.

The Read operation, resourceAlicloudRamAccountPasswordPolicyRead(), calls the API to fetch all information about the account password policy and saves it in the Terraform state. This workflow is typical for many Terraform resources. The delete function is the exception here. Instead of removing the resource (which in the typical case is supported directly by the SDK), it sets the default values, because, as mentioned above, the account password policy supports only modification, not deletion.
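
The three operations can be sketched against an in-memory fake of the RAM API. Everything here is a stand-in written for illustration: fakeRAM, state, update, read and del are hypothetical, the fake's method names merely echo the SetPasswordPolicy and GetPasswordPolicy calls mentioned earlier, and the default values are placeholders, not the ones from the API docs.

```go
package main

import "fmt"

// PasswordPolicy holds two of the policy settings. The default values used
// below are illustrative placeholders, not taken from the API docs.
type PasswordPolicy struct {
	MinimumPasswordLength      int
	RequireLowercaseCharacters bool
}

var defaultPolicy = PasswordPolicy{MinimumPasswordLength: 12, RequireLowercaseCharacters: true}

// fakeRAM stands in for the RAM API: an account always has exactly one policy.
type fakeRAM struct{ policy PasswordPolicy }

func (c *fakeRAM) SetPasswordPolicy(p PasswordPolicy) { c.policy = p }
func (c *fakeRAM) GetPasswordPolicy() PasswordPolicy  { return c.policy }

// state is a stand-in for what Terraform records about the resource.
type state struct {
	ID     string
	Policy PasswordPolicy
}

const policyID = "ram-account-password-policy"

// update doubles as create: it overwrites the account policy, records the
// fixed ID and then refreshes the state from the API.
func update(s *state, api *fakeRAM, desired PasswordPolicy) {
	api.SetPasswordPolicy(desired)
	s.ID = policyID
	read(s, api)
}

// read copies what the API reports into the state.
func read(s *state, api *fakeRAM) { s.Policy = api.GetPasswordPolicy() }

// del cannot remove the policy, so it resets it to the defaults and clears the ID.
func del(s *state, api *fakeRAM) {
	api.SetPasswordPolicy(defaultPolicy)
	s.ID = ""
}

func main() {
	api := &fakeRAM{policy: defaultPolicy}
	s := &state{}
	update(s, api, PasswordPolicy{MinimumPasswordLength: 16, RequireLowercaseCharacters: false})
	fmt.Printf("after apply:   id=%q policy=%+v\n", s.ID, s.Policy)
	del(s, api)
	fmt.Printf("after destroy: id=%q api=%+v\n", s.ID, api.GetPasswordPolicy())
}
```

Refreshing the state at the end of update is a common pattern in Terraform resources; whether the merged implementation does exactly that is not shown here.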

In the file terraform-provider-alicloud/alicloud/provider.go I had to register this new resource to make it available in Terraform.

As I mentioned earlier, the function Provider() is the entry point of every Provider. Its ResourcesMap field takes a map[string]*schema.Resource with the names of all resources as keys and the results of the corresponding resource functions as values. Every resource has to be listed there to be usable.
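
Registration boils down to adding one entry to a map keyed by resource name. The sketch below uses local stand-in types instead of the real SDK so that it runs on its own; newProvider, lookup and the Description field are hypothetical.

```go
package main

import "fmt"

// Resource and Provider are simplified local stand-ins for the SDK types.
type Resource struct{ Description string }

type Provider struct {
	ResourcesMap map[string]*Resource
}

func resourceAlicloudRamAccountPasswordPolicy() *Resource {
	return &Resource{Description: "RAM account password policy"}
}

// newProvider registers every supported resource under its Terraform name.
func newProvider() *Provider {
	return &Provider{
		ResourcesMap: map[string]*Resource{
			"alicloud_ram_account_password_policy": resourceAlicloudRamAccountPasswordPolicy(),
		},
	}
}

// lookup resolves a resource type name; unregistered names are rejected.
func (p *Provider) lookup(name string) (*Resource, error) {
	r, ok := p.ResourcesMap[name]
	if !ok {
		return nil, fmt.Errorf("provider does not support resource type %q", name)
	}
	return r, nil
}

func main() {
	p := newProvider()
	if r, err := p.lookup("alicloud_ram_account_password_policy"); err == nil {
		fmt.Println("found:", r.Description)
	}
	if _, err := p.lookup("alicloud_not_registered"); err != nil {
		fmt.Println("error:", err)
	}
}
```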

Assume role — issue #1068
When creating a Terraform configuration you specify a provider block. It can be omitted if its body is empty (Terraform will infer it from other resources). This block holds the configuration for the Provider named in its header. Its content depends on the type of Provider and is mostly used for specifying regions, authentication settings, and so on. So it was the right place to add an assume_role property.

As in every Provider repository, the source code for this block is stored in terraform-provider-alicloud/alicloud/provider.go, the same place where all resources are registered.

I decided to put the details of the assume_role implementation into a separate function, assumeRoleSchema(), which returns *schema.Schema (the same as for the properties of resources described previously). By following the same conventions everywhere, Terraform makes contributing easy for newcomers.

By adding this schema to the provider, I ensured that these properties, when passed by a user, are available in the *schema.ResourceData struct just as they are for ordinary resources.

Everything related to authentication and connection to an Alicloud account is part of the connectivity package. The provided configuration is passed to the structure config of the type *Config (which I had to extend with fields for the data passed in assume_role). This struct is placed in alicloud/connectivity/config.go. It has a private method, getAuthCredential, for obtaining credentials depending on the chosen authentication method, so it could be extended with the credentials.NewRamRoleArnWithPolicyCredential(...) SDK method.

The type *Config has a method Client() that returns a structure *AliyunClient. It basically contains the original config and some additional fields. Most importantly, it implements all the methods required for creating clients for specific services available in Alicloud, e.g. WithEcsClient(...), WithRamClient(...) (used in issue #1072) etc. These methods use the previously described helper getAuthCredential(...) to provide transparent authentication. This is how authentication works in the Alicloud Terraform Provider and how it was extended with the ability to assume a role.
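
The branching inside such a helper can be sketched as follows. This is not the provider's code: the Config fields and the Credential type are hypothetical stand-ins that only describe which kind of credential would be built, and the role ARN is a placeholder value.

```go
package main

import "fmt"

// Config is a stand-in for the provider configuration; the field names are
// hypothetical and cover only what this sketch needs.
type Config struct {
	AccessKey, SecretKey string

	// Filled from the assume_role block when the user provides one.
	RamRoleArn         string
	RamRoleSessionName string
	RamRolePolicy      string
}

// Credential is a stand-in for the SDK's credential types. It only records
// which authentication method was chosen and with which role settings.
type Credential struct {
	Kind    string
	RoleArn string
	Policy  string
}

// getAuthCredential picks the authentication method from the configuration:
// role assumption when a role ARN is set, plain access keys otherwise.
func (c *Config) getAuthCredential() Credential {
	if c.RamRoleArn != "" {
		return Credential{Kind: "ram_role_arn", RoleArn: c.RamRoleArn, Policy: c.RamRolePolicy}
	}
	return Credential{Kind: "access_key"}
}

func main() {
	plain := &Config{AccessKey: "ak", SecretKey: "sk"}
	assumed := &Config{
		AccessKey:          "ak",
		SecretKey:          "sk",
		RamRoleArn:         "acs:ram::1234567890123456:role/demo", // placeholder ARN
		RamRoleSessionName: "terraform",
	}
	fmt.Println(plain.getAuthCredential().Kind)
	fmt.Println(assumed.getAuthCredential().Kind)
}
```

Keeping the decision in one helper means every service client gets role assumption for free, without any change to the individual resources.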

Add tests

Of course, required credentials have to be provided. In the case of Alicloud, before running tests, ensure that these environment variables are set:

Setting the variable TF_ACC=1 confirms that we are aware the tests run against a real account, whose credentials are provided through environment variables.

RAM Account Password Policy — issue #1072
I added a new file terraform-provider-alicloud/alicloud/resource_alicloud_ram_account_password_policy_test.go with a test suite (marked as such by the _test suffix in the filename). This file contains the function TestAccAlicloudRamAccountPasswordPolicy(t *testing.T), whose name follows the standard Go test pattern: TestXxx(t *testing.T). In this function the Terraform acceptance test framework takes a raw string with the Terraform resource configuration for each test step.

It then executes the full Terraform workflow, creating and removing the resource, and checks that everything goes smoothly.
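
The structure of such a test can be sketched without the real framework. The step type and runAcceptance below are hypothetical stand-ins: they show a raw configuration string per step and the TF_ACC gate, but they do not apply the configuration to any account. The resource label "default" and the attribute values are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// testAccConfig is the raw Terraform configuration for a single test step.
const testAccConfig = `
resource "alicloud_ram_account_password_policy" "default" {
  minimum_password_length      = 9
  require_lowercase_characters = false
}
`

// step pairs a configuration with a check to run after it has been applied.
type step struct {
	Config string
	Check  func() error
}

// runAcceptance runs the checks only when tfAcc is non-empty and returns
// how many steps passed. A real framework would apply Config before each
// check and destroy the resource at the end; this sketch only calls Check.
func runAcceptance(tfAcc string, steps []step) (int, error) {
	if tfAcc == "" {
		return 0, nil // skipped: acceptance tests were not requested
	}
	ran := 0
	for _, s := range steps {
		if s.Config == "" {
			return ran, errors.New("step has no configuration")
		}
		if err := s.Check(); err != nil {
			return ran, err
		}
		ran++
	}
	return ran, nil
}

func main() {
	steps := []step{{Config: testAccConfig, Check: func() error { return nil }}}
	ran, err := runAcceptance(os.Getenv("TF_ACC"), steps)
	fmt.Println("steps run:", ran, "error:", err)
}
```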

Assume role — issue #1068
Unfortunately, the code behind the provider block is not strictly covered by tests. This is not specific to the Alicloud Provider; the AWS Provider does the same. Maybe it is skipped because the provider block is exercised under the hood every time any resource is tested. On the other hand, it is quite strange that it does not have a separate test suite. So I did not provide any acceptance tests for this functionality and only performed some manual tests, and the change was accepted.

Add docs

RAM Account Password Policy — issue #1072
First I created a new file terraform-provider-alicloud/website/docs/r/ram_account_password_policy.html.markdown. This page, written in Markdown, is transferred one to one to the official documentation after a release. Secondly, the new resource also has to be added to the table of contents, under the RAM section, as a reference to the previously added file. The result can be seen on the official website.

Assume role — issue #1068
This feature is configured inside the provider block of a Terraform configuration. I only added a section about assume_role to the existing documentation in the file terraform-provider-alicloud/website/docs/index.html.markdown. Nothing more is required.

Create pull requests and get them merged to master


These pull requests have been merged to master and included in the official release of Terraform Alicloud Provider 1.46 🎉


At Nordcloud we highly value an engineering approach. We do not settle for what is given; instead we try to shape reality. When you believe in “hold my beer, not my horses”, something like the above is more of a challenge than a problem. Check out our GitHub, too. We are always looking for talented people. If you enjoyed reading this post and would like to work on public cloud projects and shape reality with cutting edge technologies on a daily basis, check out our open positions here.
