
A case study of contributing to a Terraform Provider
Have you ever worked with Terraform and found that some resources were missing? We faced exactly this situation while working with Alibaba Cloud for one of our customers. This is the story of contributing to the official Terraform Provider for Alicloud.
Advantages of Alicloud

Alibaba Cloud (or Alicloud) is gaining traction, especially when you need to bring a cloud native workflow to the Chinese region. More and more enterprises are interested in it. Beyond the Chinese market, Alicloud has data centers all over the world. In contrast to AWS and Azure, whose Chinese regions seem a bit forgotten, Alicloud treats its Chinese regions as first-class citizens: every new feature appears in China first. What is more, sending data between Chinese regions and the rest of the world is smooth. Because its naming conventions resemble those of AWS, Alicloud has a gentle learning curve for everyone who already knows AWS. Elastic Compute Service sounds pretty familiar, right?
IaC in Alicloud

IaC (Infrastructure as Code) makes a solution repeatable, scalable and maintainable, and eliminates the toil of manual work. For Alicloud, the officially recommended way of doing IaC is the popular open source tool Terraform. Terraform is a multiplatform tool: we can use the same syntax and workflow for Alicloud as for every other platform supported by Terraform, e.g. AWS, Azure, GCP. Only the particular Resources differ. This is achieved by Terraform Providers, which are specific to each platform and independent of each other (official list here). They take full responsibility for delivering features, so Terraform's coverage of a platform's features is only as good as the Provider for that platform. Unfortunately, you may sometimes stumble on missing features. Working with Alicloud and Terraform is generally a smooth experience, but…
What to do if the resource is missing?
For us, two features important for successfully introducing Alicloud in enterprise environments were missing. First, the Alicloud Provider lacked a way of assuming a role passed in a provider block (the place where the configuration of the whole Provider is set), which is very useful for multi-account setups. This ability already existed in the Terraform Provider for AWS (the assume_role block) and was doable in Alicloud via the GUI, but it was not supported by the Provider. As a first step toward changing that, I checked whether someone had already reported a similar issue or question. I could not find anything, so I added issue #1068 to their GitHub.
The second missing resource was an enforced password policy for all users in an account. I wanted a resource like alicloud_ram_account_password_policy for configuring it. It was the same case: perfectly feasible with the GUI, but not supported in Terraform. Thus a second issue, #1072, was added. When you add an issue, maybe you can at least try to resolve it...
Let’s do the job
…sometimes the easiest and most obvious way is to prepare some kind of workaround. Workarounds tend to be hacky, unmaintainable and often messy. Furthermore, there is no guarantee that a workaround will still work after a change in Alicloud. So the best solution was to contribute to the official Terraform Provider for Alicloud and resolve the issues on our own, for the benefit of us and the whole community. Once solutions are merged to the master branch, the engineers accountable for the Terraform Provider for Alicloud take full responsibility for maintaining them. It is no longer a problem for users only.
This article assumes that the reader knows the basics of Terraform and Go (the programming language in which Terraform is written).
I refer only to the commits which I made. Of course, someone can later improve my solution with a better implementation, an SDK update, a refactoring, etc. This kind of collaboration is the main benefit and true beauty of open source software.
How does Terraform work?
This section contains only a short description of some key concepts of the Terraform architecture. If you want to learn more about Terraform internals, how to contribute, or even how to write custom Providers, the best place to start is the chapter How Terraform Works from the official documentation.
Terraform has a plugin-based architecture. There are two types of plugins: Provisioner Plugins (responsible for executing scripts after the creation or on the destruction of a resource, not covered by this article) and Provider Plugins (covered in detail below). Developers can easily extend Terraform by writing new plugins or compiling modified versions of existing plugins. Providers that are part of the Terraform Provider Development Program are officially approved and tested by HashiCorp.
Terraform architecture:
Terraform Core:
- main binary which provides the CLI command terraform
- reading and interpolating configuration files and modules
- construction of the Resource Graph and plan execution
- using remote procedure calls (RPC) for communication with Terraform Plugins
Terraform Provider Plugins:
- invoked by Terraform Core over RPC
- all Providers used in Terraform configurations are plugins
- authenticate with the Infrastructure Provider and define Resources that map to specific Services by wrapping a Go SDK for a specific cloud provider
Plugins are discovered by Terraform in a strictly defined order:
- . (the current directory): very useful during plugin development
- the place where the terraform binary is located, e.g. /usr/local/bin: for airgapped installations
- terraform.d/plugins/<OS>_<ARCH>: sometimes used in Terraform Enterprise
- .terraform/plugins/<OS>_<ARCH>: probably the most popular way, automatically downloaded Providers
- ~/.terraform.d/plugins or %APPDATA%\terraform.d\plugins: user plugins directory, for when you want to cache a specific version of a Provider locally
- ~/.terraform.d/plugins/<OS>_<ARCH> or %APPDATA%\terraform.d\plugins\<OS>_<ARCH>: user plugins directory, with explicit OS and architecture
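The discovery order can be sketched as a function that lists the candidate directories from first to last. This is only an illustration of the list above, not Terraform's own lookup code: the function name pluginSearchPaths and its parameters are assumptions, and the %APPDATA% variants used on Windows are left out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// pluginSearchPaths returns the directories from the discovery order above,
// highest priority first. binDir is the directory of the terraform binary and
// home is the user's home directory; both are supplied by the caller.
// Hypothetical helper: only Unix-style locations are modelled.
func pluginSearchPaths(binDir, home, goos, goarch string) []string {
	osArch := goos + "_" + goarch
	return []string{
		".",    // current directory, handy during plugin development
		binDir, // next to the terraform binary
		filepath.Join("terraform.d", "plugins", osArch),
		filepath.Join(".terraform", "plugins", osArch), // automatically downloaded Providers
		filepath.Join(home, ".terraform.d", "plugins"),
		filepath.Join(home, ".terraform.d", "plugins", osArch),
	}
}

func main() {
	paths := pluginSearchPaths("/usr/local/bin", "/home/user", runtime.GOOS, runtime.GOARCH)
	for i, p := range paths {
		fmt.Printf("%d. %s\n", i+1, p)
	}
}
```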
Repository structure (Alicloud Provider as an example):
Every Provider follows the same structure of the repository, coding standards and conventions. Only some specific details are different. After you have learned how to contribute to one Provider you will find it easy to contribute to others.
Directory alicloud contains files in the naming convention:
- TYPE_NAME.go contains the implementation
- TYPE_NAME_test.go contains acceptance tests
Where TYPE can be:
- data_source
- import
- resource
- service
and NAME is e.g.:
- alicloud_account
- alicloud_ram_roles
- etc.
Anatomy of a resource:
To add a new resource, we need to implement a function that returns *schema.Resource. It contains a definition of all fields of the resource (e.g. name) and the implementation of four supported actions:
- Create
- Read
- Update
- Delete
A new resource also has to be registered in the function Provider(), which returns terraform.ResourceProvider. This is a kind of entry point for the whole Terraform Provider. More about this is described in a later example.
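The shape of such a resource can be illustrated with a small self-contained sketch. The types Schema, ResourceData and Resource below are simplified stand-ins written for this example, not the real SDK types (which carry many more fields), and resourceExample is a hypothetical resource with a single name field.

```go
package main

import "fmt"

// Schema is a simplified stand-in for the SDK's description of one field.
type Schema struct {
	Type     string
	Optional bool
	Default  interface{}
}

// ResourceData is a simplified stand-in for the structure through which the
// four actions read configuration and write state.
type ResourceData struct{ id string }

func (d *ResourceData) SetId(id string) { d.id = id }
func (d *ResourceData) Id() string      { return d.id }

// Resource bundles the field definitions with the four supported actions.
type Resource struct {
	Schema map[string]*Schema
	Create func(d *ResourceData, meta interface{}) error
	Read   func(d *ResourceData, meta interface{}) error
	Update func(d *ResourceData, meta interface{}) error
	Delete func(d *ResourceData, meta interface{}) error
}

// resourceExample is a hypothetical resource used only for illustration.
func resourceExample() *Resource {
	noop := func(d *ResourceData, meta interface{}) error { return nil }
	return &Resource{
		Schema: map[string]*Schema{
			"name": {Type: "string", Optional: true, Default: "example"},
		},
		Create: func(d *ResourceData, meta interface{}) error {
			d.SetId("example-id") // a non-empty ID marks the resource as existing
			return nil
		},
		Read:   noop,
		Update: noop,
		Delete: func(d *ResourceData, meta interface{}) error {
			d.SetId("") // an empty ID marks the resource as gone
			return nil
		},
	}
}

func main() {
	r := resourceExample()
	d := &ResourceData{}
	if err := r.Create(d, nil); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Println("created with ID:", d.Id())
}
```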
To maintain information about managed infrastructure, configuration and metadata, Terraform introduces the concept of State. By default it is stored in a local file named terraform.tfstate, but it can also be stored remotely in a Backend. Everything related to the resources configured with Terraform is saved in it.
Check if it is doable
We already know that Terraform is basically a state machine that wraps the Go SDK of a specific cloud provider to offer a convenient way of configuring infrastructure with code. So whatever we want to implement should already be supported by the Go SDK. In Terraform, direct API calls are considered bad practice; they should instead be abstracted and encapsulated by the SDK. The official Go SDK for Alicloud can be found here. For finding supported calls, Alicloud provides the convenient OpenAPI Explorer, which offers interactive API documentation with examples for SDKs in various languages.
RAM Account Password Policy — issue #1072
Setting a password policy belongs to the Resource Access Management (RAM) service in Alicloud. This service is similar to AWS Identity and Access Management (IAM) and implements basically the same logic. In the OpenAPI Explorer I found the two calls required to implement this resource, SetPasswordPolicy and GetPasswordPolicy, with ready-to-use boilerplate code in Go.
Assume role — issue #1068
Providing the ability to assume a role is more challenging, because it does not add a completely new resource but rather extends the authentication method of the whole Provider. First, I examined the available ways of authentication in the SDK docs. The Go SDK supports the RamRoleArn authentication method. As you can see, the documentation is laconic. I decided to read the source, which showed that there is an even better method, NewRamRoleArnWithPolicyCredential. Besides authentication, it also offers the ability to constrain privileges with an additional policy.
Provide implementation
For manual testing during development, it is useful to rely on the plugin discovery mechanism: compile the Provider binary into a folder that contains a Terraform configuration prepared for testing the newly developed resource. Of course, Providers often have a Makefile, which can be used to automate tasks related to building, linting, and testing. In this article, I decided to use bare commands to ensure that the presented approach works for every Terraform Provider.
Now, inside the test directory, we can invoke Terraform commands for real-world testing.
Every change in the code followed by recompilation of the Provider requires invoking terraform init, because Terraform checks the version of the binary by calculating its hash and storing it in the file test/.terraform/plugins/<OS>_<ARCH>/lock.json.
Setting the variable TF_LOG=DEBUG is also useful for debugging, as it prints detailed information about an executed command. There are several levels available: TRACE, DEBUG, INFO, WARN or ERROR (if you want to read more, click here). Use it also when something is not working as expected, to provide more information about the encountered bug.
This kind of ad hoc print debugging can be convenient, because the plugin architecture of Terraform makes it hard to attach a standard Go debugger to the running Provider. Detailed output can also be used while running acceptance tests.
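A minimal sketch of such print debugging, assuming the common convention in Provider code of prefixing each log line with its level in square brackets so that the TF_LOG setting can filter it. The helper debugLine is hypothetical and exists only for this example; real Provider code typically calls the standard log package directly.

```go
package main

import (
	"fmt"
	"log"
)

// debugLine formats a message with a "[DEBUG]" level prefix.
// Hypothetical helper for illustration only.
func debugLine(format string, args ...interface{}) string {
	return fmt.Sprintf("[DEBUG] "+format, args...)
}

func main() {
	// The standard logger writes to stderr, which keeps debug output
	// separate from the regular output of the program.
	log.Print(debugLine("setting password policy, minimum length: %d", 12))
}
```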
RAM Account Password Policy — issue #1072
To create the new alicloud_ram_account_password_policy resource, I added a new file, terraform-provider-alicloud/alicloud/resource_alicloud_ram_account_password_policy.go, to the repository and defined the resource in it. Everything related to resources is placed in the common alicloud package.
Every parameter available in the API call for setting the account password policy is represented by a parameter in the resource schema, e.g. minimum_password_length, require_lowercase_characters, etc. Type checking and other validation are provided by Terraform out of the box. Every parameter has a default value (the same as specified in the API docs) set by the field Default. RAM Account Password Policy is not a typical resource: you cannot create or delete it in an Alicloud account, you can only modify it. To ensure consistent behavior, whatever policy was configured before it came under Terraform management is lost. Naturally, I mentioned this in the docs. When the resource is destroyed, the settings simply return to their default values. This property of the RAM Account Password Policy implies that the Create operation is the same as Update, so both can be provided by one function, resourceAlicloudRamAccountPasswordPolicyUpdate(). Of course, for the Read and Delete operations I needed resourceAlicloudRamAccountPasswordPolicyRead() and resourceAlicloudRamAccountPasswordPolicyDelete() respectively.
The function resourceAlicloudRamAccountPasswordPolicyUpdate() takes parameters from the schema, passes them to the SDK call, and saves the Terraform state with the fixed ID "ram-account-password-policy" (not a random one, because there is only one password policy per Alicloud account). The structure *schema.ResourceData provides methods for interacting with the Terraform State.
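The flow of that function can be sketched without the real SDK. Everything below is a stand-in written for this example: ResourceData imitates only the Get and SetId methods, policyAPI and fakeRAM are hypothetical replacements for the RAM client, and PasswordPolicy carries just the two parameters named above. It is not the code from the pull request.

```go
package main

import "fmt"

// ResourceData is a simplified stand-in for *schema.ResourceData.
type ResourceData struct {
	id     string
	values map[string]interface{}
}

func (d *ResourceData) Get(key string) interface{} { return d.values[key] }
func (d *ResourceData) SetId(id string)            { d.id = id }
func (d *ResourceData) Id() string                 { return d.id }

// PasswordPolicy holds two example settings; the real API has more.
type PasswordPolicy struct {
	MinimumPasswordLength      int
	RequireLowercaseCharacters bool
}

// policyAPI is a hypothetical interface standing in for the RAM client.
type policyAPI interface {
	SetPasswordPolicy(p PasswordPolicy) error
}

// fakeRAM records the last policy it was given instead of calling Alicloud.
type fakeRAM struct{ current PasswordPolicy }

func (f *fakeRAM) SetPasswordPolicy(p PasswordPolicy) error {
	f.current = p
	return nil
}

// policyID is fixed because an account has exactly one password policy.
const policyID = "ram-account-password-policy"

// updatePasswordPolicy serves as both Create and Update in this sketch.
// It expects both keys to be present with the right types, which the schema
// defaults would guarantee in a real Provider.
func updatePasswordPolicy(d *ResourceData, client policyAPI) error {
	policy := PasswordPolicy{
		MinimumPasswordLength:      d.Get("minimum_password_length").(int),
		RequireLowercaseCharacters: d.Get("require_lowercase_characters").(bool),
	}
	if err := client.SetPasswordPolicy(policy); err != nil {
		return fmt.Errorf("setting password policy: %w", err)
	}
	d.SetId(policyID)
	return nil
}

func main() {
	d := &ResourceData{values: map[string]interface{}{
		"minimum_password_length":      14,
		"require_lowercase_characters": true,
	}}
	ram := &fakeRAM{}
	if err := updatePasswordPolicy(d, ram); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	fmt.Println("state ID:", d.Id())
}
```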
The operation resourceAlicloudRamAccountPasswordPolicyRead() calls the API to get all information about the account password policy and saves it in the Terraform state. This workflow is typical for many Terraform resources. The delete function is the exception here: instead of removing a resource (which in the typical case is supported directly by the SDK), it sets the default values (as mentioned above, the account password policy does not support deletion, only modification).
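Read and the unusual Delete can be sketched in the same spirit. The types state and fakeRAM are stand-ins invented for this example, and the values in defaultPolicy are placeholders rather than the real defaults from the API docs.

```go
package main

import "fmt"

// PasswordPolicy holds two example settings; the real API has more.
type PasswordPolicy struct {
	MinimumPasswordLength      int
	RequireLowercaseCharacters bool
}

// defaultPolicy uses placeholder values, NOT the real Alicloud defaults.
var defaultPolicy = PasswordPolicy{MinimumPasswordLength: 8, RequireLowercaseCharacters: false}

// fakeRAM is a hypothetical in-memory replacement for the RAM client.
type fakeRAM struct{ current PasswordPolicy }

func (f *fakeRAM) GetPasswordPolicy() (PasswordPolicy, error) { return f.current, nil }
func (f *fakeRAM) SetPasswordPolicy(p PasswordPolicy) error {
	f.current = p
	return nil
}

// state is a stand-in for the Terraform state of a single resource.
type state struct {
	ID     string
	Values map[string]interface{}
}

// readPasswordPolicy copies what the API reports into the state.
func readPasswordPolicy(s *state, ram *fakeRAM) error {
	p, err := ram.GetPasswordPolicy()
	if err != nil {
		return fmt.Errorf("getting password policy: %w", err)
	}
	if s.Values == nil {
		s.Values = map[string]interface{}{}
	}
	s.Values["minimum_password_length"] = p.MinimumPasswordLength
	s.Values["require_lowercase_characters"] = p.RequireLowercaseCharacters
	return nil
}

// deletePasswordPolicy cannot remove the policy, so it resets it to defaults.
// Clearing the ID models the resource disappearing from the state.
func deletePasswordPolicy(s *state, ram *fakeRAM) error {
	if err := ram.SetPasswordPolicy(defaultPolicy); err != nil {
		return fmt.Errorf("resetting password policy: %w", err)
	}
	s.ID = ""
	return nil
}

func main() {
	ram := &fakeRAM{current: PasswordPolicy{MinimumPasswordLength: 14, RequireLowercaseCharacters: true}}
	s := &state{ID: "ram-account-password-policy"}
	if err := readPasswordPolicy(s, ram); err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println("state after read:", s.Values)
	if err := deletePasswordPolicy(s, ram); err != nil {
		fmt.Println("delete failed:", err)
		return
	}
	fmt.Println("policy after delete:", ram.current)
}
```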
In the file terraform-provider-alicloud/alicloud/provider.go I had to register this new resource to be available in Terraform.
As I mentioned earlier, the function Provider() is the entry point of every Provider. Its ResourcesMap field requires a map[string]*schema.Resource with the names of all resources as keys and the results of the corresponding functions as values (DataSourcesMap plays the same role for data sources). Every existing resource has to be included there to be usable.
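Registration boils down to adding one entry to a map. In the sketch below, Resource and providerStub are stand-ins for the SDK types (the SDK keeps resources in a ResourcesMap field and data sources in DataSourcesMap), and the constructor name resourceAlicloudRamAccountPasswordPolicy is an assumption based on the naming of the functions mentioned earlier.

```go
package main

import (
	"fmt"
	"sort"
)

// Resource is a stand-in for *schema.Resource.
type Resource struct{ Description string }

// providerStub is a stand-in for the structure returned by Provider().
type providerStub struct {
	ResourcesMap map[string]*Resource
}

// resourceAlicloudRamAccountPasswordPolicy is an assumed constructor name.
func resourceAlicloudRamAccountPasswordPolicy() *Resource {
	return &Resource{Description: "account-wide RAM password policy"}
}

// Provider registers every resource under the name used in configurations.
func Provider() *providerStub {
	return &providerStub{
		ResourcesMap: map[string]*Resource{
			"alicloud_ram_account_password_policy": resourceAlicloudRamAccountPasswordPolicy(),
		},
	}
}

func main() {
	p := Provider()
	names := make([]string, 0, len(p.ResourcesMap))
	for name := range p.ResourcesMap {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println("registered resources:", names)
}
```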
Assume role — issue #1068
When creating a Terraform configuration, you specify a provider block. It can be omitted if its body is empty (Terraform will infer it from other resources). This block holds the configuration for the Provider named in its header. Its content depends on the Provider and is mostly used for specifying regions, authentication settings, and so on. So it was the right place to add the assume_role property.
The source code for this block is stored in terraform-provider-alicloud/alicloud/provider.go, the same place where all resources are registered.
I decided to put the details of the assume_role implementation into a separate function, assumeRoleSchema(), which returns *schema.Schema (the same type as for the resource properties described previously). By following the same conventions everywhere, Terraform makes contributing easy for newcomers.
By adding this schema, I ensured that these properties, when passed by a user, are available in the *schema.ResourceData struct, the same as for ordinary resources.
Everything related to authentication and connection to an Alicloud account is part of the connectivity package. The provided configuration is passed to the structure config of the type *Config (I had to extend it with fields that store the data passed in assume_role). This struct is defined in alicloud/connectivity/config.go. It has a private method getAuthCredential for obtaining credentials depending on the chosen authentication method. Thus it could be extended with the credentials.NewRamRoleArnWithPolicyCredential(...) SDK method from the docs.
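The idea of choosing a credential by authentication method can be sketched as follows. The field names on Config, the credential type and its Kind labels are all hypothetical and exist only for this example; the real method returns SDK credential objects.

```go
package main

import "fmt"

// Config is a stand-in for the Provider's connectivity configuration.
// All field names are hypothetical.
type Config struct {
	AccessKey     string
	SecretKey     string
	RamRoleArn    string // set when an assume_role block is configured
	RamRolePolicy string // optional policy that narrows the role's privileges
}

// credential is a stand-in for the SDK credential types.
type credential struct {
	Kind    string
	RoleArn string
	Policy  string
}

// authCredential picks the credential depending on the chosen method:
// role assumption when a role ARN is configured, plain access keys otherwise.
func (c *Config) authCredential() credential {
	if c.RamRoleArn != "" {
		return credential{
			Kind:    "ram_role_arn_with_policy",
			RoleArn: c.RamRoleArn,
			Policy:  c.RamRolePolicy,
		}
	}
	return credential{Kind: "access_key"}
}

func main() {
	plain := &Config{AccessKey: "example-key", SecretKey: "example-secret"}
	assumed := &Config{
		AccessKey:  "example-key",
		SecretKey:  "example-secret",
		RamRoleArn: "example-role-arn",
	}
	fmt.Println("without assume_role:", plain.authCredential().Kind)
	fmt.Println("with assume_role:", assumed.authCredential().Kind)
}
```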
The type *Config has a method Client() that returns a structure *AliyunClient. It basically contains the original config and some additional fields. Most importantly, it implements all methods required for creating clients for specific services available in Alicloud, e.g. WithEcsClient(...), WithRamClient(...) (used in issue #1072), etc. These methods use the previously described helper getAuthCredential(...) to provide transparent authentication. This is how authentication works in the Alicloud Terraform Provider and how it was extended with the ability to assume a role.
Add tests
Acceptance tests are required for every resource. Terraform tries to deliver a safe and predictable way of doing IaC. Thus every test uses a real Terraform configuration to create true infrastructure in the cloud and verify its existence and configuration. After the test, everything is torn down. Terraform provides a special framework for this, which uses the standard Go testing package under the hood. Running this kind of test costs some money and takes a lot of time. Fortunately, we do not have to run all test suites; we can run only those for a specific resource.
Of course, the required credentials have to be provided. In the case of Alicloud, ensure that the environment variables holding them are set before running tests.
Setting the variable TF_ACC=1 confirms that we are aware the tests run on a real account, with the credentials provided by environment variables.
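A pre-flight check for such variables can be sketched in a few lines. The helper missingEnv is hypothetical, and apart from TF_ACC the variable names below are placeholders, not the names the Alicloud Provider actually reads.

```go
package main

import (
	"fmt"
	"os"
)

// missingEnv returns the names from required that are unset or empty.
// getenv is injected so the check can run against a fake environment.
func missingEnv(getenv func(string) string, required ...string) []string {
	var missing []string
	for _, name := range required {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// TF_ACC comes from the text above; the other two names are placeholders.
	missing := missingEnv(os.Getenv, "TF_ACC", "EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY")
	if len(missing) > 0 {
		fmt.Println("skipping acceptance tests, unset variables:", missing)
		return
	}
	fmt.Println("acceptance tests may run against the real account")
}
```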
RAM Account Password Policy — issue #1072
I added a new file, terraform-provider-alicloud/alicloud/resource_alicloud_ram_account_password_policy_test.go, with the test suite (indicated by the _test suffix in the filename).
This file contains the function TestAccAlicloudRamAccountPasswordPolicy(t *testing.T), whose name follows the standard Go test pattern, TestXxx(t *testing.T).
In this function, the Terraform acceptance test framework takes a raw string with the Terraform resource configuration for each test step. It then executes the full Terraform workflow, creating and removing the resource, and checks that everything goes smoothly.
Assume role — issue #1068
Unfortunately, things related to the provider block are not strictly covered by tests. This is not specific to the Alicloud Provider; AWS does the same. Maybe they skip it because the provider block is used under the hood every time any resource is tested. On the other hand, it is quite strange that it does not have a separate test suite. So I did not provide any acceptance tests for this functionality; I only performed some manual tests, and the change was accepted.
Add docs
A new feature is useless when it is not documented, because probably no one will start using it. Terraform Providers also have a convention for adding docs, quite similar to adding a new resource. Let’s learn about it from the examples below.
RAM Account Password Policy — issue #1072
First, I created a new file, terraform-provider-alicloud/website/docs/r/ram_account_password_policy.html.markdown. This Markdown page is transferred one-to-one to the official documentation after a release. Second, the new resource also has to be added to the table of contents in the file terraform-provider-alicloud/website/alicloud.erb, under the RAM section, as a reference to the previously added file. As a result, it looks like this on the official website.
Assume role — issue #1068
This feature is configured inside the provider block of a Terraform configuration.
I only added a section about assume_role to the existing documentation in the file terraform-provider-alicloud/website/docs/index.html.markdown.
Nothing more is required.
Create pull requests and get them merged to master
When everything was finished, I created two pull requests:
- PR #1212 for RAM Account Password Policy — issue #1072
- PR #1217 for assume role — issue #1068

These pull requests have been merged to master and included in the official release of Terraform Alicloud Provider 1.46 🎉
Summary
Open source software gives us real power. We can improve and adjust existing solutions for the benefit of us and the whole community. We should not be afraid of submitting pull requests, because almost everybody is happy when you try to support them. In most projects, maintainers are willing to help and to give you feedback and additional guidance. I think we should support the culture of hacking, inclusiveness, and knowledge sharing in the IT industry. This attitude makes us special as a community and is probably one of the key factors behind the fast development of technology that we can observe now. So let’s contribute as often as we can.
At Nordcloud we highly value an engineering approach. We do not settle for what is given; instead we try to shape reality. When you believe in “hold my beer, not my horses”, something like the above is more of a challenge than a problem. Check out our GitHub as well. We are always looking for talented people. If you enjoyed reading this post and would like to work on public cloud projects and shape reality with cutting-edge technologies on a daily basis, check out our open positions here.