Developing a public Terraform provider — Part 2: provider structure

The internals of a Terraform Provider

Published in IBM Cloud Infrastructure as Code, Jan 23, 2019

This is the second blog in a series looking at the mechanics of developing a Terraform provider for public use. See the first post for an introduction to this series.

A Terraform provider is much more than just a binary executable. A public provider should have everything a user requires to be self-sufficient and avoid the need to go back to the developer. It also has to be reliable and as bug free as practical.

Once you have worked through the exercise of writing a custom provider and moved on to an MVP, it can be a surprise how much more effort is necessary for a public release. I estimate it took at least four times the original MVP effort, and examples to guide you are sparse.

Beyond the Go source code for each Terraform resource, there are usage examples, HTML documentation, test cases and client API libraries. Look at any of the providers on GitHub for examples of provider development and use them as a template for your own provider. All follow the structure set out here.

Reliability and correct operation are major concerns for users and central to the overall usability of the provider and the cloud service. In future articles I will cover how the Terraform test framework can be used to demonstrate the reliability of the provider.

In this post I dive into the content of a Terraform provider and cover some of the implementation aspects that I learnt along the way. To visualise the structure and components of a provider I have broken out the main directories and files below.

Evident in this figure is the split between the provider code and the client API library. A well tested and robust client library is essential to a reliable provider. In my experience, development of a well tested client API package comes first, or at least a fully implemented initial set of API functions. I found that developing the client library drives many of the provider design decisions: the supported resource attributes, the approach to error handling, and details such as the Terraform unique ID for a resource.

The naming convention for Terraform plugins is defined in the documentation: terraform-<TYPE>-<NAME>. For a provider it is terraform-provider-<NAME>. By convention the <NAME> is the cloud service provider if the provider supports multiple resources. The <NAME> is used in a number of places in the provider to link provider components.
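As a small illustration of the convention, the sketch below builds a plugin binary name and recovers the <NAME> part from a provider binary name. The helper names pluginBinaryName and providerName are hypothetical and are not part of Terraform; any version suffix on a real binary name is ignored here.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginBinaryName builds a name following the terraform-<TYPE>-<NAME>
// convention. Hypothetical helper, for illustration only.
func pluginBinaryName(pluginType, name string) string {
	return fmt.Sprintf("terraform-%s-%s", pluginType, name)
}

// providerName extracts <NAME> from a terraform-provider-<NAME> binary name.
// The second result is false when the name does not follow the convention.
func providerName(binary string) (string, bool) {
	const prefix = "terraform-provider-"
	if !strings.HasPrefix(binary, prefix) || len(binary) == len(prefix) {
		return "", false
	}
	return strings.TrimPrefix(binary, prefix), true
}

func main() {
	fmt.Println(pluginBinaryName("provider", "ibm"))

	name, ok := providerName("terraform-provider-ibm")
	fmt.Println(name, ok)
}
```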

In the following sections I will explore the major directories and files.

examples

This folder contains working Terraform configuration examples that illustrate the usage of the provider resources. Sub-folders contain examples for each resource type. Typically these are examples of the provider resources by themselves, though broader examples of usage with other resources are beneficial. The IBM Cloud provider includes both, with an example showing how the Cloud Internet Services (CIS) global load balancer and pool resources are used to create a resilient multi-region website.

<NAME>

By convention this folder is named after the service provider; in the IBM case it is ‘ibm’. As illustrated, it contains the Go source for each resource and data_source supported by the provider. On my first pass at provider development I omitted the corresponding test files for each resource and data_source. These follow the resource naming convention with the addition of _test.

Testing of the provider is essential to ensure the integrity of the provisioned resources. During development I have seen provider crashes delete entire resource entries in the state file. The provider documentation includes an introductory section on testing. Provider testing is a major topic in itself that I will cover in a later article.

Provider.go

The provider.go file is the root of the provider. It defines the provider inputs and maps the supported Terraform resources to their corresponding Go implementations. An overview of provider.go can be found in the Terraform writing a custom provider documentation.

Provider.go defines four components:

Provider Schema: the provider configuration inputs (API keys, defaults, timeouts).
DataSourcesMap: all of the data_sources supported by the provider, mapped to their corresponding Go functions.
ResourcesMap: all of the resources supported by the provider, mapped to their corresponding Go functions.
ConfigureFunc: the provider's initialisation function, here providerConfigure, which Terraform calls when it determines that the provider is needed to deploy a configuration.

The providerConfigure function initialises the client session data structures for the APIs and, for IBM Cloud, performs IAM security authentication with the user's API key. It returns the config data structure containing the initialised API sessions and the IAM OAuth2 session token used for client API authentication.

The config data structure is persisted for the duration of the Terraform command execution and passed to the provider plugin whenever it is called for a supported resource. For the IBM provider, the initial IAM authentication using the API key and the session setup are therefore performed only once per command execution (for example apply or destroy). Cloud API calls are then made using a transient IAM session token.
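The configure-once behaviour can be sketched with the standard library alone. This is a simplified model under stated assumptions, not the IBM provider's code: clientSession, configure, createResource and authCalls are hypothetical names standing in for the config structure, providerConfigure and a resource function. It shows the session being created once, handed around as an opaque interface{} value, and asserted back to its concrete type by each resource call.

```go
package main

import "fmt"

// clientSession stands in for the provider's config data structure.
// All names in this sketch are illustrative assumptions.
type clientSession struct {
	token string // transient session token obtained at configure time
}

// authCalls counts how often authentication runs.
var authCalls int

// configure plays the role of providerConfigure: it authenticates once with
// the API key and returns the session as an opaque value.
func configure(apiKey string) (interface{}, error) {
	if apiKey == "" {
		return nil, fmt.Errorf("api key must not be empty")
	}
	authCalls++
	return &clientSession{token: "token-for-" + apiKey}, nil
}

// createResource plays the role of a resource function: it receives the
// opaque value and asserts it back to the session type before using it.
func createResource(name string, meta interface{}) (string, error) {
	sess, ok := meta.(*clientSession)
	if !ok {
		return "", fmt.Errorf("unexpected meta type %T", meta)
	}
	return fmt.Sprintf("%s created with %s", name, sess.token), nil
}

func main() {
	meta, err := configure("abc")
	if err != nil {
		fmt.Println("configure failed:", err)
		return
	}
	for _, name := range []string{"pool", "glb"} {
		msg, err := createResource(name, meta)
		if err != nil {
			fmt.Println("create failed:", err)
			return
		}
		fmt.Println(msg)
	}
	fmt.Println("authentications:", authCalls)
}
```

However many resources are handled in one run, the authentication counter in this model only increases once.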

The simplified code example below shows the IBM Provider() definition, including the CIS resources. A point to note is that all files in the provider belong to the package <NAME>, in this case package ibm.

package ibm

import (
    "github.com/hashicorp/terraform/helper/schema"
    "github.com/hashicorp/terraform/terraform"
)

// Provider returns the provider schema plus the data source and resource maps.
func Provider() terraform.ResourceProvider {
    return &schema.Provider{
        Schema: map[string]*schema.Schema{
            "bluemix_api_key": {
                Type:        schema.TypeString,
                Optional:    true,
                Description: "The Bluemix API Key",
                DefaultFunc: schema.MultiEnvDefaultFunc([]string{"BM_API_KEY", "BLUEMIX_API_KEY"}, ""),
            },
            ...
        },
        DataSourcesMap: map[string]*schema.Resource{
            "ibm_cis":              dataSourceIBMCISInstance(),
            "ibm_cis_domain":       dataSourceIBMCISDomain(),
            "ibm_cis_ip_addresses": dataSourceIBMCISIP(),
            ...
        },
        ResourcesMap: map[string]*schema.Resource{
            "ibm_cis":                      resourceIBMCISInstance(),
            "ibm_cis_domain":               resourceIBMCISDomain(),
            "ibm_cis_domain_settings":      resourceIBMCISSettings(),
            "ibm_cis_healthcheck":          resourceIBMCISHealthCheck(),
            "ibm_cis_origin_pool":          resourceIBMCISPool(),
            "ibm_cis_global_load_balancer": resourceIBMCISGlb(),
            ...
        },
        ConfigureFunc: providerConfigure,
    }
}

// providerConfigure builds the config and returns the initialised client session.
func providerConfigure(d *schema.ResourceData) (interface{}, error) {
    ...
    return config.ClientSession()
}

Config.go

This file contains the client session data structures and initialisation code for all of the cloud APIs used by the provider. For each supported API, session initialisation is performed by a call to the relevant client API library. For the IBM Cloud provider, the bluemix-go library is imported and includes packages for IBM Cloud Internet Services, authentication, client sessions, etc. The bluemix-go function cisv1.New performs the session configuration for the CIS API service and returns a cisv1.CisServiceAPI. I found that other cloud providers on GitHub follow a similar approach to configuring client sessions.

These definitions are the ones added for the CIS service. Additional services require similar entries.

ClientSession defines the function calls for all supported cloud services, to retrieve the session parameters for the client API being invoked. The service specific API setup function is called from the resource function implementation (resource_ibm_cis.go).

The struct clientSession holds the initialised session configurations for each supported API for the duration of the Terraform command execution.

package ibm

import (
    "fmt"

    "github.com/IBM-Cloud/bluemix-go/api/cis/cisv1"
)

// ClientSession exposes one accessor per supported cloud service API.
type ClientSession interface {
    CisAPI() (cisv1.CisServiceAPI, error)
}

// clientSession holds each initialised API session and any setup error.
type clientSession struct {
    session       *Session
    cisConfigErr  error
    cisServiceAPI cisv1.CisServiceAPI
}

// CisAPI provides Cloud Internet Services APIs ...
func (sess clientSession) CisAPI() (cisv1.CisServiceAPI, error) {
    return sess.cisServiceAPI, sess.cisConfigErr
}

// Excerpt from the session initialisation code (not a complete function).
cisAPI, err := cisv1.New(sess.BluemixSession)
if err != nil {
    session.cisConfigErr = fmt.Errorf("Error occurred while configuring Cloud Internet Services: %q", err)
}
session.cisServiceAPI = cisAPI

vendor

New to me was Go’s approach to dependency management of client libraries. ‘Vendoring’ means tracking your dependencies and their versions and including those dependencies as part of your project. It may increase the size of the package but makes downloading and compiling the package a whole lot easier.

The vendor folder contains all the packages required to compile the provider. Alongside the HashiCorp Terraform helper packages there could be tens of vendor packages, as is the case for the IBM provider. The client API library is vendored into this folder.

Client API library

The Terraform documentation leaves the whole topic of developing and testing client API libraries to the developer.

The starting point for provider development is the client API library and it should be separate from the main provider. Terraform best practice is for the client code to be maintained in its own package and to be vendored into the provider package.

In my enthusiasm to get the MVP working, a mistake I made was to include the client code for accessing the IBM Cloud APIs within the provider itself. It saved effort at the time and avoided the need to understand vendoring. HashiCorp references this anti-pattern in the notes on writing a custom provider. It wasn’t until I started to test the provider against real resources that the reason for this practice became very clear. It is far easier to verify correct operation when the provider and the API client library code are maintained and tested independently.

The Terraform test framework and logging provide little insight into client API operation. Consequently, the client library has to be well tested before it is used with the provider. It has to be robust and able to handle the whole range of interactions and failure scenarios that the API exhibits. The Go test packages are best suited to this task.
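To give a flavour of testing a client library in isolation, the sketch below exercises a tiny client against a fake HTTP server from the standard library's net/http/httptest package, covering one success and one failure response. It is an assumption-laden illustration: the pool type, the /pools/<id> path, and the client, getPool and newFakeAPI names are all hypothetical and do not describe the IBM Cloud API or bluemix-go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// pool is a hypothetical API object; the field names are assumptions.
type pool struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// client is a minimal hypothetical API client.
type client struct {
	baseURL string
	http    *http.Client
}

// getPool fetches one pool and turns any non-200 response into an error.
func (c client) getPool(id string) (pool, error) {
	var p pool
	resp, err := c.http.Get(c.baseURL + "/pools/" + id)
	if err != nil {
		return p, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return p, fmt.Errorf("get pool %s: unexpected status %d", id, resp.StatusCode)
	}
	err = json.NewDecoder(resp.Body).Decode(&p)
	return p, err
}

// newFakeAPI returns a test server that knows a single pool, "p1", and
// answers 404 for every other path.
func newFakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/pools/p1" {
			http.Error(w, "not found", http.StatusNotFound)
			return
		}
		json.NewEncoder(w).Encode(pool{ID: "p1", Name: "web"})
	}))
}

func main() {
	srv := newFakeAPI()
	defer srv.Close()
	c := client{baseURL: srv.URL, http: srv.Client()}

	p, err := c.getPool("p1")
	fmt.Println(p.Name, err)

	_, err = c.getPool("missing")
	fmt.Println(err)
}
```

In a real client package the same checks would live in a _test.go file and run under go test; the point here is only that the failure paths can be driven without touching the provider or a live cloud service.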

The good news is that Go comes with a great testing package out of the box. As this is again a large topic, I will cover it in a separate article, where I will look at how Ginkgo and Gomega can be used to validate the client library and run integration tests.

Website

The website folder contains all the documentation for configuring and using the provider with Terraform. All documentation is written in markdown, in the folders `r` (resources) and `d` (data_sources), and an erb template is used to generate the index of resources and data_sources. As before, look at one of the many GitHub examples.

Main.go

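A Go executable needs a package main with a main() function as its entry point; for the Terraform provider binary this lives in main.go. Providers are defined as plugins, with main() using the Terraform plugin library to serve the provider plugin. Main.go is included in the writing a custom provider documentation and is essentially the same for all providers. Note the package main definition. This is the only place it is used. In the documentation example Provider() is also defined in package main; when the provider code lives in its own package, as with package ibm here, main.go has to import that package and call its Provider function.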

package main

import (
    "github.com/hashicorp/terraform/plugin"
    "github.com/hashicorp/terraform/terraform"
)

func main() {
    plugin.Serve(&plugin.ServeOpts{
        ProviderFunc: func() terraform.ResourceProvider {
            return Provider()
        },
    })
}

In this second article I set out to explore the structure of a Terraform provider, to show that there is more to it than is apparent at first glance, and to highlight some of the less obvious considerations. I hope one of the key points that you take away is that testing is a major aspect of provider development.

In the next post I will look at how Terraform represents data internally and the relationship between HCL, the resource schema and the client API library.