`terraform refresh` is deprecated

Published in GlobalLogic UK&I, Jul 6, 2021

Update: terraform refresh is now effectively an alias for:
terraform apply -refresh-only -auto-approve.
This is still dangerous. Read on…

Photo by Arif Riyanto on Unsplash

While I was delivering Terraform 101 this week, one of the attendees asked me about a warning in the Terraform docs, which says:

Warning: This command is deprecated, because its default behavior is unsafe if you have misconfigured credentials for any of your providers. See below for more information and recommended alternatives.

This was new to me (indeed it is new), and I had never thought about it, but it makes complete sense. Kinda.
So I played around a little…

Aside: The smallest Terraform configuration

As a quick aside, let me mention the smallest possible valid Terraform configuration my team and I came up with during an internal challenge we organised. It’s all of 20 bytes: resource aws_eip a{}.

Indeed, declaring the provider "aws" {} is optional: Terraform knows from the first part of the resource type (up to the first _) which provider provides a given resource. If the provider body would be empty (e.g. if we supply all configuration as environment variables, or config files in the user’s home directory), we can leave it out altogether.

Also, quotes and spaces are optional, as you can see. Using them is still best practice, so please don’t start writing ugly code because of this.

Lastly, aws_eip, AWS’ ‘Elastic IP’, has a short resource type name (7 bytes) and does not require any arguments in its body, which was paramount in keeping the size minimal.
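To make the comparison concrete, here is a small sketch that writes the 20-byte form next to the conventionally quoted spelling and measures the former. The scratch directory and its subdirectory names are my own choice for illustration; nothing here talks to AWS or runs Terraform.

```shell
# Write both spellings of the same resource into separate scratch directories
# (two definitions of aws_eip.a in one directory would clash).
workdir=$(mktemp -d)
mkdir "$workdir/minimal" "$workdir/conventional"

# The 20-byte form from the challenge, written without a trailing newline.
printf '%s' 'resource aws_eip a{}' > "$workdir/minimal/main.tf"

# The same resource with conventional quoting and spacing.
cat > "$workdir/conventional/main.tf" <<'EOF'
resource "aws_eip" "a" {}
EOF

# Byte count of the minimal form (tr strips the padding some wc builds add).
minimal_bytes=$(wc -c < "$workdir/minimal/main.tf" | tr -d ' ')
echo "minimal form: $minimal_bytes bytes"
```

Both files describe the same resource; the second is simply the one you would want to find in a code review.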

Experiment setup

I created a minimal Terraform configuration, and opened two terminals in that directory. I generated AWS credentials for two different AWS accounts, and set one set of credentials in each terminal window, as environment variables. (In addition to AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN, I also had to set the AWS_DEFAULT_REGION.)

Note: the exact same behaviour occurs if we use the same set of credentials, but a different AWS region—which is arguably more likely to occur. (This behaviour is of course always provider- and resource-specific, as e.g. some resources are not region-specific.)

I could now run terraform init and terraform apply, then confirm with yes in order to get a public IPv4 address that I’m not going to use. (It’s not like we’re short or anything, right?)

Findings

After running the apply, my Terraform state was created locally. The file terraform.tfstate contained the following (abridged):

{
  "version": 4,
  "terraform_version": "1.0.0",
  "serial": 1,
  "lineage": "832b5a17-d263-5775-daad-39354d63a157",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "aws_eip",
      "name": "a",
      "provider": "provider[\"<...>\"]",
      "instances": [
        {
          # ...
        }
      ]
    }
  ]
}

I then ran terraform refresh from the other terminal window, with the AWS credentials for a different account. (Again, they could have been the same credentials, but with a different region, in this case.)
The output was minimal, and did not at all hint at the disastrous effect that running this command had just had:

aws_eip.a: Refreshing state... [id=eipalloc-0265965418c85c989]

The state file, terraform.tfstate, now contains only the following:

{
  "version": 4,
  "terraform_version": "1.0.0",
  "serial": 2,
  "lineage": "832b5a17-d263-5775-daad-39354d63a157",
  "outputs": {},
  "resources": []
}

Yes: the state file has been overwritten with one where the resources list is empty. (Oh, and the serial has been increased to 2—anything else would be irresponsible, of course).

Why did this happen? When terraform refresh ran with the wrong credentials, it asked the AWS API for up-to-date information about the EIP in the wrong AWS account. The AWS API, of course, replied that there wasn’t any such resource in that AWS account. Terraform then (wrongly, but understandably) concluded that the resource must have been deleted, and removed it from the state.

The fact that there is no longer any reference to the resource(s), especially to their unique IDs (which in this case is the eipalloc value), means that Terraform has no way of finding the lost resource(s) again, even when we run another terraform refresh using the correct credentials. We would have to manually import the resource(s) back into the state, which is very cumbersome. (The command for this would be:
terraform import aws_eip.a eipalloc-012345678, but of course you’ll have to know each resource’s unique ID.)
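The reasoning above can be played through with a toy model. This is only a sketch of the logic as I understand it, not Terraform's implementation: the "state" and the two "accounts" are plain text files with one ID per line, and refresh is a hypothetical helper that keeps only the IDs the account still reports.

```shell
# Toy model only: NOT how Terraform is implemented.
# refresh STATE_FILE ACCOUNT_FILE
# Prints the IDs from STATE_FILE that ACCOUNT_FILE still lists; the rest are dropped.
refresh() {
  while IFS= read -r id; do
    if grep -Fxq -- "$id" "$2"; then
      printf '%s\n' "$id"
    fi
  done < "$1"
}

demo=$(mktemp -d)
printf '%s\n' 'eipalloc-0265965418c85c989' > "$demo/state"          # what apply recorded
printf '%s\n' 'eipalloc-0265965418c85c989' > "$demo/right_account"  # account that owns the EIP
: > "$demo/wrong_account"                                           # other account: no such EIP

# Refresh against the wrong account: the ID is not found, so it is dropped.
refresh "$demo/state" "$demo/wrong_account" > "$demo/state.new"
mv "$demo/state.new" "$demo/state"

# Refresh again against the right account: the state holds no IDs to look up,
# so nothing comes back.
refresh "$demo/state" "$demo/right_account" > "$demo/state.new"
mv "$demo/state.new" "$demo/state"

echo "IDs left in state: $(wc -l < "$demo/state" | tr -d ' ')"
```

The second refresh is the important one: once the ID is gone from the state, there is nothing left to ask the API about, which is why only an import (or a backup) brings the resource back.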

Can we roll back?

Yes, this could work, if a) you do have a backup, and b) you catch your mistake in time, before you apply more changes and complicate the issue further.

Local state will create a terraform.tfstate.backup file (but only one!). In my case, I could restore it (mv terraform.tfstate.backup terraform.tfstate), and the mistake was undone.

If you use remote state, it might be advisable to enable some sort of versioning on the remote for that reason. Cloud object stores usually allow this as a simple-to-configure option, and Terraform Cloud/Enterprise does it by default as well.
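Since local state only ever keeps that single .backup file, one cheap habit is to take your own timestamped copy before anything that rewrites the state. A minimal sketch, assuming a local state file; backup_state and the .bak naming are hypothetical names of mine, not something Terraform provides:

```shell
# backup_state [STATE_FILE]
# Copies the state file to STATE_FILE.<timestamp>.bak and prints the copy's path.
# Defaults to terraform.tfstate in the current directory.
backup_state() {
  state=${1:-terraform.tfstate}
  if [ ! -f "$state" ]; then
    echo "no state file at $state" >&2
    return 1
  fi
  copy="$state.$(date +%Y%m%dT%H%M%S).bak"
  # -p preserves timestamps and permissions of the original file.
  cp -p -- "$state" "$copy" && printf '%s\n' "$copy"
}
```

Usage would be along the lines of backup_state && terraform apply -refresh-only. With remote state, terraform state pull > some-file.tfstate gives a comparable snapshot, although proper versioning on the remote is the more robust option.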

What to do instead?

HashiCorp recommend using terraform plan -refresh-only and terraform apply -refresh-only instead.

The output in each case is explicit about the differences found:

aws_eip.a: Refreshing state... [id=eipalloc-0265965418c85c989]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # aws_eip.a has been deleted
  - resource "aws_eip" "a" {
      - domain               = "vpc" -> null
      - id                   = "eipalloc-0265965418c85c989" -> null
      - network_border_group = "eu-west-1" -> null
      - public_dns           = "<...>" -> null
      - public_ip            = "54.74.0.231" -> null
      - public_ipv4_pool     = "amazon" -> null
      - tags                 = {} -> null
      - tags_all             = {} -> null
      - vpc                  = true -> null
    }

This is a refresh-only plan, so Terraform will not take any actions to undo these. If you were expecting these changes then you can apply this plan to record the updated values in the Terraform state without changing any remote objects.

The user thus gets a very explicit indication that there is a mismatch between the expected state, and the actually configured provider account.

With a terraform plan -refresh-only, it ends there. Terraform tells us the what-if, but does not actually touch the state.

With a terraform apply -refresh-only, the following output is shown after the one above:

Would you like to update the Terraform state to reflect these detected changes?
  Terraform will write these changes to the state without modifying any real infrastructure.
  There is no undo. Only 'yes' will be accepted to confirm.
Enter a value:

Terraform therefore explicitly asks for confirmation before overwriting the state file, after detailing what would be overwritten.
Reasonable.
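The prompt still relies on a human reading the output. Since the whole problem starts with credentials or a region that do not match the state, a pre-flight check can complement it. The following is only a sketch under my own assumptions: EXPECTED_ACCOUNT and EXPECTED_REGION are placeholder values, target_matches and preflight are hypothetical helper names, the region is taken from the AWS_REGION or AWS_DEFAULT_REGION environment variable, and the AWS CLI is assumed to be installed.

```shell
# Placeholder values: replace with the account and region this state belongs to.
EXPECTED_ACCOUNT="111111111111"
EXPECTED_REGION="eu-west-1"

# target_matches ACCOUNT REGION
# Succeeds only if both values equal the expected ones.
target_matches() {
  [ "$1" = "$EXPECTED_ACCOUNT" ] && [ "$2" = "$EXPECTED_REGION" ]
}

# preflight
# Asks AWS which account the current credentials belong to, reads the region
# from the environment, and refuses to continue on any mismatch.
preflight() {
  account=$(aws sts get-caller-identity --query Account --output text) || return 1
  region=${AWS_REGION:-${AWS_DEFAULT_REGION:-}}
  if target_matches "$account" "$region"; then
    echo "ok: $account / $region"
  else
    echo "refusing to run: credentials point at $account / ${region:-no region set}" >&2
    return 1
  fi
}
```

It would be used as preflight && terraform apply -refresh-only. This only covers AWS and only the account and region; other providers would need their own equivalent.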

What’s the difference between a plain plan/apply and -refresh-only?

Since (Terraform believes) the resource(s) we want have disappeared, it will want to re-create them. This will be the additional output of a normal plan/apply: creating the resource(s) that are in the *.tf file(s) but not in the state (or in-memory refreshed state).

That latter part, the intention to (re-)create the missing resources, is what’s left out in the -refresh-only versions of the commands. -refresh-only thus only acts on the state, and is read-only with regard to the provider.
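Because terraform plan -refresh-only touches neither the state nor the provider, it is safe to run unattended, for example in a pipeline, and to scan its output before anyone applies it. The sketch below is a heuristic of mine, not an official interface: it relies on the "has been deleted" wording shown in the output earlier, which may differ between Terraform versions, and count_deleted and guard_refresh_plan are hypothetical helper names.

```shell
# count_deleted
# Reads plan output on stdin and prints how many resources it reports as
# deleted outside of Terraform. Matches lines like "# aws_eip.a has been deleted".
count_deleted() {
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
  grep -c '# .* has been deleted' || true
}

# guard_refresh_plan
# Reads plan output on stdin and fails if applying it would drop any resource
# from the state.
guard_refresh_plan() {
  deleted=$(count_deleted)
  if [ "$deleted" -gt 0 ]; then
    echo "refresh-only plan would drop $deleted resource(s) from the state" >&2
    return 1
  fi
}
```

Usage would look like terraform plan -refresh-only -no-color | guard_refresh_plan, where -no-color keeps terminal colour codes out of the text being scanned. A failing guard does not tell you whether the resources are really gone or the credentials are simply wrong, only that someone should look before typing yes.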