CloudFormation Status Transitions
A stack is a collection of AWS resources that we can manage as a single unit. We can perform operations such as CREATE, UPDATE and DELETE on this unit. While we perform these operations, the stack transitions from one state to another, for example from CREATE_IN_PROGRESS to CREATE_COMPLETE. Knowing the transition states helps in debugging issues. This blog tries to depict the state transitions visually.
At a high level, we perform a CREATE | UPDATE | DELETE operation on a stack. First we create a stack from a template. Once it is created, we can either update or delete it. We can also delete a stack after an update. Internally, during each of these operations the stack transitions through multiple states. We will look at them one by one.
So let's start with the create operation and see the different statuses the stack passes through.
Once a syntactically valid template is passed to the create-stack API, the stack enters CREATE_IN_PROGRESS. With the change set API there is an extra state, REVIEW_IN_PROGRESS, in which CloudFormation reviews and decides what will change. Once it has determined all the changes, the stack enters CREATE_IN_PROGRESS. CloudFormation then creates the resources in dependency order. If any resource fails to create, the stack transitions to ROLLBACK_IN_PROGRESS. In this state it tries to delete the resources it has created so far. If it deletes all of them, the stack moves to ROLLBACK_COMPLETE. By default the stack itself is not removed at this point: it stays in ROLLBACK_COMPLETE, cannot be updated, and has to be deleted before the create can be retried. If CloudFormation fails to delete any resource along the way, the stack enters ROLLBACK_FAILED.
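The create flow above can be written down as a small transition table. This is only a sketch for reasoning about the flow, not an official state machine: the names `CREATE_TRANSITIONS` and `is_valid_path` are made up for illustration, and statuses that appear only with non-default settings (such as rollback disabled) are left out.

```python
# Simplified model of the create flow described above.
# Assumption: default rollback behaviour; other statuses are omitted.
CREATE_TRANSITIONS = {
    "REVIEW_IN_PROGRESS": {"CREATE_IN_PROGRESS"},  # change set path only
    "CREATE_IN_PROGRESS": {"CREATE_COMPLETE", "ROLLBACK_IN_PROGRESS"},
    "ROLLBACK_IN_PROGRESS": {"ROLLBACK_COMPLETE", "ROLLBACK_FAILED"},
    "CREATE_COMPLETE": set(),    # end of the create flow
    "ROLLBACK_COMPLETE": set(),  # stack remains and must be deleted
    "ROLLBACK_FAILED": set(),    # needs manual intervention
}


def is_valid_path(statuses):
    """Return True if every consecutive pair is an allowed transition."""
    return all(
        nxt in CREATE_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(statuses, statuses[1:])
    )


happy = ["CREATE_IN_PROGRESS", "CREATE_COMPLETE"]
failed = ["CREATE_IN_PROGRESS", "ROLLBACK_IN_PROGRESS", "ROLLBACK_COMPLETE"]
print(is_valid_path(happy), is_valid_path(failed))
```

A table like this is handy when reading the stack event log: if the sequence of statuses you see does not fit the table, something unusual (or a non-default option) is in play.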
For a stack to be updatable, its status must not be one of the following.
In all cases except UPDATE_ROLLBACK_FAILED and DELETE_FAILED, you need to wait for the status to change. In those two cases you need to fix the error that caused the failure and then retry.
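The rule above (wait for in-progress statuses, fix the two failed ones) can be sketched as a small helper. The function name `update_readiness` and its return labels are hypothetical, and the set of "ready" statuses is deliberately conservative; anything not listed is reported as unknown rather than guessed.

```python
# Statuses that need a manual fix before retrying, per the text above.
NEEDS_FIX = {"UPDATE_ROLLBACK_FAILED", "DELETE_FAILED"}

# Assumption: a conservative set of statuses from which an update can start.
READY = {"CREATE_COMPLETE", "UPDATE_COMPLETE", "UPDATE_ROLLBACK_COMPLETE"}


def update_readiness(status):
    """Classify a stack status as 'wait', 'fix', 'ready' or 'unknown'."""
    if status.endswith("_IN_PROGRESS"):
        return "wait"     # an operation is still running
    if status in NEEDS_FIX:
        return "fix"      # resolve the failure first
    if status in READY:
        return "ready"
    return "unknown"      # not covered by this sketch


for s in ("UPDATE_IN_PROGRESS", "DELETE_FAILED", "CREATE_COMPLETE"):
    print(s, "->", update_readiness(s))
```

In practice the status string would come from the stack description returned by the CloudFormation API; here it is passed in directly so the logic stays easy to test.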
During an update, CloudFormation (CFN) first creates the new resources and then deletes the old ones, for resource types that support this. Hence there are extra states, UPDATE_COMPLETE_CLEANUP_IN_PROGRESS and UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS. The rest of the flow is similar to the one discussed above. If the stack cannot update a resource, or cannot complete the rollback, it enters UPDATE_FAILED or UPDATE_ROLLBACK_FAILED respectively.
When we try to delete a stack and, for some reason, any of its resources cannot be deleted, the stack enters the DELETE_FAILED status. We need to fix the resource that failed to delete (for example, empty the bucket before deletion, or remove any references created outside CFN) and then retry the delete.
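Before fixing anything, it helps to find out which resources actually failed to delete. The sketch below filters a list of stack events for DELETE_FAILED entries. It assumes the events are dictionaries shaped like the `StackEvents` entries returned by the describe-stack-events API (keys `LogicalResourceId`, `ResourceType`, `ResourceStatus`, `ResourceStatusReason`); the function name `failed_deletions` and the sample data are invented for illustration.

```python
STACK_TYPE = "AWS::CloudFormation::Stack"


def failed_deletions(events):
    """Return (logical_id, reason) for each resource that failed to delete."""
    return [
        (e["LogicalResourceId"], e.get("ResourceStatusReason", ""))
        for e in events
        if e.get("ResourceStatus") == "DELETE_FAILED"
        and e.get("ResourceType") != STACK_TYPE  # skip the stack's own event
    ]


# Hypothetical sample events, for illustration only.
sample_events = [
    {"LogicalResourceId": "my-stack", "ResourceType": STACK_TYPE,
     "ResourceStatus": "DELETE_FAILED"},
    {"LogicalResourceId": "LogsBucket", "ResourceType": "AWS::S3::Bucket",
     "ResourceStatus": "DELETE_FAILED",
     "ResourceStatusReason": "The bucket you tried to delete is not empty"},
    {"LogicalResourceId": "AppRole", "ResourceType": "AWS::IAM::Role",
     "ResourceStatus": "DELETE_COMPLETE"},
]

for logical_id, reason in failed_deletions(sample_events):
    print(logical_id, "-", reason)
```

The reason text usually points straight at the fix, such as emptying a bucket, after which the delete can be retried.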
In the last three diagrams, states shown in red need manual intervention before the stack can be brought to a green or orange state. While the stack is in a red state, no further operation on it will work.
At times the stack gets stuck in a *_IN_PROGRESS status. Pay attention to custom resources, ECS, certificate and Auto Scaling resources. They are the most likely culprits, waiting either for a success (healthy) signal or for manual intervention (such as certificate validation).
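One way to narrow down a stuck stack is to list the resources that are still in progress and belong to the usual suspects. The sketch below assumes resource dictionaries shaped like the `StackResources` entries from the describe-stack-resources API. The mapping from "custom resources, ECS, certificate, Auto Scaling" to the concrete resource types in `SLOW_TYPES` is my assumption, and `likely_culprits` is a hypothetical helper name.

```python
# Assumption: resource types that commonly wait on a signal or manual step.
SLOW_TYPES = {
    "AWS::CloudFormation::CustomResource",
    "AWS::ECS::Service",
    "AWS::CertificateManager::Certificate",
    "AWS::AutoScaling::AutoScalingGroup",
}


def likely_culprits(resources):
    """Return logical IDs of in-progress resources of a slow type."""
    culprits = []
    for r in resources:
        rtype = r["ResourceType"]
        in_progress = r["ResourceStatus"].endswith("_IN_PROGRESS")
        # Custom resources may also use a Custom::<Name> type.
        slow = rtype in SLOW_TYPES or rtype.startswith("Custom::")
        if in_progress and slow:
            culprits.append(r["LogicalResourceId"])
    return culprits


# Hypothetical sample data, for illustration only.
sample_resources = [
    {"LogicalResourceId": "Cert",
     "ResourceType": "AWS::CertificateManager::Certificate",
     "ResourceStatus": "CREATE_IN_PROGRESS"},
    {"LogicalResourceId": "Queue", "ResourceType": "AWS::SQS::Queue",
     "ResourceStatus": "CREATE_COMPLETE"},
    {"LogicalResourceId": "Seeder", "ResourceType": "Custom::DbSeeder",
     "ResourceStatus": "CREATE_IN_PROGRESS"},
]

print(likely_culprits(sample_resources))
```

This only narrows the search; the actual cause (a missing signal, an unhealthy service, a pending validation) still has to be checked on the resource itself.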
Thanks for reading!! Feedback appreciated.