A to Z of Google Cloud Platform a personal selection — R — Rolling Updates
I’m going to talk about how you can achieve a rolling update to applications on GCP compute products.
Before I do that, I just want to ensure we are on the same page as regards what I mean when I say rolling update.
A rolling update, when I talk about it, is the ability to release a new version of software with pretty much zero downtime. While the new version is being rolled out, the previous version of the software is being rolled out of the way, until by the end of the single-step update process only the new version of the software is running. As well as being able to roll the update out, you must be able to pause the rollout and roll it back if need be. If you don’t have the ability to recover from a rubbish deployment then it’s a one way street!
An alternative way to carry out an update to your application is a canary release. With canary releases you release the new version alongside the existing version, but you don’t just carry on rolling out the update: you stop the update process for a period of time while the new and old versions run side by side. You will eventually switch over to the new release fully, or not if the new version turns out to be a bit sucky. Have a look here for a nice walkthrough of canary releases.
I’m not going to debate the pros and cons of each approach though as that would derail this post somewhat I feel. So onwards :-)
Rolling updates with GCE
What I will be discussing here assumes that you “bake” new custom images for each new version.
GCP has a great solution & tutorial (thanks @evandbrown) on creating “baked” images using Jenkins, Packer and Kubernetes. So if you’re interested in understanding how to set up a CI/CD system for baking/creating your own custom images it’s definitely worth a read.
I know you could use a configuration management tool to roll out application level updates across your fleet but by treating the instances as immutable you can easily take advantage of using managed instance groups to achieve rolling updates.
Managed instance groups use instance templates to define the properties for every instance in the group.
Any of the settings you can define in a regular request to create an instance can be described in the instance template, including any instance metadata, startup scripts, persistent disks, service accounts etc.
You can easily update all of the instances in the group by specifying a new template in a rolling update.
To start a rolling update you need to follow the steps outlined below (see the docs for a detailed walkthrough). This high level walkthrough assumes you already have a fleet of instances that are defined and managed via managed instance groups.
1. Request access to the rolling update feature.
2. Enable the Instance Group Updater API. As we’re using GCE you don’t need to create any other credentials and you can start using the API straight away.
3. Create a new template with the updated properties. Assuming you have created a new image, add the name of that image to the template.
4. Use the gcloud command or the API to start the update. You can change the defaults by passing in the values you want.
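Step 3 might look something like this (the template and image names here are hypothetical, and you’d normally carry over the rest of your existing template’s settings):

```
gcloud compute instance-templates create example-template-v2 \
    --image my-baked-image-v2 \
    --machine-type n1-standard-1
```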
The optional flags you can change are listed below. They give you a good indication of how you can control the update and meet my requirement of being able to abandon the rollout.
--max-num-concurrent-instances, which defines how many instances are updated at the same time.
--instance-startup-timeout, which defines the maximum number of seconds that the update waits for an instance to start after the updates have been applied. If the instance does not start before the time limit, the updater records the update as a failure.
--min-instance-update-time, which defines the minimum number of seconds that the updater spends updating each instance. The updater starts the next update only when the current update is complete and the minimum update time has elapsed.
--max-num-failed-instances, which defines the maximum number of instance updates that can fail before the updater records the entire group update as a failure.
The default gcloud command to kick off a rolling update looks similar to this:
gcloud alpha compute rolling-updates start \
    --group example-group \
    --template example-template
You can pass the optional flags listed above to override the default values to the gcloud command.
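For example, a sketch of kicking off an update with a couple of the optional flags overridden (the values are illustrative, and I believe the alpha command group also offered pause and rollback subcommands, which is what lets you abandon a bad rollout):

```
gcloud alpha compute rolling-updates start \
    --group example-group \
    --template example-template-v2 \
    --max-num-concurrent-instances 2 \
    --max-num-failed-instances 1

# If the rollout looks unhealthy, stop it and go back
gcloud alpha compute rolling-updates pause UPDATE_NAME
gcloud alpha compute rolling-updates rollback UPDATE_NAME
```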
A rolling update can be applied to all instance groups, whether or not they have autoscaling enabled. Have a look at the docs for detail on how autoscaling managed instance groups and rolling updates interact.
Rolling updates with GKE
GKE (Google Container Engine) is GCP’s fully managed Kubernetes service, so when we talk about how to implement a rolling update we are talking about how to update your application or components running on a Kubernetes cluster. As I’m talking about GCP I’ll be referring to GKE, but the majority of what I describe about rolling updates with GKE actually applies to k8s (as Kubernetes tends to be abbreviated) in general.
In Kubernetes terms, your application container is deployed to a pod.
Containers should absolutely be treated as immutable. Yep, I’m not entering into a debate, not leaving you to make up your own mind, just stating that and leaving it there! With that out of the way, it implies that you will be creating a new container image when you update your application or microservice component.
The rolling update process basically replaces your running containers with an updated container.
There are two ways you can manage your application: via replication controllers or via deployments. Depending on which you have chosen, the rolling update process is slightly different. Deployments should be the method used, so this is what I’ll talk about.
So what is a deployment? you may well ask. Deployments provide declarative updates for Pods and Replica Sets (the next-generation Replication Controller). You only need to describe the desired state in a Deployment object, and the Deployment controller will change the actual state to the desired state at a controlled rate for you (supported in the version of k8s managed by GKE).
The Kubernetes docs talk you through what a deployment is and how to use one in some depth so I won’t repeat that.
Here’s an example deployment yaml file called webapp-deployment.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: webapp-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: mywebapp
    spec:
      containers:
      - name: mywebapp
        image: mywebapp:1.0.1
        ports:
        - containerPort: 80

It is used to deploy an image called mywebapp at version 1.0.1 and will bring up 3 pods of mywebapp.
To implement a rolling update to your application here’s a summary of the steps needed:
- Create a new container image that has your updated application or microservice and push it to your image repository
- Using our example above we’re going to update our container image mywebapp from 1.0.1 to 2.0.0
- You can then do one of two things to carry out the update:
- Create a new deployment yaml file called new-webapp-deployment.yaml that has the image: mywebapp:1.0.1 modified to read mywebapp:2.0.0
- Deploy the deployment by running the following command:
$ kubectl apply -f my-folder/new-webapp-deployment.yaml
deployment “webapp-deployment” configured
Alternatively you can edit the deployment directly using kubectl edit (note: unless you want to find yourself in vi, ensure you set your KUBE_EDITOR or EDITOR environment variable). Edit the Deployment and change .spec.template.spec.containers[0].image from mywebapp:1.0.1 to mywebapp:2.0.0
$ kubectl edit deployment/webapp-deployment
deployment “webapp-deployment” edited
This carries out a rolling update. If you follow through the hello world example in the GKE/k8s docs you can see this in action. Note in particular that it does not destroy all the pods running the old version at once; it keeps some running until new pods with the updated application are up. What’s really neat is that when .spec.strategy.type is RollingUpdate you can specify maxUnavailable and maxSurge to control the rolling update process. The docs have an excellent explanation of how this works.
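As a sketch, the strategy section of a deployment spec might look like this (the values are illustrative):

```
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most 1 pod below the desired count during the update
      maxSurge: 1         # at most 1 extra pod above the desired count during the update
```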
To meet my criteria of a rolling update you must be able to roll back to a previous version. GKE/k8s allows this via the .spec.rollbackTo and .spec.rollbackTo.revision fields, or by using the kubectl rollout undo command. So in our example, to roll back to the previous version of mywebapp using the kubectl rollout command we would use this:
$ kubectl rollout undo deployment/webapp-deployment
You can also roll back to a specific revision.
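To find which revision to roll back to you can inspect the rollout history first, then pass the revision number to undo (the revision number here is illustrative):

```
$ kubectl rollout history deployment/webapp-deployment
$ kubectl rollout undo deployment/webapp-deployment --to-revision=2
```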
Rolling updates with GAE
GAE being the compute target that requires the least customer configuration and admin you’d expect it to be the most straightforward to undertake a rolling deployment and you’d be right to think that.
GAE allows you to upload a different version of your application and cut over to that version. You can also traffic split, but that would not meet my definition of a rolling update.
By default, App Engine serves traffic from the version of your application that is set as the default. You can upload a new version, and it will not serve any traffic until you switch over to it.
I’ll use Python and App Engine standard to elaborate on that admittedly short description.
GCP provides a tool called appcfg.py. This is used to upload new versions of your Python application (Go uses this tool as well, but Java uses Maven, so check the language-specific docs).
Your first deployment automatically becomes the default and serves 100% of the traffic .
Using appcfg.py to upload your new version, you will then see in the console that you now have two versions of the application.
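Uploading a new version might look something like this (the project ID, version name and app directory are hypothetical):

```
appcfg.py -A my-project -V v2 update myapp/
```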
By checking the box beside version 2 you are then able to route all or some of the traffic or migrate all traffic across to this version. To enact a rolling deployment select migrate traffic. Migration takes a short amount of time (possibly a few minutes), the exact interval depends on how much traffic your application is receiving and how many instances are running. Once the migration is complete, the new version receives 100% of the traffic.
When using traffic migration you should use warmup requests to load application code into a new instance before any live requests reach that instance, helping to reduce request latency for your users as you switch over to a new version.
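Warmup requests are enabled in app.yaml via the inbound_services directive, with a handler for the warmup URL; a minimal sketch for a Python app (the script name is hypothetical):

```
inbound_services:
- warmup

handlers:
- url: /_ah/warmup
  script: main.app   # do any cache priming / connection setup in this handler
```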
You can load 10 different versions and swap over to any one of them easily.
Note The traffic migration functionality I have described is only available in the App Engine standard environment
The behaviour with the App Engine flexible environment is slightly different: there you are creating a new Docker container when you create a new version of your application, and you use the gcloud app deploy command to automatically build the Docker container and switch traffic over to the new version.
You can override this default behaviour by using the --no-promote flag.
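A sketch of deploying a flexible environment version without switching traffic, then migrating to it later once you’re happy (the version name is illustrative):

```
gcloud app deploy --version v2 --no-promote

# later, gradually migrate all traffic over to v2
gcloud app services set-traffic default --splits v2=1 --migrate
```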