Changing a Running Kubernetes Cluster's Permissions (a.k.a. Scopes)
In a hurry? Check out the TL;DR at the end of this post with all the commands you need to make it work.
This cluster was our first one when we switched from Heroku to Google Cloud. Back then we had only a vague idea of how Kubernetes works and no idea at all how Google Cloud works.
It was a good move, and as our knowledge grew we decided not to start over again but to keep fixing our mistakes as we learned.
On this cluster, which we named cluster-1, we had no idea how it would interact with the other GCP services and, to be truly honest, no idea which services were even available.
Our goal back then was simply to replace our previous hosting while keeping the container approach.
Recently we decided to adopt PubSub to handle messages between systems, and then our consumer hit an error: it had no authorization to reach our topics.
Of course, this could (and probably should) be solved by issuing it a dedicated service account that restricts what it can see, but that level of sophistication isn't our concern right now.
And not just here: I'd rather take baby steps than try to get everything perfect at once, not only because "perfect" changes with time and perception of reality, but because the team needs time to absorb all the knowledge that leads there.
Understanding scopes
Scopes define the level of access your cluster nodes have to specific GCP services as a whole. They become the default for any service deployed on those machines.
As this was my first time dealing with scopes on the command line rather than on a dashboard, I must confess my ignorance about how to refer to them other than by clicking a checkbox, which only happens during cluster creation.
Getting your current scopes from a cluster node
I went looking for a list of scopes in one of my instances' metadata. To do that, I grabbed the name of one instance in the cluster and then fetched its metadata.
To get a list of your cluster's instances use:
gcloud compute instances list
The outcome should be something like this:
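(The instance names, zones and IPs below are made up for illustration; yours will reflect your own cluster.)

NAME                                      ZONE           MACHINE_TYPE   INTERNAL_IP  EXTERNAL_IP    STATUS
gke-cluster-1-default-pool-abcd1234-xyz1  us-central1-a  n1-standard-1  10.128.0.2   203.0.113.10   RUNNING
gke-cluster-1-default-pool-abcd1234-xyz2  us-central1-a  n1-standard-1  10.128.0.3   203.0.113.11   RUNNING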
The next step was to get the instance's metadata. The output of this command is huge, so let's look at just what matters to us: the scopes on the instance's service account.
gcloud compute instances describe [my-instance-name]
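The relevant part of the output looks like this (the service account e-mail here is illustrative):

serviceAccounts:
- email: 123456789-compute@developer.gserviceaccount.com
  scopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring.write
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/trace.append

If you want to skip the noise entirely, one way is to ask gcloud to print only that section:

gcloud compute instances describe [my-instance-name] \
--format="value(serviceAccounts[].scopes)"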
Great! Now I know not only how Google names the scopes but also all the active scopes on my instance.
Figuring out scopes aliases
The next step is to see all the available scopes, which I found on this link. (I'll list them here, but keep in mind that they may have changed since.)
The default ones
default
https://www.googleapis.com/auth/devstorage.read_only
https://www.googleapis.com/auth/logging.write
https://www.googleapis.com/auth/monitoring.write
https://www.googleapis.com/auth/pubsub
https://www.googleapis.com/auth/service.management.readonly
https://www.googleapis.com/auth/servicecontrol
https://www.googleapis.com/auth/trace.append
The optional ones
bigquery - https://www.googleapis.com/auth/bigquery
cloud-platform - https://www.googleapis.com/auth/cloud-platform
compute-ro - https://www.googleapis.com/auth/compute.readonly
compute-rw - https://www.googleapis.com/auth/compute
datastore - https://www.googleapis.com/auth/datastore
logging-write - https://www.googleapis.com/auth/logging.write
monitoring - https://www.googleapis.com/auth/monitoring
monitoring-write - https://www.googleapis.com/auth/monitoring.write
service-control - https://www.googleapis.com/auth/servicecontrol
service-management - https://www.googleapis.com/auth/service.management.readonly
sql-admin - https://www.googleapis.com/auth/sqlservice.admin
storage-full - https://www.googleapis.com/auth/devstorage.full_control
storage-ro - https://www.googleapis.com/auth/devstorage.read_only
storage-rw - https://www.googleapis.com/auth/devstorage.read_write
taskqueue - https://www.googleapis.com/auth/taskqueue
userinfo-email - https://www.googleapis.com/auth/userinfo.email
Deciding which ones to use
You may disagree with this (and you could easily find me agreeing with you), but we decided to allow all APIs from the cluster nodes, which left us with the following scopes to put on them:
- default
- bigquery
- cloud-platform
- compute-rw
- datastore
- storage-full
- taskqueue
- userinfo-email
- sql-admin
Changing the scopes on the cluster
The way that doesn't work
Everything I found suggested changing the scopes in every node's metadata, which would supposedly change them for the cluster. The command would be as follows:
gcloud compute instances set-service-account [instance] \
--service-account [your service account] \
--scopes [scopes aliases/urls separated with commas]
Beautiful, but it doesn't work!
To change the service account you must stop the node first, but the Kubernetes controller is there to keep things running, and as soon as it notices that a node has stopped, it pokes it to start running again.
Long story short: it's a conundrum.
TL;DR The way that does work
A word to the wise: be careful when doing this. Double, triple check before each command. Murphy is watching!
The idea here was to replace the old pool with a new one that has our desired set of scopes enabled.
To create a new pool you just need to run:
gcloud container node-pools create [new pool name] \
--cluster [cluster name] \
--machine-type [your desired machine type] \
--num-nodes [the same amount of nodes you have] \
--scopes [your new set of scopes]
Notice that when creating a new pool from the GCP dashboard you cannot choose new scopes; it always uses the previously defined ones.
We had decided to grant all scopes, so our scopes parameter was the following:
default,bigquery,cloud-platform,compute-rw,datastore,storage-full,taskqueue,userinfo-email,sql-admin
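Putting it all together, our call looked roughly like this (the pool name, machine type and node count below are illustrative; adjust them to your setup):

gcloud container node-pools create new-pool \
--cluster cluster-1 \
--machine-type n1-standard-1 \
--num-nodes 3 \
--scopes default,bigquery,cloud-platform,compute-rw,datastore,storage-full,taskqueue,userinfo-email,sql-admin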
You can see all the options when creating a new pool by invoking:
gcloud container node-pools create --help
Now you need to drain the nodes from the previous pool so Kubernetes migrates all the resources to the nodes in the new pool.
kubectl drain [node]
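Depending on your workloads, kubectl may refuse to drain a node without extra flags (for example, --ignore-daemonsets for DaemonSet-managed pods). To drain every node of the old pool in one go, you can filter by the node-pool label GKE puts on each node; the pool name below is illustrative:

for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o jsonpath='{.items[*].metadata.name}'); do
  kubectl drain "$node" --ignore-daemonsets
done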
After you've drained all your old nodes you can safely take down the old pool:
gcloud container node-pools delete [POOL_NAME] \
--cluster [CLUSTER_NAME]
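Afterwards, you can confirm that only the new pool remains:

gcloud container node-pools list --cluster [CLUSTER_NAME]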
The final result
Since all the instances now share the same set of scopes, that set has become our cluster's default.