Mastering the Google Cloud Platform SDK tools

A look at some lesser-known GCP SDK settings and features that might make your day-to-day interactions with GCP more productive.

Jonathan Merlevede
datamindedbe
6 min read · Aug 10, 2021

Did you know that you can assume a service account identity simply by setting an environment variable? There are certain features hidden in the Google Cloud SDK tools, such as gcloud, bq and gsutil, that I wish I’d known about sooner. I’ve compiled some of these into a neat story for you!

Photo by Matt Artz on Unsplash

Location of configuration

The Cloud SDK needs configuration to work: credentials, default project settings, and so on. This configuration is stored on your local filesystem.

  • On Unix systems, the default SDK configuration folder is ~/.config/gcloud.
  • On Windows, the configuration folder is in %APPDATA%\gcloud. If there’s no configuration there, %SystemDrive%\gcloud and C:\gcloud are also checked.
  • You can override where the Cloud SDK looks for its configuration by exporting the environment variable CLOUDSDK_CONFIG.
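The lookup order above can be sketched in a few lines of Python. This is only an illustration of the rules listed here, not the SDK's own code: the function name sdk_config_dir is made up, and the Windows fallbacks (%SystemDrive%\gcloud and C:\gcloud) are deliberately left out.

```python
import ntpath
import os
import posixpath


def sdk_config_dir(env=None, is_windows=None):
    """Guess where the Cloud SDK keeps its configuration (hypothetical helper)."""
    env = os.environ if env is None else env
    if is_windows is None:
        is_windows = os.name == "nt"

    # An explicit CLOUDSDK_CONFIG overrides every default location.
    override = env.get("CLOUDSDK_CONFIG")
    if override:
        return override

    if is_windows:
        # Simplification: the %SystemDrive%\gcloud and C:\gcloud fallbacks are ignored.
        return ntpath.join(env["APPDATA"], "gcloud")

    # Unix default: ~/.config/gcloud
    return posixpath.join(os.path.expanduser("~"), ".config", "gcloud")
```

Pointing CLOUDSDK_CONFIG at a scratch folder is a handy way to experiment without touching your real configuration.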

You can set all configuration through environment variables

All Cloud SDK configuration can be done or overridden through environment variables. This can be useful to temporarily override your settings for testing, when running gcloud as part of a CI system and when containerizing.

  • The name of the environment variable is derived from the name of the setting, roughly as CLOUDSDK_{setting.replace("/", "_").upper()}.
  • For example, the active project — that is, the setting core/project — can be set through the environment variable CLOUDSDK_CORE_PROJECT.
  • The configuration in environment variables takes precedence over the configuration in configuration files.
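As a small sketch, the naming rule can be wrapped in a helper that also prepares an environment for a child process. The helper names env_var_for and with_setting are made up for illustration, and the rule is the approximate one given above, so unusual setting names may not follow it exactly.

```python
import os


def env_var_for(setting):
    """Map a setting such as 'core/project' to its CLOUDSDK_* variable name."""
    return "CLOUDSDK_" + setting.replace("/", "_").upper()


def with_setting(setting, value, base_env=None):
    """Return a copy of the environment with one SDK setting overridden."""
    env = dict(os.environ if base_env is None else base_env)
    env[env_var_for(setting)] = value
    return env
```

You could then pass the result to a child process, for example subprocess.run(["gcloud", "config", "list"], env=with_setting("core/project", "my-test-project")), where my-test-project is a placeholder project ID.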

The SDK can provide its environment with default credentials

Most applications obtain access to GCP resources through the application default credentials, or ADC. If the GOOGLE_APPLICATION_CREDENTIALS environment variable is not set, the ADC flow checks whether the SDK has generated application default credentials in the default location and uses them if they’re available. You can make credentials available by executing

gcloud auth application-default login

This generates a file at the following location (assuming your SDK configuration is at ~/.config/gcloud):

~/.config/gcloud/application_default_credentials.json

  • Read my other story for a full overview of the ADC flow.
  • The identity you use for application default credentials does not have to be the same as the one you use for interacting with the Cloud SDK.
  • You can easily use the contents of this file to obtain access tokens yourself, without using gcloud, through a refresh token OAuth grant. Read my other story for an example on how to do this.
  • The ADC flow will just check for a file at this location, and does not “directly” interact with gcloud or the Cloud SDK.
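To make the refresh token grant a bit more concrete, here is a minimal sketch using only the Python standard library. It assumes the file holds user credentials (type authorized_user with client_id, client_secret and refresh_token fields) and that the token endpoint is https://oauth2.googleapis.com/token; the function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Google's OAuth 2.0 token endpoint.
TOKEN_URI = "https://oauth2.googleapis.com/token"


def build_refresh_request(adc):
    """Build (but do not send) a refresh token grant request from ADC contents."""
    if adc.get("type") != "authorized_user":
        raise ValueError("expected user credentials containing a refresh token")
    form = {
        "grant_type": "refresh_token",
        "client_id": adc["client_id"],
        "client_secret": adc["client_secret"],
        "refresh_token": adc["refresh_token"],
    }
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(TOKEN_URI, data=data, method="POST")


def fetch_access_token(adc_path):
    """Read the ADC file and exchange its refresh token for an access token."""
    with open(adc_path) as f:
        adc = json.load(f)
    with urllib.request.urlopen(build_refresh_request(adc)) as resp:
        return json.load(resp)["access_token"]
```

Service account key files work differently (they carry a private key rather than a refresh token), which is why the sketch rejects them.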

The SDK does not use Application Default Credentials

Unlike most other applications, the Cloud SDK CLI applications do not use ADC to authenticate themselves. The SDK manages its own list of accounts in a separate database (see below).

  • List all authenticated accounts using gcloud auth list. The active account is marked with an asterisk.
  • You can easily switch accounts by changing the core/account setting using gcloud config set account (or, as you now know, the environment variable CLOUDSDK_CORE_ACCOUNT).
  • If you want the Cloud SDK to use a specific credential file to authenticate itself and ignore the authentication settings in its database, you can point the setting auth/credential_file_override to a credential file (user or service account). Again, you can do this by setting the environment variable CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE. Unfortunately, the legacy tools (bq, gsutil) do not seem to respect this setting (see below).

You can use the SDK to temporarily assume the identity of a service account

We all know that it’s not considered good practice to run applications with user credentials, and that we should use service account credentials with permissions specifically crafted for the task at hand. However, practically speaking, switching accounts is tedious, so we don’t always do it.

This is where the globally available gcloud flag --impersonate-service-account can help. If the active account has iam.serviceAccounts.getAccessToken permissions (included in the service account token creator role) on a service account, then you can run any command using that service account’s identity instead by adding this flag together with the service account name.

  • The only role you then really still need to have on your own account is that of a service account token creator, scoped to appropriate service accounts. This is my preferred setup nowadays. Cool!
  • If you want to make impersonation more permanent, you can store it as a configuration setting by running gcloud config set auth/impersonate_service_account with the service account's email address as the value.
  • As you already know, this also means that you can configure impersonation through the environment variable CLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT. Awesome!
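When scripting, the two per-invocation variants (flag and environment variable) might look as follows. This is a sketch with made-up helper names; the service account address in the usage note is a placeholder.

```python
import os


def impersonated_command(args, service_account):
    """Build a gcloud command line that impersonates the given service account."""
    return ["gcloud", *args, f"--impersonate-service-account={service_account}"]


def impersonated_env(service_account, base_env=None):
    """Return an environment in which the SDK tools impersonate the service account."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLOUDSDK_AUTH_IMPERSONATE_SERVICE_ACCOUNT"] = service_account
    return env
```

For example, subprocess.run(impersonated_command(["projects", "list"], "deployer@my-project.iam.gserviceaccount.com")) runs a single command as that account, while passing env=impersonated_env(...) to subprocess.run applies impersonation to every SDK command started with that environment.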

Legacy applications kind of suck and functionality is split

Certain popular Google services, such as Cloud Storage and BigQuery, predate the current gcloud tool. Unfortunately, this means that we still interact with them using the legacy tools bq and gsutil.

Most operations related to Storage and BigQuery are only available from the legacy tools. You cannot use gcloud to, for example, submit a query to BigQuery. However, certain newer operations may not be available in the legacy tools either: for example, if you want to cancel or list running BigQuery jobs, you will have to do so using gcloud alpha bq jobs.

Besides the fact that functionality is split, why do I think the legacy applications “kind of suck”?

  • They do not follow the same conventions as gcloud, and do not support the same flags.
  • They come with their own set of dependencies (bq is written in Python!).
  • They do not implement some cool gcloud features, such as --impersonate-service-account from above. (Luckily, if you set auth/impersonate_service_account through configuration or environment variables, this will work!)
  • They do not always work on the same set of settings as the Cloud SDK. For example, setting CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE has no impact on gsutil and bq.

Authentication information is stored in a local SQLite database

Logging in to the Cloud SDK creates or modifies a file credentials.db inside your configuration directory. This is a SQLite database containing authorized user account credentials for each user you've logged in with (using gcloud auth login), and service account credentials for each service account you've activated (using gcloud auth activate-service-account). You can query this database using sqlite3:

sqlite3 credentials.db "SELECT value FROM credentials"

If you’ve authenticated with a single service account and a single user account, the result will look similar to this:

{
  "client_email": "**redacted sa user**@**redacted project**.iam.gserviceaccount.com",
  "client_id": "**redacted**",
  "private_key": "-----BEGIN PRIVATE KEY-----\n**redacted**\n-----END PRIVATE KEY-----\n",
  "private_key_id": "**redacted**",
  "type": "service_account"
}
{
  "client_id": "32555940559.apps.googleusercontent.com",
  "client_secret": "ZmssLNjJy2998hD4CTg2ejr2",
  "id_token": {
    "at_hash": "**redacted**",
    "aud": "32555940559.apps.googleusercontent.com",
    "azp": "32555940559.apps.googleusercontent.com",
    "email": "**redacted**",
    "email_verified": true,
    "exp": 1608746692,
    "hd": "**redacted**",
    "iat": 1608743092,
    "iss": "https://accounts.google.com",
    "sub": "**redacted**"
  },
  "refresh_token": "**redacted**",
  "revoke_uri": "https://accounts.google.com/o/oauth2/revoke",
  "scopes": [
    "https://www.googleapis.com/auth/compute",
    "https://www.googleapis.com/auth/accounts.reauth",
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/userinfo.email",
    "https://www.googleapis.com/auth/appengine.admin",
    "openid",
    "https://www.googleapis.com/auth/drive"
  ],
  "token_response": {
    "access_token": "**redacted**",
    "expires_in": 3599,
    "id_token": "**redacted**",
    "refresh_token": "**redacted**",
    "scope": "https://www.googleapis.com/auth/appengine.admin https://www.googleapis.com/auth/cloud-platform https://www.googleapis.com/auth/userinfo.email https://www.googleapis.com/auth/compute https://www.googleapis.com/auth/accounts.reauth openid https://www.googleapis.com/auth/drive",
    "token_type": "Bearer"
  },
  "token_uri": "https://www.googleapis.com/oauth2/v4/token",
  "type": "authorized_user",
  "user_agent": "google-cloud-sdk"
}

  • The client ID 32555940559.apps.googleusercontent.com and client secret ZmssLNjJy2998hD4CTg2ejr2 shown here are the client ID and secret of the Google Cloud SDK application, so don’t worry — these are no secrets of mine 😃.
  • I am not sure why token responses appear to be cached for user accounts and not for service accounts 🤷. Maybe it is because many Google APIs accept a self-signed JWT bearer token directly instead of a normal access token, as explained here.
  • The scope https://www.googleapis.com/auth/drive is added to your user credentials when you add the --enable-gdrive-access flag to the gcloud auth login command.
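If you prefer Python over the sqlite3 command line, the same query can be run with the standard library. The sketch below relies only on the credentials table and value column used in the query above, plus the JSON fields visible in the sample output; the function name is made up, and it returns just the credential type and identity so that no secrets end up on screen.

```python
import json
import sqlite3


def list_stored_credentials(db_path):
    """Summarize the credentials stored in a Cloud SDK credentials.db file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT value FROM credentials").fetchall()
    finally:
        conn.close()

    summaries = []
    for (value,) in rows:
        cred = json.loads(value)
        # Service accounts carry client_email; user accounts carry an id_token with an email.
        identity = cred.get("client_email") or (cred.get("id_token") or {}).get("email")
        summaries.append({"type": cred.get("type"), "identity": identity})
    return summaries
```

Note that sqlite3.connect creates an empty database when the path does not exist, so double-check the path before running this against your configuration folder.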

That’s it! If there are any other interesting facts about the Cloud SDK CLI applications that you think are missing, be sure to share them in the comments!
