Faster ServiceAccount authentication for Google Cloud Platform APIs

A couple of weeks ago I wanted to understand the AccessTokenCredentials flow that certain Google Cloud APIs support.

It’s described here in the addendum of our developers OAuth documentation as a specific optimization over the “normal” OAuth flows providers maintain. The optimization is pretty dramatic, so I thought I’d write an article describing it and how it’s used, and finish with a quick bakeoff to demonstrate its advantage.

First, a quick background on ServiceAccount OAuth flows.

ServiceAccount OAuth2 Flow

A ServiceAccount on GCP can take many shapes, but it normally represents a non-user accessing a system. Think of it as a machine account that requires access to a service. When a system needs to access a GCP service (e.g. Pub/Sub), it needs to acquire an OAuth2 token as described here.

Essentially, a cryptographic private key is issued for a ServiceAccount, and that key is used to sign a JSON Web Token (JWT) which includes some claims and information about the token capabilities (scopes) requested. The JWT is then transmitted to Google, which verifies the claims and the identity of the service account. Once Google verifies the identity, it issues an access_token for the ServiceAccount with the scopes the original JWT requested and returns that token to the client. The client application then has a bearer access_token with which to make the request to the service (Pub/Sub, in this example).

Note that this flow involves a roundtrip to exchange the JWT for an access_token... so what if we could bypass that one roundtrip?

AccessTokenCredential Flow

All right, so how can we optimize the flow above if we already have a crypto key we can sign with? How about we create a JWT whose audience is the service we intend it for, and send that JWT to the service directly? That is the optimized flow this article deals with, and it is the one that saves us the roundtrip.

The golang sample here basically reads the private key and uses it to sign a JWT with a specific aud: field that denotes the service it's intended for.

This flow saves a roundtrip but only applies to specific services within Google Cloud. These services use a different backend system which allows for this abridged flow. The services listed below are the only ones that allow for this:

If you’re interested, the JWT that is signed by the service account uses the aud: field that describes the target service itself:

{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "cc241d179abcbea44d0c69355bab01315a1ea45d"
}.
{
  "iss": "access-token-creds@mineral-minutia-820.iam.gserviceaccount.com",
  "aud": "https://pubsub.googleapis.com/google.pubsub.v1.Publisher",
  "exp": 1535333036,
  "iat": 1535329436,
  "sub": "access-token-creds@mineral-minutia-820.iam.gserviceaccount.com"
}

Example Implementation

If you want to try this sample out, first create a service account and download its JSON private key. Once you do that, grant that service account the Pub/Sub Viewer role in IAM as shown below:

At that point, download the JSON certificate and initialize the client:

git clone \
  https://github.com/salrashid123/gcpsamples.git
cd gcpsamples/auth/service/jwt_access_token/golang/
export GOPATH=`pwd`
go get cloud.google.com/go/pubsub \
  golang.org/x/net/context \
  golang.org/x/oauth2/google \
  google.golang.org/api/iterator \
  google.golang.org/api/option \
  github.com/golang/glog

And then run the sample:

go run src/accesstoken.go \
  -logtostderr=true \
  -v=2 \
  -keyfile service_account.json \
  -project mineral-minutia-820

In the output, response is the latency in milliseconds.

I must note: this whole procedure ONLY applies to the initial acquisition of the access_token. In normal use cases, you can reuse an access_token or even the id_token until it expires (normally after 3600s). That means the latency described below is, for most use cases, paid only when getting the first token.

Bakeoff!

The following compares the abridged flow against the standard ServiceAccount OAuth flow, with the full cert already loaded in both cases. The measure is percentile latency.

I ran each mechanism 100 times, separately, on the same computer (and yah, trust me, the workstation where I ran it had lots of compute and very high network bandwidth to GCP endpoints!).

  • ServiceAccount Flow
╔════════════╦══════════════╗
║ Percentile ║ Latency (ms) ║
╠════════════╬══════════════╣
║ 50         ║ 585          ║
║ 90         ║ 613          ║
║ 95         ║ 637          ║
║ 99         ║ 742          ║
╚════════════╩══════════════╝
  • AccessToken Flow
╔════════════╦══════════════╗
║ Percentile ║ Latency (ms) ║
╠════════════╬══════════════╣
║ 50         ║ 289          ║
║ 90         ║ 442          ║
║ 95         ║ 448          ║
║ 99         ║ 572          ║
╚════════════╩══════════════╝

As you can see, at every percentile the lack of the additional roundtrip makes a difference in getting a token and making the same API call: the median drops from 585 ms to 289 ms, roughly half.

Appendix

Language Support

The following describes various other language bindings for the same abridged flow:

References

Additional references for oauth and service accounts:

You can find the source below or in the git repo I maintain here:

Source:

AccessTokenCredentials:
ServiceAccountCredentials: