Google Drive Policy Alerting And Remediation v2
Speed and scale with Go multi-threading
This article picks up where the earlier Apps Script article left off.
To recap, the intent is to detect, notify of, and remediate folder and file permissions which are out of policy.
- It works along Forseti lines by allowing you to define desired sharing policy by folder and domain and then reconciling against this; policy is inherited downwards to match Drive’s inheritance.
- Configuration switches allow you to remove permissions or just notify.
The original Apps Script tool has three key limitations:
- It reports false positives: a file with multiple ancestors is flagged as out of policy even when its sharing is permissible on one of the branches.
- Policy is defined in code, which makes it difficult to maintain, particularly for a team.
- It is slow, so it exceeds Apps Script’s execution time limits for large folder structures.
The first limitation is addressed by only validating once all permissions have been accumulated, rather than along each branch.
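As a minimal sketch of that idea (not the utility's actual code), the functions below union the permitted domains over every ancestor branch first and only then validate. The map shapes and the names allowedDomains and violations are assumptions made for illustration.

```go
package main

// allowedDomains walks every ancestor branch of id and records each domain
// that policy permits anywhere along the way. Both maps are hypothetical
// in-memory shapes: parents maps an item to its parent folder IDs, and
// policy maps a folder ID to the domains it may be shared with.
func allowedDomains(id string, parents, policy map[string][]string, allowed map[string]bool) {
	for _, d := range policy[id] {
		allowed[d] = true
	}
	for _, p := range parents[id] {
		allowedDomains(p, parents, policy, allowed)
	}
}

// violations validates only after all branches have been accumulated, so a
// share that is permitted on any one branch is not reported.
func violations(id string, sharedWith []string, parents, policy map[string][]string) []string {
	allowed := map[string]bool{}
	allowedDomains(id, parents, policy, allowed)
	var bad []string
	for _, d := range sharedWith {
		if !allowed[d] {
			bad = append(bad, d)
		}
	}
	return bad
}
```

For a file that sits in both folderA (policy: example.com) and folderB (policy: partner.com), a share to partner.com passes, whereas validating along the folderA branch alone would have flagged it.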
The second limitation is addressed by reading policy from a Google Spreadsheet.
The third limitation is addressed by Go; as with the load testing utility, goroutines’ concurrency is a huge advantage here: it allows the code to navigate all branches of the folder hierarchy in parallel, reducing execution time from half an hour to tens of seconds.
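A rough sketch of the parallel traversal, assuming an in-memory map stands in for the Drive calls that list a folder's children: each folder is visited in its own goroutine, and a sync.WaitGroup tells the caller when every branch has finished. The name walk and the children map are hypothetical.

```go
package main

import (
	"sort"
	"sync"
)

// walk visits root and every descendant, starting one goroutine per folder.
// children is a stand-in for the API call that lists a folder's children;
// it is only read, never written, so concurrent access is safe.
func walk(root string, children map[string][]string) []string {
	var (
		mu      sync.Mutex // guards visited
		wg      sync.WaitGroup
		visited []string
	)
	var visit func(id string)
	visit = func(id string) {
		defer wg.Done()
		mu.Lock()
		visited = append(visited, id)
		mu.Unlock()
		for _, child := range children[id] {
			wg.Add(1) // register the child before its goroutine starts
			go visit(child)
		}
	}
	wg.Add(1)
	go visit(root)
	wg.Wait()
	sort.Strings(visited) // goroutine order is not deterministic
	return visited
}
```

Because each child is registered with wg.Add before the parent goroutine returns, the counter cannot reach zero while work is still pending.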
However, a lot of complexity which is abstracted by Apps Script must be addressed explicitly in Go. This article describes some of the lessons learned along the way; you can go directly to the code here.
As always, please test to ensure this behaves as you expect; that goes double for this tool, as you’ll read in the Drive section below.
Apps Script automatically discovers scopes and takes you through the appropriate authorization dialogue; the Go libraries abstract the user interaction and authorization flow, but you still need to explicitly define the scopes yourself and invoke authorization with the appropriate credentials.
Google Cloud doesn’t provide the ability to grant G Suite scopes (e.g. Drive) to a service account, so you either need admin privileges on the domain to grant domain-wide delegation of authority to a service account, or need to authorize as a user via 3-legged OAuth as shown in the quickstart. Since this tool is targeted at departmental users, it uses 3-legged OAuth.
Apps Script also automatically provides Stackdriver logging; you need to implement this yourself in Go. You can grant Stackdriver scope to a service account as described in the documentation, but that would require managing multiple credentials: service account credentials for Stackdriver, and OAuth credentials for the Google Drive API. Leveraging the OAuth flow for Stackdriver auth, as described in Junpei Kawamoto’s post, allows you to consolidate on a single credential. You still need the project id to authorize Stackdriver; rather than hard-coding it, you can extract it from the OAuth credential using the OAuth package’s tokenFromFile() function. You need to include a copy of that function, since it isn’t exported.
So the auth code looks a bit like this:
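The complete flow depends on the Google client libraries. As a dependency-free sketch of just one step, obtaining the project id, the function below parses it out of the OAuth client credentials JSON. It assumes the file is of the "installed" application type and carries a project_id field; clientCredentials and projectID are hypothetical names rather than the utility's own.

```go
package main

import (
	"encoding/json"
	"errors"
)

// clientCredentials mirrors only the part of the OAuth client credentials
// file that this sketch needs (assumed "installed" application type).
type clientCredentials struct {
	Installed struct {
		ProjectID string `json:"project_id"`
	} `json:"installed"`
}

// projectID returns the project id found in the credentials JSON, so it
// does not have to be hard-coded for the logging client.
func projectID(credentialsJSON []byte) (string, error) {
	var c clientCredentials
	if err := json.Unmarshal(credentialsJSON, &c); err != nil {
		return "", err
	}
	if c.Installed.ProjectID == "" {
		return "", errors.New(`no project_id found under "installed"`)
	}
	return c.Installed.ProjectID, nil
}
```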
Let’s talk about least privilege for a moment. By default the utility requests only the scopes it needs, i.e. Spreadsheets and Drive read scopes and the logging write scope. Only if you specify the flags to fix permissions or send notifications does it request the scopes for those operations.
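One way to sketch that scope selection is shown below. The scope URLs are Google's published ones, but which scopes the utility actually requests for fixing and notifying is an assumption here (full Drive access and Gmail send, respectively), as are the function and parameter names.

```go
package main

// Published Google OAuth scope URLs.
const (
	driveReadonly  = "https://www.googleapis.com/auth/drive.readonly"
	driveFull      = "https://www.googleapis.com/auth/drive"
	sheetsReadonly = "https://www.googleapis.com/auth/spreadsheets.readonly"
	loggingWrite   = "https://www.googleapis.com/auth/logging.write"
	gmailSend      = "https://www.googleapis.com/auth/gmail.send"
)

// scopesFor returns the smallest scope set for the requested behaviour.
// fix and notify are hypothetical stand-ins for the command line flags.
func scopesFor(fix, notify bool) []string {
	scopes := []string{sheetsReadonly, loggingWrite}
	if fix {
		// Assumption: removing permissions needs write access to Drive.
		scopes = append(scopes, driveFull)
	} else {
		scopes = append(scopes, driveReadonly)
	}
	if notify {
		// Assumption: notifications are sent through the Gmail API.
		scopes = append(scopes, gmailSend)
	}
	return scopes
}
```

The resulting slice can be passed as the variadic scope argument when building the OAuth configuration.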
As described in Go Blog’s error handling section, the error type is an interface type, and the most commonly-used error implementation is the errors package’s unexported errorString type.
However, you’ll notice that for Google API errors, the error string includes an error code. This code is helpful for error handling; you could parse the string for the error code, but that’s inelegant and fragile; instead, you want to assert the right interface type. The key here is in the other part of the error message: “googleapi: Error”.
You’ll want to import the google.golang.org/api/googleapi package and assert the appropriate type. You can either inspect the package, or discover the struct with a utility like Examiner (which is also an illuminating example of reflection); you then assert the *googleapi.Error type in your error handling and read its Code field.
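With the real package the assertion is err.(*googleapi.Error), after which the Code field holds the HTTP status. So that the sketch below runs without external dependencies, it uses a local stand-in type, apiError, with the same Code field; apiError and statusCode are hypothetical names, and the message format is only loosely modelled on the one quoted above.

```go
package main

import "fmt"

// apiError is a local stand-in for googleapi.Error, reduced to two fields.
type apiError struct {
	Code    int
	Message string
}

// Error makes *apiError satisfy the error interface.
func (e *apiError) Error() string {
	return fmt.Sprintf("googleapi: Error %d: %s", e.Code, e.Message)
}

// statusCode reports the status code carried by err, if err is an *apiError.
// The comma-ok form of the type assertion avoids a panic for other errors.
func statusCode(err error) (int, bool) {
	if apiErr, ok := err.(*apiError); ok {
		return apiErr.Code, true
	}
	return 0, false
}
```

A caller can then branch on the code, for example treating 403 as a quota or permission problem and 404 as a file that has since been deleted, without parsing the error string.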
Thanks to RayfenWindspear for the assist with this…
Drive
The various Drive API versions differ in meaningful ways. The methods and the response schema (field labels, and their level in the hierarchy) differ; most meaningfully for this utility, the actual permissions results returned differ.
- The v2 API returns permissions information for files for which the user running the utility is a reader; the v3 API only returns this information if the user is an editor.
- The v2 API always populates the domain field; in the case of shares to individual users this is the domain of the user’s email address. The v3 API only populates the domain for domain shares; you need to parse the user’s email address to determine the domain in the case of shares to individual users.
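If you do move to v3, deriving the domain for an individual share could look like the sketch below; domainOf is a hypothetical helper, and it simply takes everything after the last @ sign.

```go
package main

import "strings"

// domainOf returns the lower-cased domain part of an email address,
// or an empty string if the address has no domain part.
func domainOf(email string) string {
	at := strings.LastIndex(email, "@")
	if at < 0 || at == len(email)-1 {
		return ""
	}
	return strings.ToLower(email[at+1:])
}
```

Lower-casing keeps the comparison against policy domains case-insensitive.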
The utility uses the v2 API; you may wish to update to the v3 API by commenting/uncommenting the relevant lines. This brings up an interesting point: because of Go’s strict compile-time checking, you can’t encapsulate the API calls for the different versions in if statements which check a global variable.
In addition, the APIs’ behaviours are evolving, so you’ll want to periodically regression test your utility against a known set of permissions.
Drive quotas can also be a source of fun: the throughput of concurrent goroutines exceeds the default Drive API quota for large folder structures. The default quota is described as 1,000 queries per 100 seconds per user, but it appears to be enforced over shorter timeframes, since without a quota increase the utility exceeds quota with fewer than 200 queries. Exponential backoff just increases load for batch utilities with constant load, so the utility implements a command line wait flag which sleeps each goroutine for the specified number of seconds.
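A minimal sketch of that wait, assuming the sleep happens at the start of each goroutine, before it lists a folder: crawl, walkFolder, waitSeconds and the list callback are hypothetical names, and list stands in for the Drive API call.

```go
package main

import (
	"sync"
	"time"
)

// walkFolder sleeps for the configured wait, lists one folder, then starts
// a goroutine per child. list is a stand-in for the Drive API call and
// must be safe to call from several goroutines at once.
func walkFolder(id string, waitSeconds int, list func(id string) []string, wg *sync.WaitGroup) {
	defer wg.Done()
	time.Sleep(time.Duration(waitSeconds) * time.Second) // pause before the API call
	for _, child := range list(id) {
		wg.Add(1)
		go walkFolder(child, waitSeconds, list, wg)
	}
}

// crawl walks the whole tree under root and returns once every goroutine
// has finished.
func crawl(root string, waitSeconds int, list func(id string) []string) {
	var wg sync.WaitGroup
	wg.Add(1)
	go walkFolder(root, waitSeconds, list, &wg)
	wg.Wait()
}
```

Note that this spaces out successive levels of the tree rather than enforcing a strict request rate, so a very wide level can still produce a burst.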
This is probably sounding familiar by now: unlike in Apps Script, you need to roll your own email notifications in Go. You can use Go’s HTML templates for a result similar to that of Apps Script, but you still need to do a lot of the actual mail composition mechanics yourself. Mohamed Labouardy’s post provides a good introduction.
Your mail function will look something like this:
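As a minimal sketch rather than the utility's actual function, the code below renders an HTML body with html/template and prepends the MIME headers by hand. The template text, the header set and the name buildMessage are assumptions; how the message is then sent (SMTP or an API) is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// bodyTemplate is a hypothetical notification body listing file names.
const bodyTemplate = `<p>Out-of-policy files:</p><ul>{{range .}}<li>{{.}}</li>{{end}}</ul>`

// buildMessage returns a raw HTML email: headers, a blank line, then the
// rendered template. html/template escapes the file names for HTML.
func buildMessage(from, to, subject string, files []string) ([]byte, error) {
	tmpl, err := template.New("body").Parse(bodyTemplate)
	if err != nil {
		return nil, err
	}
	var body bytes.Buffer
	if err := tmpl.Execute(&body, files); err != nil {
		return nil, err
	}
	headers := fmt.Sprintf("From: %s\r\nTo: %s\r\nSubject: %s\r\n"+
		"MIME-Version: 1.0\r\nContent-Type: text/html; charset=\"UTF-8\"\r\n\r\n",
		from, to, subject)
	// The trailing "..." expands the body slice into append's variadic
	// parameter, one byte per argument.
	return append([]byte(headers), body.Bytes()...), nil
}
```

The returned bytes are in the form the standard library's smtp.SendMail expects for its msg argument.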
One of the interesting things you’ll notice here is the calling syntax for variadic functions.
If you’re using cron, you need to use a shell script to change to the appropriate directory before running the tool; otherwise the tool doesn’t read the token file properly, even if you hard-code its path (the credential file doesn’t have this problem).
So in crontab -e you’d have something like the following to run your bash script every day at midnight and route console output to a log file.
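For illustration only, with hypothetical paths and file names (run.sh, cron.log and the drive-policy directory and binary are all assumptions), the crontab entry might be:

```
0 0 * * * /home/you/drive-policy/run.sh >> /home/you/drive-policy/cron.log 2>&1
```

and the wrapper script it calls:

```
#!/bin/bash
# Change to the tool's directory so the token file is found.
cd /home/you/drive-policy || exit 1
./drive-policy
```

The five fields 0 0 * * * mean minute 0 of hour 0 on every day, and 2>&1 sends error output to the same log file as standard output.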
Structs and Reflection
This utility uses the mow.cli command line package, which assigns flag values to string pointers. Displaying these in the notification email requires a way of iterating over them; one way to achieve this is to assign them to a struct and then iterate over the struct. It turns out that iterating over the values of a struct of pointers is non-trivial. This Stack Overflow post provides a helpful example (this post provides a simpler approach, but it requires that the struct fields be exported); because the struct fields are themselves pointers, an extra invocation of Elem() is required.
So the CLI and reflection code looks something like this:
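The sketch below covers only the reflection half and uses the standard library alone; the pointers that mow.cli would hand back are replaced by plain pointer fields. The options struct, its field names and the describe function are hypothetical.

```go
package main

import (
	"reflect"
	"strconv"
)

// options is a hypothetical struct holding the pointers a CLI package
// returns for each flag. The fields are deliberately unexported.
type options struct {
	folder *string
	wait   *int
	fix    *bool
}

// describe returns "name=value" for every non-nil pointer field of the
// struct that opts points to.
func describe(opts interface{}) []string {
	v := reflect.ValueOf(opts)
	if v.Kind() == reflect.Ptr {
		v = v.Elem() // first Elem(): from *options to options
	}
	t := v.Type()
	var out []string
	for i := 0; i < v.NumField(); i++ {
		f := v.Field(i)
		if f.Kind() != reflect.Ptr || f.IsNil() {
			continue
		}
		e := f.Elem() // extra Elem(): from the pointer field to its value
		var val string
		// Kind-specific getters work on unexported fields; Interface() would panic.
		switch e.Kind() {
		case reflect.String:
			val = e.String()
		case reflect.Int:
			val = strconv.FormatInt(e.Int(), 10)
		case reflect.Bool:
			val = strconv.FormatBool(e.Bool())
		default:
			continue
		}
		out = append(out, t.Field(i).Name+"="+val)
	}
	return out
}
```

The resulting strings can be handed straight to the email template.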
Map Key Existence Check
This utility uses loads of maps, which means lots of checking whether a key exists; the idiom is described in this Stack Overflow post.
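The idiom is the two-value ("comma ok") form of a map lookup. As a small sketch, the hypothetical helper below uses it to make sure a file reached through several ancestor branches is only recorded once.

```go
package main

// markSeen records id in seen and reports whether this was its first visit.
// The second value of the map lookup says whether the key exists at all,
// independently of the value stored under it.
func markSeen(seen map[string]bool, id string) bool {
	if _, ok := seen[id]; ok {
		return false // already present
	}
	seen[id] = true
	return true
}
```

A plain seen[id] lookup would return the zero value for a missing key, which is why the explicit existence check matters whenever the zero value is also a legitimate stored value.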
Good extensions of the utility would be to:
- Support policy based on Google Groups as well as domains.
- Support Team Drives.
- Push the notification map to a sheet so you can scan hourly and only notify daily.