Building Your Own Google Drive Webhook for Instant Notifications (so you don’t have to refresh your inbox every 5 seconds)
First off, let’s just admit that APIs are eating the digital world! We can now integrate almost anything with everything else, as long as access control is securely configured. On the Docker data team, we are constantly testing best practices for automating business reporting processes. We rely on Segment for most of our data sources, and it provides sufficient coverage of our needs.
There are a few sources, however, for which we need to build native pipelines for event ingestion. For instance, the Marketing team stores most of its planning and budgeting data in Google Sheets. We could write a script that makes hourly API calls to read the sheet, but a better approach is to set up an incoming webhook that notifies our pipeline whenever the sheet changes, so the pipeline can pick up those changes.
The Google Drive API allows you to watch for changes on source files such as Google Sheets, Google Docs, or Google Slides, and the push notifications can be built into a web app. The official docs can be found here: https://developers.google.com/drive/api/v3/push. What I’m about to show you is what to do if you are not building an app (and therefore have no hosted domain or website). We can, however, use the HTTP Functions provided by Google Cloud Platform. The Cloud Function serves as the webhook callback receiver.
The Google Drive API lets us identify three types of changes in the drive: a change entry after a file moves to a shared drive, a change entry for individual items in a shared drive, and a change entry for lost access permissions. To retrieve change logs, we could set up push notifications and then pass a page_token to the changes.list() method. In our case, we want to set up push notifications by using a watch() method: changes.watch() covers the whole drive, while the per-file equivalent, files.watch(), is the one called in Step 5. A watch() call doesn’t return details about each change; it only makes Drive send a webhook notification to our listening Cloud Function. It only takes 5 steps to set this up, so let’s dive in!
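To make the relationship between the page token and changes.list() more concrete, here is a minimal sketch against the Drive v3 REST endpoints. The helper names get_start_page_token and list_changes_since, and the injected http_get callable, are my own illustration rather than part of any Google library; the assumption is that http_get behaves like requests.get.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

DRIVE_API = "https://www.googleapis.com/drive/v3"


def get_start_page_token(http_get: Callable[..., Any], access_token: str) -> str:
    """Ask Drive for the token that marks 'now' in the change log."""
    headers = {"Authorization": f"Bearer {access_token}"}
    resp = http_get(f"{DRIVE_API}/changes/startPageToken", headers=headers, params={})
    return resp.json()["startPageToken"]


def list_changes_since(
    http_get: Callable[..., Any], access_token: str, page_token: str
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Page through changes.list() starting from a previously saved token.

    Returns the collected changes plus the token to save for the next run.
    """
    headers = {"Authorization": f"Bearer {access_token}"}
    changes: List[Dict[str, Any]] = []
    new_start_token: Optional[str] = None
    token: Optional[str] = page_token
    while token:
        page = http_get(
            f"{DRIVE_API}/changes", headers=headers, params={"pageToken": token}
        ).json()
        changes.extend(page.get("changes", []))
        # The last page carries newStartPageToken instead of nextPageToken.
        new_start_token = page.get("newStartPageToken", new_start_token)
        token = page.get("nextPageToken")
    return changes, new_start_token
```

In practice you would store the start page token once, and call list_changes_since(requests.get, token, saved_token) each time a notification arrives.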
Step 1: Set up Cloud Function and verify URL domain on Google Search Console.
a. Create an HTTP Function by logging in to the GCP console and selecting Cloud Functions:
b. Set your function trigger to HTTP, and change the name as you like, but keep note of the URL that’s provided by Cloud Functions:
c. Deploy this HTTP Function for now. We will come back and update the function once we get the HTML tag. Head over to the Google Search Console to get the HTML tag for our HTTP Function.
Step 2: Verify ownership of the URL domain (provided by HTTP Function).
a. The Google Search Console will verify that you own the URL that was generated by Cloud Functions in the previous step. Head over here: https://search.google.com/search-console/about and click Start Now.
b. Go to the main menu and click Search property in the left corner; it will allow you to Add property:
c. You’ll be asked to select the property type for your application. In this case, select URL Prefix:
d. Now we need the URL that the HTTP Function provided when we first set it up. Go back to the GCP console, locate Cloud Functions, and go into the HTTP Function page. Under the Trigger tab you’ll find the URL:
e. Copy this URL and go back to the previous Google Search Console page. Paste it under the URL Prefix option that we were previously on. Click to open the HTML tag option. You’ll see a meta tag of the form <meta name="google-site-verification" content="..." />.
f. Do not verify yet. Copy this meta tag and paste it into our HTTP Function: hard-code the meta tag into an HTML string and return it from the main function that is being called.
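As a rough sketch of step f, the function body could look like the following. The content value is a placeholder that you would replace with the exact tag Search Console gives you, and the entry point name main is an assumption that has to match whatever you set as the function’s entry point.

```python
# Placeholder: paste the exact meta tag that Google Search Console shows you.
VERIFICATION_META_TAG = (
    '<meta name="google-site-verification" content="YOUR_VERIFICATION_CODE" />'
)


def main(request):
    """HTTP Cloud Function entry point.

    Returns a tiny HTML page whose <head> carries the verification tag,
    so Search Console can find it when it fetches the function URL.
    """
    return (
        "<!DOCTYPE html>"
        "<html><head>"
        f"{VERIFICATION_META_TAG}"
        "<title>Drive webhook</title>"
        "</head><body>OK</body></html>"
    )
```

The request argument is unused here because the page is the same for every caller; it is only present because HTTP functions receive it.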
g. Deploy your HTTP Function. It takes a few moments. Once it’s done, we will see it in the console.
h. Once the HTTP Function has been successfully deployed, go back to Google Search Console and hit Verify. You should receive a message indicating a successful verification of the HTTP Function’s URL.
Step 3. Register domain.
a. Back in the GCP console, this time we will use Google APIs and Services. Select Domain verification under APIs and Services:
b. Hit Add Domain, and we are now ready to configure the webhook that will be coming from the Google Drive API:
c. Pass in the same URL from our HTTP Function, and click ADD DOMAIN. And congratulations, you’ve just verified your HTTP Function!
Note: here are more docs from Google Search Console on setting up verification for domains: https://support.google.com/webmasters/answer/9008080?visit_id=637080699819747288-3499575662&rd=1
Step 4. Getting credentials and access token.
You’ll need a service account for your app to be granted access to Google Drive. Simply go to the Service Accounts page on the GCP Console and create one. Keep its key file out of any commit you make to GitHub repos. Here’s a Python snippet that uses oauth2client’s ServiceAccountCredentials to generate an access token from the service account credential JSON file.
from oauth2client.service_account import ServiceAccountCredentials

# Load the service account key file (keep this file out of version control)
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    'gsuite_svc_acc.json', scopes=['https://www.googleapis.com/auth/drive.file'])

# get_access_token() returns an object with access_token and expires_in fields
access_token_info = credentials.get_access_token()
print(access_token_info.access_token)
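One way to keep the key file out of your repo is to read its location from an environment variable instead of hard-coding the file name. This is only a sketch: the helper key_file_path is my own name, and GOOGLE_APPLICATION_CREDENTIALS is used here simply because it is the variable Google’s own client libraries look for.

```python
import os


def key_file_path(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> str:
    """Return the service account key path stored in an environment variable."""
    path = os.environ.get(env_var)
    if not path:
        # Fail loudly rather than falling back to a file inside the repo.
        raise RuntimeError(
            f"Set {env_var} to the path of your service account key file"
        )
    return path
```

You could then pass key_file_path() to from_json_keyfile_name() in place of the literal file name.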
Step 5. Making watch requests to the resources that you want to receive push notifications from.
In the Google Drive docs, an example is provided:
POST https://www.googleapis.com/drive/v3/files/fileId/watch
Authorization: Bearer auth_token_for_current_user
Content-Type: application/json
Here’s my Python code that makes the same POST request:
import json
import uuid

import requests

# Randomly generated ID that identifies this notification channel
channel_id = str(uuid.uuid4())

# Drive scopes, for reference; the token from Step 4 must carry a suitable one
SCOPES = ['https://www.googleapis.com/auth/drive.file',
          'https://www.googleapis.com/auth/drive.readonly',
          'https://www.googleapis.com/auth/drive']

# Access token from Step 4
token = access_token_info.access_token

header = {
    'Authorization': f'Bearer {token}',
    'Content-Type': 'application/json'
}

body = {
    "id": channel_id,
    "type": "web_hook",
    "address": f'{your_http_function_url}'  # placeholder: the URL verified in Steps 1-3
}

file_id = f'{your_google_file_id}'  # placeholder: ID of the Drive file to watch

# files.watch lives under /files/ (plural); pageToken belongs to changes.watch, so it is dropped
r = requests.post(url=f'https://www.googleapis.com/drive/v3/files/{file_id}/watch',
                  data=json.dumps(body), headers=header)
You’ll receive an HTTP 200 OK status code if this all worked well. A sample response body from the Google Drive docs looks like this:
{
  "kind": "api#channel",
  "id": "01234567-89ab-cdef-0123456789ab", // ID you specified for this channel.
  "resourceId": "o3hgv1538sdjfh", // ID of the watched resource.
  "resourceUri": "https://www.googleapis.com/drive/v3/files/o3hgv1538sdjfh", // Version-specific ID of the watched resource.
  "token": "target=myApp-myFilesChannelDest", // Present only if one was provided.
  "expiration": 1426325213000 // Actual expiration time as Unix timestamp (in ms), if applicable.
}
… (drumroll) and you’ve got your Cloud Function hooked up to the Google Drive files that you want notifications from! From here onward, you can monitor these metadata changes by sending the webhook messages to your team’s Slack channel, customize these messages (check out my other story about how to send customized bot messages from GCP StackDriver logs), and extract the change logs or content changes, or simply make an API call to extract the entire file again.
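Drive push notifications carry their metadata in X-Goog-* request headers rather than in the body, so the Cloud Function mostly needs to read headers. Below is a minimal sketch of that step; the function names parse_drive_notification and slack_payload and the wording of the Slack text are my own assumptions, and the sketch stops short of actually posting to Slack.

```python
from typing import Dict, Mapping, Optional


def parse_drive_notification(headers: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Pull the Drive push-notification fields out of the request headers.

    Flask's request.headers matches names case-insensitively; a plain dict
    would need these exact spellings.
    """
    return {
        "channel_id": headers.get("X-Goog-Channel-ID"),
        "resource_id": headers.get("X-Goog-Resource-ID"),
        "resource_state": headers.get("X-Goog-Resource-State"),
        "changed": headers.get("X-Goog-Changed"),
        "message_number": headers.get("X-Goog-Message-Number"),
    }


def slack_payload(notification: Dict[str, Optional[str]]) -> Optional[Dict[str, str]]:
    """Build a Slack incoming-webhook payload, or None if there is nothing to report."""
    state = notification["resource_state"]
    # "sync" is the one-off message Drive sends when the channel is created.
    if state == "sync":
        return None
    changed = notification["changed"] or "unspecified"
    return {
        "text": (
            f"Drive notification: state={state}, changed={changed}, "
            f"channel={notification['channel_id']}"
        )
    }
```

Inside the Cloud Function you could call parse_drive_notification(request.headers) and, when slack_payload() returns a dict, POST it as JSON to your Slack incoming webhook URL.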
*Bonus Step: If your app uses a service account to access G Suite data, a G Suite admin can help by delegating domain-wide authority to the service account. Follow these 7 steps, and don’t forget to Enable G Suite Domain-wide Delegation.
Now that your app is ready to listen to these webhooks notifying you of file changes on Google Drive, you can kick back, make some chai, and wait for your Slack messages!
Thanks for reading. Happy hacking!