Building a custom Google Data Studio connector from A-Z. Part 2 — OAuth, calling the API & caching.

In the previous article I showed you how to set up a simple connector and feed it with some mocked data. This time we’ll go further — you’ll learn how to authorize the connector to perform requests to an API that uses OAuth, then fetch the data and cache it to reduce API quota usage and speed up the connector.

Full code for this part can be found at https://github.com/Bajena/spotify-gds-connector/tree/Tutorial2

1. OAuth authorization flow

Spotify, like many popular services, uses the OAuth standard to authorize API calls. In order for our connector to work with Spotify’s API we need to perform a few steps.

Create a Spotify app

First we need to create an application in Spotify’s developer dashboard — this way we’ll get the client ID and client secret required for the connector. Also, a cool thing about Spotify’s dashboard is that it lets you view stats about who is using your application and how intensively.

Spotify application dashboard

Set correct auth type in the connector

In the previous part I instructed you to set the auth type to "NONE" temporarily, as we didn’t require any authorization to test our connector. Now we need to modify the getAuthType function to the following:

function getAuthType() {
  return { type: "OAUTH2" };
}

Add the OAuth2 Google Script library

The best and easiest way to handle the OAuth flow is to add the excellent apps-script-oauth2 library to our Google script.

Adding a library is very simple — go to Resources -> Libraries and add a library with the following ID: 1B7FSrk5Zi6L1rSxxTDgDEUsPzlukDsi4KGuTMorsTQHhGBzBkMun4iDF

Now our script will have access to an OAuth2 object that encapsulates the whole OAuth implementation.

Add required connector functions

Great, so the library is going to do the heavy lifting for us, but we still need to add a few functions required by Google to perform the OAuth2 flow. These functions will mostly just call the library with parameters specific to the Spotify API.

The required functions are:

  • get3PAuthorizationUrls — returns the URL the user visits to authorize the connector with Spotify.
  • authCallback — handles the redirect back from Spotify and exchanges the authorization code for API tokens.
  • isAuthValid — checks whether we already hold a valid API token; the library takes care of refreshing an expired one.
  • resetAuth — clears the stored credentials, e.g. when we want to log in as another Spotify user.

The function reference and the official Google tutorial can be found here:

Once I got familiar with the OAuth library and the required functions, I just needed to apply that knowledge to my specific needs. I followed the authorization guide on Spotify’s API page:

Basically all the info I needed was:

I composed all the necessary code in the Oauth.gs file. Check it out, I think it’s pretty much self-explanatory.
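As a rough sketch (the real code lives in the repository), Oauth.gs can look something like this. The client ID/secret placeholders and the exact scope are my assumptions, not necessarily the repository’s values:

```javascript
// Sketch of Oauth.gs — assumes the apps-script-oauth2 library is available
// under the OAuth2 identifier. CLIENT_ID / CLIENT_SECRET are placeholders
// for the values from your Spotify app dashboard.
var CLIENT_ID = "YOUR_SPOTIFY_CLIENT_ID";
var CLIENT_SECRET = "YOUR_SPOTIFY_CLIENT_SECRET";

function getOAuthService() {
  return OAuth2.createService("spotify")
    .setAuthorizationBaseUrl("https://accounts.spotify.com/authorize")
    .setTokenUrl("https://accounts.spotify.com/api/token")
    .setClientId(CLIENT_ID)
    .setClientSecret(CLIENT_SECRET)
    .setPropertyStore(PropertiesService.getUserProperties())
    .setCallbackFunction("authCallback")
    .setScope("user-read-recently-played");
}

// Required by Data Studio: the URL the user visits to authorize.
function get3PAuthorizationUrls() {
  return getOAuthService().getAuthorizationUrl();
}

// Required by Data Studio: do we hold a usable token?
function isAuthValid() {
  return getOAuthService().hasAccess();
}

// Completes the flow when Spotify redirects back to the script.
function authCallback(request) {
  var authorized = getOAuthService().handleCallback(request);
  return HtmlService.createHtmlOutput(
    authorized ? "Success! You can close this tab." : "Denied."
  );
}

// Required by Data Studio: clears stored credentials.
function resetAuth() {
  getOAuthService().reset();
}
```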

Test the flow

That’s it — go ahead and try creating a new Google Data Studio report now. You’ll be prompted to authorize — log in with your Spotify credentials and boom, you should already be able to create the report :)

2. Fetching API data

We’re now authorized to access the API, so let’s get our recent plays using the recently played endpoint (https://developer.spotify.com/console/get-recently-played).

The function we’re going to write will fetch all pages of data in the selected timeframe. So for a given startDate and endDate we’ll do the following:

  1. Call https://api.spotify.com/v1/me/player/recently-played?before=endDate using Google’s built-in UrlFetchApp service (https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app).
  2. Add the items with played_at greater than startDate to the results list. If any of the items has played_at smaller than startDate, stop.
  3. If there are more items available, the response will include a next link. We’ll use it to fetch the next page, or stop if it’s not present.

Check out the following gist for the full code:
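Roughly, the paging logic described above can be sketched like this. The function name and the 50-item page size are assumptions, and it relies on a getOAuthService() helper from Oauth.gs:

```javascript
// Hypothetical sketch of the paging logic; the full code is in the repository.
var RECENTLY_PLAYED_URL = "https://api.spotify.com/v1/me/player/recently-played";

function fetchRecentlyPlayed(startDate, endDate) {
  var items = [];
  var url = RECENTLY_PLAYED_URL + "?limit=50&before=" + endDate.getTime();

  while (url) {
    var response = UrlFetchApp.fetch(url, {
      headers: { Authorization: "Bearer " + getOAuthService().getAccessToken() }
    });
    var body = JSON.parse(response.getContentText());

    for (var i = 0; i < body.items.length; i++) {
      // Items come newest-first; stop once we cross the start of the timeframe.
      if (new Date(body.items[i].played_at) < startDate) return items;
      items.push(body.items[i]);
    }

    url = body.next; // absent on the last page
  }

  return items;
}
```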

Test it!

The code we have is enough to check the real data, so let’s see how it works:

My recent plays listed

As you can see, we received a list of recent plays. Unfortunately the endpoint is currently limited to the 50 most recent tracks, but if it supported more, our code would handle that too, as the pagination is already in place.

Note that this code covers only the happy path, so if any of the requests crashed in the middle, our connector would blow up with an error :D

3. Data caching

After a bit of playing with GDS you’ll notice that whenever you change the report, e.g. to display different dimensions or metrics in the table, it sends another request to our connector.

In our case that’s pretty OK — not many Spotify API requests are needed to display the report, so there’s a rather small chance of hitting the rate limit, and the reports render in a reasonable time. Some connectors, however, like the one we built at Leadfeeder, require loads of data, so hitting the API every time we add a new column to the table is not acceptable.

Luckily Google Scripts are cool enough to provide a caching service that’ll allow us to decrease the number of requests to the API and speed up the rendering.

Let’s start with a simple approach — we’ll try to store the items in the cache using CacheService.getUserCache().

Let’s create the following service and place it in DataCache.js:
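A minimal version of such a service could look like this. The class, key format, and TTL are my assumptions, not necessarily the ones from the repository:

```javascript
// Hypothetical sketch of a naive single-key DataCache.
var REQUEST_CACHING_TIME_SECONDS = 3600; // keep entries for 1 hour (assumed TTL)

function DataCache(cacheService, startDate, endDate) {
  this.service = cacheService; // e.g. CacheService.getUserCache()
  this.cacheKey = "plays_" + startDate + "_" + endDate;
}

// Returns the cached plays, or null on a cache miss.
DataCache.prototype.get = function () {
  var stored = this.service.get(this.cacheKey);
  return stored ? JSON.parse(stored) : null;
};

// Serializes the plays and stores them under a single key.
DataCache.prototype.set = function (plays) {
  this.service.put(
    this.cacheKey,
    JSON.stringify(plays),
    REQUEST_CACHING_TIME_SECONDS
  );
};
```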

We’ll also change the main connector code to first try fetching data from the cache:
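The cache-first lookup itself can be sketched like this (getPlays and fetchRecentlyPlayed are hypothetical names for the connector’s data-fetching code; the real version is in the repository):

```javascript
// Hypothetical cache-first wrapper around the API call.
function getPlays(startDate, endDate) {
  var cache = new DataCache(CacheService.getUserCache(), startDate, endDate);
  var plays = cache.get(); // null on a cache miss

  if (!plays) {
    plays = fetchRecentlyPlayed(startDate, endDate); // hit the Spotify API
    cache.set(plays); // store for subsequent connector requests
  }

  return plays;
}
```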

Ok, let’s see our log output and check if the data got cached correctly:

Naive approach doesn’t work :(

Unfortunately, the data is too big to store under one cache key. It turns out that the maximum size of a cached value is 100KB, so we’ll have to come up with another idea.

The solution I came up with simply cuts the data into 100KB chunks and puts them under separate keys. On reads, the connector iterates over the chunks and stops when the chunk under a given index is missing or its content is empty. Check out the listing below.
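A hedged sketch of such a chunked cache follows — names and the exact chunk size are my assumptions; the real listing lives in the repository:

```javascript
// Hypothetical chunked DataCache: the serialized JSON is split into pieces
// that fit under the ~100KB per-key limit and stored under indexed keys.
var MAX_CHUNK_LENGTH = 100 * 1024; // stay at the 100KB per-key limit
var REQUEST_CACHING_TIME_SECONDS = 3600; // assumed TTL

function DataCache(cacheService, startDate, endDate) {
  this.service = cacheService;
  this.cacheKey = "plays_" + startDate + "_" + endDate;
}

DataCache.prototype.set = function (plays) {
  var json = JSON.stringify(plays);
  for (var i = 0; i * MAX_CHUNK_LENGTH < json.length; i++) {
    var chunk = json.substring(i * MAX_CHUNK_LENGTH, (i + 1) * MAX_CHUNK_LENGTH);
    this.service.put(this.cacheKey + "_" + i, chunk, REQUEST_CACHING_TIME_SECONDS);
  }
};

DataCache.prototype.get = function () {
  var json = "";
  for (var i = 0; ; i++) {
    var chunk = this.service.get(this.cacheKey + "_" + i);
    if (!chunk) break; // missing or empty chunk => end of data
    json += chunk;
  }
  return json.length > 0 ? JSON.parse(json) : null;
};
```

The read loop works because chunks are written with consecutive indices, so the first missing key marks the end of the stored data.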

Check out the logs again. It works now! First the data is fetched from the API and stored in the cache; on the next connector request it’s fetched from the cache.

Logs of the working caching mechanism

4. Summary

That’s it, we now have an example of a full-blown connector able to authorize using the OAuth flow, fetch data from a JSON REST API, and speed everything up with a custom caching mechanism.

Are we there yet? Some may say yes, but an ambitious programmer will tell you what’s missing: a good piece of software should also include unit tests and a CI setup that runs the tests for us and deploys the connector to Google’s servers.

In the real world we would write the tests before the code itself. However, I decided to postpone testing and describe it in part 3.