Displaying Google Analytics metrics in your README

Custom dynamic repository badges with shields.io and Google Cloud Functions

If you’ve browsed open-source code on GitHub (or other online code hosting services), you’ve probably seen these repository badges before:

TensorFlow build status badges on GitHub
Node.js Request code coverage badges on GitHub
On GitLab…
…and BitBucket as well.

What are repository badges?

The most prevalent form of repository badges is the shields style, represented by the majority of the badges in the screenshots above. There are also other forms of badges, for example the elaborate npm badge in the Request repository.

Repository badges are commonly used to show information about a repository such as continuous integration build status and code coverage, but can also be links to documentation or discussion, show the latest version of a library and how many times it was downloaded, or basically anything you want to display at a glance on your repository.

You can find out more about badges by keeping an eye out for them whenever you’re browsing code, scrolling through the many examples on https://shields.io/, or taking a look at this README describing repository badges that I came across.

Displaying weekly users on a badge

I was interested in badges which show activity over time, such as the number of downloads per week or the number of users online, and wanted to put a badge in my Bus Eta Bot repository to show the number of unique users who used my bot over the past week, a metric I was already tracking with Google Analytics. Here’s what the end result looks like:

51 users this week!

To get started, we can use shields.io to create custom shields-style repository badges.

shields.io

shields.io is a service which generates custom SVG shields-style badges from a URL. A URL of the form:

https://img.shields.io/badge/<SUBJECT>-<STATUS>-<COLOR>.svg

will resolve to a badge with <SUBJECT> on the left and <STATUS> on the right with a <COLOR> background. You can then embed that URL in your README to display the badge on your repository page. For example, https://img.shields.io/badge/godoc-reference-blue.svg generates the badge commonly used to link to documentation on Godoc for Go libraries:

However, putting the shields.io link directly into your README creates a static badge which can only be changed with a new commit. In order to display dynamic information, the badge itself would have to be dynamically generated.
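As an aside, the static URL scheme described above can be built programmatically. The following is a minimal sketch rather than anything provided by shields.io: staticBadgeUrl and escapeBadgeText are hypothetical helper names, and the escaping (a literal dash written as two dashes, a literal underscore as two underscores) follows the static badge rules as I understand them from the shields.io website.

```javascript
"use strict";

// Hypothetical helper (not part of shields.io): escapes text for one
// segment of a static badge URL.
function escapeBadgeText(text) {
  return encodeURIComponent(
    String(text)
      .replace(/-/g, '--') // a literal dash is written as two dashes
      .replace(/_/g, '__') // a literal underscore is written as two underscores
  );
}

// Hypothetical helper: builds a URL of the form
// https://img.shields.io/badge/<SUBJECT>-<STATUS>-<COLOR>.svg
function staticBadgeUrl(subject, status, color) {
  return `https://img.shields.io/badge/${escapeBadgeText(subject)}-${escapeBadgeText(status)}-${color}.svg`;
}

console.log(staticBadgeUrl('godoc', 'reference', 'blue'));
console.log(staticBadgeUrl('weekly users', 51, 'yellow'));
```

Escaping matters because dashes separate the three parts of the URL, so an unescaped dash inside the subject or status would shift the segments.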

In fact, as I only found out while writing this post, shields.io can also dynamically generate badges for certain common statistics such as:

  • Latest version and downloads over time from various package managers, eg. npm, PyPI, crates.io
  • Latest GitHub release, tag or even commits since a particular tag
  • Number of users online on Discord

You can find an exhaustive list on the shields.io website.

In addition, there are many other services which generate repository badges for you including most continuous integration and code coverage providers. However, for truly custom or non-publicly accessible information, we’ll have to generate it ourselves. While we could run our own web server to do this, in this case it seems more practical to take advantage of serverless computing with Google Cloud Functions.

Google Cloud Functions

Note: I decided to use Google Cloud Functions because my bot was already a project on Google Cloud Platform (GCP) and I was going to be pulling data from Google Analytics, but in general there’s nothing stopping you from using AWS Lambda or any other functions-as-a-service (FaaS) provider instead.

Google Cloud Functions is Google’s FaaS offering and, similar to AWS Lambda, allows you to write functions which execute in response to HTTP requests or are triggered by events from other GCP services. At the time of writing, Cloud Functions can only be written in JavaScript and run on a Node.js 6.9.1 runtime, so we’ll be using JavaScript to write our function.

What we want is an HTTP function which, when called, fetches some information from somewhere, then builds a custom shields.io link and returns a 302 redirect to it. We can then embed this function’s endpoint in our README, so that every time it is triggered, a shield will be generated dynamically with our latest data. In my case, I want to get this data from a Google Analytics property.

Google Analytics Reporting API

The Google Analytics Reporting API v4 is the latest version of the Analytics Reporting API and “the most advanced programmatic method to access report data in Google Analytics”. We’ll be using it to get just a single number: the number of unique users over the past 7 days, although you can change this to any metric you want. You can find all the available metrics in the Dimensions & Metrics Explorer.
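To make "any metric you want" a little more concrete, here is a rough sketch of the request body for a single-metric report, parameterised by metric and number of days. buildReportRequest is a hypothetical helper name of my own, and YOUR_VIEW_ID is a placeholder; the field names (reportRequests, viewId, dateRanges, metrics, expression) are the ones used by the Reporting API v4.

```javascript
"use strict";

// Hypothetical helper: builds the body of a reports.batchGet call for one
// metric over the last `days` days.
function buildReportRequest(viewId, metric, days) {
  return {
    reportRequests: [
      {
        viewId: viewId,
        // Relative dates such as "7daysAgo" and "today" are accepted by the API
        dateRanges: [{ startDate: `${days}daysAgo`, endDate: 'today' }],
        metrics: [{ expression: metric }]
      }
    ]
  };
}

// Unique users over the past 7 days, the metric used in this post
console.log(JSON.stringify(buildReportRequest('YOUR_VIEW_ID', 'ga:users', 7), null, 2));
```

Swapping 'ga:users' for another metric expression, for example 'ga:sessions', is all it takes to put a different number on the badge.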

Authentication

Unfortunately, the Analytics Reporting API v4 only supports OAuth 2.0 authentication, rather than “just working” within the same GCP project, probably because Google Analytics is a separate service from Google Cloud Platform.

Instead, in order to access our data, we’ll have to use a service account from our GCP project which has been given the necessary permissions in our Google Analytics account.

First, we need to create a service account and obtain its private key file, which we will use to authenticate ourselves with the Analytics Reporting API. You can follow the instructions under the “Creating a service account” section in the Google documentation here: Using OAuth 2.0 for Server to Server Applications. Once you have safely downloaded the private key file, you can move on to the next step.

Next, we need to give the service account read permission to our Google Analytics account. You can do this in the “User Management” section under the Admin section here.
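Before going further, it can be worth checking that the downloaded key file actually contains the two fields the rest of this post relies on, client_email and private_key. This is only a sketch: missingKeyFields is a hypothetical helper, and the example key below uses placeholder values rather than real credentials.

```javascript
"use strict";

// Hypothetical check: returns the names of required fields that are missing
// (or empty) in a parsed service account key file.
function missingKeyFields(key) {
  return ['client_email', 'private_key'].filter(
    field => typeof key[field] !== 'string' || key[field].length === 0
  );
}

// Placeholder values for illustration only; in practice this object would
// come from require('./credentials.json').
const exampleKey = {
  type: 'service_account',
  client_email: 'badge@example.iam.gserviceaccount.com'
};

// private_key is deliberately absent from exampleKey, so it is reported here
console.log(missingKeyFields(exampleKey));
```

The client_email value is also the address you add as a user in Google Analytics when granting read permission.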

Querying the Analytics Reporting API with Node.js

Once we have obtained a set of service account credentials and given the service account permission to read our Google Analytics data, we can test that everything is working with a short Node.js program to query the Analytics Reporting API using the Google APIs Node.js Client:

"use strict";
// Google APIs Node.js Client. Newer releases of the library export this as
// const {google} = require('googleapis');
const google = require('googleapis');

// Service account key file obtained in the previous step
const key = require('./credentials.json');
const viewId = 'YOUR_VIEW_ID';

// Returns a promise for the Reporting API response containing the number of
// unique users over the past 7 days
function getUsers(key, viewId) {
  // https://github.com/google/google-api-nodejs-client#using-jwt-service-tokens
  const jwtClient = new google.auth.JWT(
    key.client_email,
    null,
    key.private_key,
    ['https://www.googleapis.com/auth/analytics.readonly'],
    null
  );

  return new Promise((resolve, reject) => {
    jwtClient.authorize(err => {
      if (err) {
        // Reject so that callers see authorization failures instead of
        // waiting on a promise which never settles
        return reject(err);
      }

      // based on https://github.com/google/google-api-nodejs-client/issues/561
      const analytics = google.analyticsreporting('v4');
      analytics.reports.batchGet({
        auth: jwtClient,
        resource: {
          reportRequests: [
            {
              viewId: viewId,
              dateRanges: [
                {
                  startDate: '7daysAgo',
                  endDate: 'today'
                }
              ],
              metrics: [
                {
                  expression: 'ga:users'
                }
              ]
            }
          ]
        }
      }, (err, data) => {
        if (err) {
          return reject(err);
        } else {
          return resolve(data);
        }
      });
    });
  });
}

getUsers(key, viewId)
  .then(data => console.log(JSON.stringify(data)))
  .catch(console.error);

In the code above, ./credentials.json should be the path to your service account credentials file, and you need to replace YOUR_VIEW_ID with the View ID you are using. Running this file, we should see a JSON blob which contains the metric we are interested in (47 in the following output):

PS> node .\test.js
{"reports":[{"columnHeader":{"metricHeader":{"metricHeaderEntries":[{"name":"ga:users","type":"INTEGER"}]}},"data":{"rows":[{"metrics":[{"values":["47"]}]}],"totals":[{"values":["47"]}],"rowCount":1,"minimums":[{"values":["47"]}],"maximums":[{"values":["47"]}]}}]}

Although it may look a bit messy, the actual value itself can be extracted with data.reports[0].data.totals[0].values[0].
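To illustrate, here is a small sketch which pulls the value out of the sample response shown above. extractTotal is a hypothetical helper of my own; the shape check is an extra precaution on my part, not something the API requires.

```javascript
"use strict";

// Sample response copied from the output above
const sample = {
  reports: [{
    columnHeader: { metricHeader: { metricHeaderEntries: [{ name: 'ga:users', type: 'INTEGER' }] } },
    data: {
      rows: [{ metrics: [{ values: ['47'] }] }],
      totals: [{ values: ['47'] }],
      rowCount: 1,
      minimums: [{ values: ['47'] }],
      maximums: [{ values: ['47'] }]
    }
  }]
};

// Hypothetical helper: returns the first total of the first report, or throws
// if the response does not have the expected shape.
function extractTotal(data) {
  const report = data && data.reports && data.reports[0];
  const totals = report && report.data && report.data.totals;
  if (!totals || !totals[0] || !totals[0].values || totals[0].values.length === 0) {
    throw new Error('Unexpected Analytics Reporting API response shape');
  }
  return totals[0].values[0]; // note that the value is a string, not a number
}

console.log(extractTotal(sample)); // prints 47
```

Because the value is returned as a string, it can be interpolated into a badge URL without any conversion.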

Great! We’re able to programmatically obtain the value of a metric from Google Analytics. Now we just need to write a Cloud Function which builds a custom shields.io link and redirects to it.

Writing our Cloud Function

An HTTP-triggered Cloud Function is actually just an Express.js route handler, which should be familiar if you’ve used Express before. It is simply a function which takes a request context and a response context: you access details about the incoming request through the request context and respond using methods on the response context.

For a hands-on introduction, you can follow the HTTP Tutorial on the Cloud Functions documentation to create a Hello World Cloud Function which responds to any incoming HTTP request with some text.

Previously, we managed to get the value of our metric from Google Analytics. Using a shields.io URL, we can create a custom badge containing it:

https://img.shields.io/badge/weekly%20users-51-yellow.svg

becomes

All we need to do now is to modify our code above to export an HTTP handler which returns a redirect to a shields.io URL containing the latest value of our metric, instead of just printing the value to standard output.

Replace

getUsers(key, viewId)
  .then(data => console.log(JSON.stringify(data)))
  .catch(console.error);

with

exports.function = function (req, res) {
  return getUsers(key, viewId)
    .then(data => {
      const value = data.reports[0].data.totals[0].values[0];
      return res.redirect(302, `https://img.shields.io/badge/users-${value}%2Fweek-yellow.svg`);
    })
    .catch(err => {
      console.error(err);
      return res.sendStatus(500);
    });
};

The code above uses template literals, a relatively new JavaScript feature, but all we are doing is interpolating our metric value into a URL string.
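Before deploying, the redirect logic can be exercised locally without touching Google Analytics. The sketch below is one possible way to do that, not part of the original function: makeBadgeHandler and fakeResponse are hypothetical names, the data source is injected as a function returning a promise, and the fake response object only implements the two Express methods the handler calls (redirect and sendStatus).

```javascript
"use strict";

// Hypothetical factory: wraps any function returning a promise for the badge
// value in an Express-style handler which redirects to shields.io.
function makeBadgeHandler(fetchValue) {
  return function (req, res) {
    return fetchValue()
      .then(value => res.redirect(302, `https://img.shields.io/badge/users-${encodeURIComponent(value)}%2Fweek-yellow.svg`))
      .catch(err => {
        console.error(err);
        return res.sendStatus(500);
      });
  };
}

// Minimal stand-in for the Express response object which records every call
function fakeResponse() {
  const calls = [];
  return {
    calls: calls,
    redirect: (status, url) => { calls.push({ status: status, url: url }); },
    sendStatus: status => { calls.push({ status: status }); }
  };
}

// Exercise the handler with a stubbed value instead of a real API call
const res = fakeResponse();
makeBadgeHandler(() => Promise.resolve('47'))({}, res)
  .then(() => console.log(res.calls));
```

Injecting the data source also makes it straightforward to swap Google Analytics for another backend later while keeping the same handler.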

Now that we have written our cloud function, we can deploy it to test if it works. Assuming you have the following folder structure:

|- credentials.json
|- index.js
|- package.json

you can deploy your function with the gcloud command:

gcloud --project [YOUR_PROJECT_ID] beta functions deploy [YOUR_FUNCTION_NAME] --stage-bucket [YOUR_STAGING_BUCKET_NAME] --trigger-http --memory 128MB

In the command above, YOUR_PROJECT_ID is the ID of your GCP project, YOUR_STAGING_BUCKET_NAME is a Cloud Storage bucket to store your function code, and YOUR_FUNCTION_NAME is a name you choose for your function. The main value in package.json is set to index.js, and our exported HTTP handler is called function, which is the name Cloud Functions looks for by default. We also set the allocated memory for our function to the minimum of 128MB instead of the default of 256MB, since we probably don’t need much memory. The full list of options for gcloud beta functions deploy can be found in the Cloud SDK documentation.

After successfully deploying your function, within the output you should see the URL for your function endpoint:

...
httpsTrigger:
  url: https://us-central1-YOUR_PROJECT_ID.cloudfunctions.net/YOUR_FUNCTION_NAME
...

If you visit this URL, you should see the same badge from above, but now it’s dynamic! Time to add it to our README.

Embedding badges into your README

This is the easiest step, especially if you already have other badges in your README. To do this in Markdown, we just embed it like any other image:

![Weekly users](https://us-central1-YOUR_PROJECT_ID.cloudfunctions.net/YOUR_FUNCTION_NAME)

This embeds our new badge into our README, and gives it some alt-text.

Note on caching

It is possible that your badge may be cached by the repository host. For example, GitHub proxies user images through its camo cache (announcement), and issues have been raised before about stale repository badges. GitHub said that they respect Cache-Control and ETag headers, and shields.io sets cache-control: max-age=86400, so it is possible that our badges will be cached for 24 hours.

However, since in this case I’m showing a weekly user count which only updates daily, a 24-hour cache time is not very significant and I didn’t pay much attention to it. Depending on your use case, you may have to find a way to keep your badge fresh.

Summary

In this post, we

  1. Used shields.io to create custom repository badges,
  2. Retrieved a Google Analytics property value using the Analytics Reporting API v4,
  3. Created a Cloud Function which generates a custom repository badge using shields.io containing the real-time value of our Google Analytics property, and
  4. Embedded the generated badge into our README.

You can find the source code for the Cloud Function above at https://gist.github.com/yi-jiayu/e1fccd54a3ef7a4eed6c2a999420b50c.

While my specific use case involved Google Analytics and I used Google Cloud Functions, all of the techniques used in this post can be generalised to other purposes, for example pulling data from Google Analytics for your own use, or creating custom repository badges from other data sources using AWS Lambda instead.

Thanks for reading, and as usual, if you’re in Singapore and take the bus often, do check out my Telegram bot @BusEtaBot for checking bus ETAs!