Simple Clickstream Tracker System using Google Cloud Platform

Ridwan Fajar
Google Cloud - Community
6 min read · Mar 16, 2018
Ciherang River, Sumedang Regency, West Java, Indonesia

Overview

I was inspired by HeatMap, a cloud service that provides clickstream services ranging from storing your clickstream data to analytics through its dashboard. I want to build a similar system on the backend side, and I will use Google Cloud Platform (GCP) services to implement it.

I will build the infrastructure for this clickstream tracker on top of these services:

  • Cloud Functions, for both the producer and the consumer. The producer is triggered by an HTTP trigger, while the consumer is triggered by a Cloud Pub/Sub topic trigger
  • Cloud Pub/Sub, for messaging between the producer and the consumer
  • Cloud Datastore, the final destination of the payload, where it is stored
  • Stackdriver, to monitor the logs from Cloud Functions executions

The diagram below shows how the components of this clickstream tracker relate to each other on GCP:

Clickstream Tracker Architecture on Google Cloud Platform

You can follow the instructions below to deploy clickstreamTrackerGCP, which is available in this repository: https://bitbucket.org/ridwanbejo/clickstreamtrackergcp

Instructions

1. Cloud Function for Producer

- Create a new Cloud Function named `storeClickStreamProducer` with a memory size of `128 MB`
- Choose HTTP Trigger in the trigger section so the function can receive HTTP requests from clients such as a web browser, Postman, or other services.
- In the `index.js` section, paste the code from index.js in this repository:

// Imports the Google Cloud client library
const PubSub = require('@google-cloud/pubsub');

// Your Google Cloud Platform project ID
const projectId = 'serverlessid';

// Instantiates a client
const pubsubClient = new PubSub({
  projectId: projectId,
});

// Name of the Pub/Sub topic that receives the clickstream payloads
const topicName = 'clickstreamTracker';

exports.storeClickStreamProducer = (req, res) => {
  // Reject payloads that lack either mouse coordinate
  if (req.body.mouse_position_x === undefined || req.body.mouse_position_y === undefined) {
    res.status(400).send('Bad message!');
    return;
  }

  const dataBuffer = Buffer.from(JSON.stringify(req.body));

  pubsubClient
    .topic(topicName)
    .publisher()
    .publish(dataBuffer)
    .then(results => {
      const messageId = results[0];
      console.log(`Message ${messageId} published.`);
      // Respond only after publishing: an HTTP function may be
      // terminated as soon as the response has been sent
      res.status(200).send(req.body);
    })
    .catch(err => {
      console.error('ERROR:', err);
      res.status(500).send('Failed to publish message.');
    });
};
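
The check in the producer only tests whether the coordinates are present. As a minimal sketch, assuming you also want to reject non-numeric coordinates, a stricter validator could look like the one below. The name `isValidClickPayload` is hypothetical and is not part of the repository.

```javascript
// Hypothetical helper, not part of the original repository.
// Returns true only when both mouse coordinates are finite numbers.
function isValidClickPayload(body) {
  if (body === null || typeof body !== 'object') {
    return false;
  }
  return Number.isFinite(body.mouse_position_x) &&
    Number.isFinite(body.mouse_position_y);
}

// A complete payload passes the check
console.log(isValidClickPayload({ mouse_position_x: 1241, mouse_position_y: 6678 })); // true

// A string coordinate and a missing coordinate are both rejected
console.log(isValidClickPayload({ mouse_position_x: '1241' })); // false
```

The producer could call such a helper in place of the `undefined` comparison and still answer with status 400 when it returns `false`.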

- Move to the `package.json` tab, then paste this JSON into the text area:

{
  "name": "storeClickStreamProducer",
  "version": "0.0.1",
  "dependencies": {
    "@google-cloud/pubsub": "0.16.4"
  }
}

- Cloud Functions automatically installs the NPM packages that the function requires
- Type `storeClickStreamProducer` in the `Function to execute` input field so that the Cloud Functions runtime knows which exported function to run. Finally, hit the Save button!
- After the function is created, you can edit, delete, copy, test, and view its logs
- Your Cloud Function is now connected to Stackdriver automatically and can send the payload to Pub/Sub. But you need to create the Pub/Sub topic first.

Producer Code and Configuration
Producer Monitoring Dashboard
Producer Log of Execution

2. Topic and Subscription on PubSub

- Open the Pub/Sub menu in the left-side menu
- Create a new topic named `clickstreamTracker`
- Add the `Pub/Sub Admin` permission for your account so that your Cloud Function can publish the payload to that topic
- The new topic is ready to receive subscriptions from other GCP services
- We will wire the Cloud Function to Pub/Sub in the next section.
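
If you prefer to create the topic from code instead of the console, a minimal sketch could use the `createTopic` method of the `@google-cloud/pubsub` client. The helper name `createClickstreamTopic` is hypothetical, and the sketch assumes that the promise resolves to an array whose first element is the topic. The client is passed in as an argument so the helper can be tried without credentials.

```javascript
// Hypothetical helper: creates the topic through an injected Pub/Sub client.
// Assumes createTopic() resolves to an array whose first element is the topic.
function createClickstreamTopic(pubsubClient, topicName) {
  return pubsubClient.createTopic(topicName).then(results => {
    const topic = results[0];
    console.log(`Topic ${topic.name} created.`);
    return topic.name;
  });
}

// Usage in a project that has @google-cloud/pubsub installed (assumption):
// const PubSub = require('@google-cloud/pubsub');
// const client = new PubSub({ projectId: 'serverlessid' });
// createClickstreamTopic(client, 'clickstreamTracker');
```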

Topic clickstreamTracker created in PubSub Dashboard

3. Cloud Function for Consumer

- Create a new Cloud Function named `storeClickStreamConsumer` with a memory size of `128 MB`
- Choose Cloud Pub/Sub topic in the Trigger section
- Choose the topic that was created in the previous section, `clickstreamTracker`
- In the `index.js` section, paste the code from consumer.js in this repository:

// Imports the Google Cloud client library
const Datastore = require('@google-cloud/datastore');
const uuidv4 = require('uuid/v4');

// Your Google Cloud Platform project ID
const projectId = 'serverlessid';

// Creates a Datastore client
const datastore = new Datastore({
  projectId: projectId,
});

// The Pub/Sub trigger delivers each message in `event`, so no Pub/Sub
// client or subscription reference is needed in this function.
// The exported name must match the `Function to execute` field.
exports.storeClickStreamConsumer = (event, callback) => {
  // The Cloud Pub/Sub Message object.
  const pubsubMessage = event.data;

  // Decode the base64 message body once and reuse the result
  const payload = JSON.parse(Buffer.from(pubsubMessage.data, 'base64').toString());
  console.log(payload);

  // The kind for the new entity
  const kind = 'clickstream';

  // The name/ID for the new entity
  const name = uuidv4();
  console.log(name);

  // The Cloud Datastore key for the new entity
  const clickstreamDataKey = datastore.key([kind, name]);

  // Prepares the new entity
  const clickstreamData = {
    key: clickstreamDataKey,
    data: payload,
  };

  // Saves the entity, then signals completion through the callback
  datastore
    .save(clickstreamData)
    .then(() => {
      console.log(`Saved ${clickstreamData.key.name}`);
      callback();
    })
    .catch(err => {
      console.error('ERROR:', err);
      callback(err);
    });
};
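
The consumer receives the message body as a base64 string, which is why it decodes the data before parsing it. The round trip can be sketched locally as follows; both helper names are hypothetical and only illustrate the encoding, they do not call Pub/Sub.

```javascript
// Hypothetical helpers that mirror the encoding between producer and consumer.

// The published bytes reach the consumer function as a base64 string.
function encodeForPubsub(payload) {
  return Buffer.from(JSON.stringify(payload)).toString('base64');
}

// Consumer side: decode the base64 string, then parse the JSON text.
function decodePubsubData(base64Data) {
  return JSON.parse(Buffer.from(base64Data, 'base64').toString());
}

const sample = { mouse_position_x: 1241, mouse_position_y: 6678 };
const decoded = decodePubsubData(encodeForPubsub(sample));
console.log(decoded.mouse_position_x, decoded.mouse_position_y); // 1241 6678
```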

- Move to the `package.json` tab, then paste this JSON into the text area:

{
  "name": "storeClickStreamConsumer",
  "version": "0.0.1",
  "dependencies": {
    "@google-cloud/datastore": "1.3.4",
    "@google-cloud/pubsub": "0.16.4"
  }
}

- Cloud Functions automatically installs the NPM packages that the function requires. The consumer code also requires `uuid/v4`, which is not listed in the dependencies above; if the deployment cannot resolve it, add a 3.x release of `uuid` to `dependencies`
- Type `storeClickStreamConsumer` in the `Function to execute` input field so that the Cloud Functions runtime knows which exported function to run. Finally, hit the Save button!
- After the function is created, you can edit, delete, copy, test, and view its logs
- Your Cloud Function is now connected to Stackdriver automatically and can receive the payload from Pub/Sub, then store it in Cloud Datastore
- Now you can select the `View Logs` tab to see the incoming messages from the Pub/Sub topic you selected.

Consumer Code and Configuration
Consumer Function Monitoring
Consumer Function Log

4. Cloud Datastore

- Cloud Datastore needs little configuration in its dashboard; you just select the entities that you want to see
- You can browse the `clickstream` entities in the dashboard once the payload has been stored successfully in Cloud Datastore

How does it work?

Producer and Consumer for Clickstream Tracking are ready

First, we send a payload structured like the example below:

{
  "mouse_position_x": 1241,
  "mouse_position_y": 6678,
  "sourceUrl": "https://www.example-online-shop.com/product/12345",
  "createdAt": "2018-03-11 23:00:00"
}

Basically, the producer receives a payload that contains the mouse position, the URL of the page the mouse visited, and the time the event was created in the browser. Through the HTTP trigger, the Cloud Function acting as the producer receives that payload and queues it as a message in Cloud Pub/Sub.

Send the payload to Clickstream Producer on Google Cloud Function
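
A client could build and send this payload with a few lines of JavaScript. The sketch below is an illustration under assumptions: the helper names are hypothetical, the endpoint URL is a placeholder that you must replace with the URL shown for your own function, and `createdAt` is formatted in UTC, which the original example does not specify.

```javascript
// Hypothetical client-side helpers; not part of the original repository.

// Builds a payload in the shape the producer expects.
function buildClickEvent(x, y, sourceUrl, date) {
  return {
    mouse_position_x: x,
    mouse_position_y: y,
    sourceUrl: sourceUrl,
    // "YYYY-MM-DD HH:mm:ss", formatted in UTC (assumption)
    createdAt: date.toISOString().slice(0, 19).replace('T', ' '),
  };
}

// Posts the event as JSON. fetchImpl is optional and falls back to the
// global fetch, which exists in browsers and in recent Node.js releases.
function sendClickEvent(endpoint, clickEvent, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  return doFetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(clickEvent),
  });
}

// Usage with a placeholder endpoint:
// const clickEvent = buildClickEvent(1241, 6678, window.location.href, new Date());
// sendClickEvent('https://REGION-PROJECT_ID.cloudfunctions.net/storeClickStreamProducer', clickEvent);
```

Sending the body with the `application/json` content type lets the producer read the fields directly from `req.body`.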

Queued messages are consumed by the Cloud Function acting as the consumer, which parses the payload from Pub/Sub and stores it in Cloud Datastore as a `clickstream` entity. Later, you could build anything on top of the dataset kept in Cloud Datastore, using GQL or the Cloud Datastore library.

Browse the clickstream data that has already been stored in the Cloud Datastore Browser
Browse the clickstream data details in the Cloud Datastore Browser
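
Besides the Datastore Browser, the stored entities can be read back with the Cloud Datastore library. The following is a minimal sketch, assuming the `createQuery`, `filter`, `limit`, and `runQuery` methods of the `@google-cloud/datastore` client; the helper name `findClicksByUrl` is hypothetical, and the client is passed in so the helper can be exercised without credentials.

```javascript
// Hypothetical helper: lists stored clicks for one page through an
// injected Datastore client.
function findClicksByUrl(datastore, sourceUrl, maxResults) {
  const query = datastore
    .createQuery('clickstream')
    .filter('sourceUrl', '=', sourceUrl)
    .limit(maxResults);

  // runQuery resolves to an array whose first element is the entity list
  return datastore.runQuery(query).then(results => results[0]);
}

// Usage in a project that has @google-cloud/datastore installed (assumption):
// const Datastore = require('@google-cloud/datastore');
// const datastore = new Datastore({ projectId: 'serverlessid' });
// findClicksByUrl(datastore, 'https://www.example-online-shop.com/product/12345', 10)
//   .then(clicks => console.log(clicks.length));
```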

Thanks for following my post. I also wrote an Indonesian version of this post on Serverless.ID Medium.

Big thanks to Tajhul Faijin Aliyudin, who allowed me to use his GCP free tier account, and to Fajri Abdillah, who established the Serverless.ID Community, for his guidance.
