React-based user behavioral tracking

Zhaojun Zhang
Oct 30, 2016

This article assumes basic familiarity with JavaScript, React, and React's context feature.

Tracking user behavior in a web application can be tedious and error-prone. As members of the Analytics team at Coursera, we are constantly fighting bad data quality and improving our tracking infrastructure. Our product teams' main focus is always building and experimenting with product features that improve our learners' learning experience. As a result, adding correct tracking code for user actions on our website is gradually becoming a barrier for both the product and analytics teams. The current workflow looks like this:

  1. Analysts provide a list of the user actions they care about.
  2. Product engineers go through the list and implement the tracking code.
  3. After the code is deployed to production, analysts verify the data. If information is missing, engineers go back to step 2.
  4. If everything looks good, analysts use the data for their analytical tasks.

As you might guess, this causes a few productivity issues:

  • Engineers can’t fully focus on building new product features. Adding tracking code is a distraction to our product engineers.
  • The lifecycle of adding new tracking is really long, which causes frustration and inefficiency in analytics work.

At Coursera, we love React. In this article, I will briefly describe how we built our React-based tracking infrastructure to overcome these issues. It rests on two key ideas:

  1. Use React context to pass down the data to be recorded.
  2. Build a set of common components that automatically handle tracking.

Tracking itself is very straightforward. The intricacy comes from how it interacts with our product code. It is not uncommon to see the following code in our web application:

tracker.track({event_key, event_value})

The code is fairly self-explanatory: we want to track an event_key, with extra attribute information stored in event_value. For example, when a learner clicks an enroll button on our home page, we might call

tracker.track({
  event_key: "home.page.click.enroll",
  event_value: {
    user_id: 12345,
    course_id: course_id,
    // quoted: a bare 2016-02-23 would be evaluated as a subtraction
    time: "2016-02-23"
  }
})
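For illustration only, a tracker with this interface could be as small as the sketch below. The article does not show Coursera's real tracker, so `createTracker`, the in-memory `queue`, and the validation rule are all hypothetical stand-ins.

```javascript
// Hypothetical sketch: NOT Coursera's tracker, only the shape of the call above.
function createTracker() {
  const queue = []; // events waiting to be sent to a backend (assumed design)
  return {
    queue,
    track({ event_key, event_value = {} }) {
      if (typeof event_key !== 'string' || event_key.length === 0) {
        throw new TypeError('event_key must be a non-empty string');
      }
      // Shallow-copy the attributes so later changes by the caller
      // do not alter what was recorded.
      queue.push({ event_key, event_value: { ...event_value } });
    },
  };
}

const demoTracker = createTracker();
demoTracker.track({
  event_key: 'home.page.click.enroll',
  event_value: { user_id: 12345, course_id: 'course-1', time: '2016-02-23' },
});
```

Rejecting a missing event_key at the call site is one way to catch malformed events before they reach the data that analysts later have to verify.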

We want to address an “API leakage” problem caused by tracking needs. For example, assume we have two components, CourseList and CourseCard. CourseList is the presentation layer of a course list: its render function renders a list of CourseCard components, and each CourseCard shows information about one course. When people click a CourseCard, we want to track which list it is in, so we need to pass a course list id down to CourseCard purely for tracking purposes. If the clickable component is not rendered directly in CourseCard, we have to pass the course list id down to its child components again.

export const CourseList = (props) => {
  return (
    <div>
      {props.courses.map(courseId => (
        <CourseCard
          key={courseId}
          courseId={courseId}
          courseListId={props.courseListId} // <-- Only used for tracking
        />
      ))}
    </div>
  )
}

Somewhere in a child component under CourseList, you might see the following code:

tracker.track({
  event_key: "click.course",
  event_value: {
    courseId: this.props.courseId,
    courseListId: this.props.courseListId
  }
})

As you can see, the courseListId field has nothing to do with what CourseCard actually does. You can imagine that if analysts require a lot of context information to be tracked, this code becomes tedious and verbose to write and maintain.

Therefore, we use React context to share information from a parent component with its child components.

const CourseList = (props) => {
  return (
    <div>
      {props.courses.map(courseId => (
        <CourseCard
          key={courseId}
          courseId={courseId}
        />
      ))}
    </div>
  )
}
export default createTrackedContainer(CourseList, (props) => ({
  courseListId: props.courseListId
}));

createTrackedContainer (a helper function we built) injects the courseListId information into the context of its direct and indirect child components. In our application, we reserve an object field in context named _eventData to hold all information injected by createTrackedContainer. Later, I will explain how we read the information from the context, but you can already see that every prop CourseCard now receives is relevant to its own function.
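The article does not show createTrackedContainer itself. As a rough, framework-free sketch of the idea, the function below models what such a container might contribute to context: it merges its own fields into whatever `_eventData` it inherited. `childContext` and its arguments are hypothetical names, and letting inner fields override outer ones on a key collision is an assumption, not something the article states.

```javascript
// Hypothetical model of the context a tracked container hands to its children.
// parentContext: the context the container itself received (may lack _eventData).
// mapPropsToEventData: the function passed as createTrackedContainer's 2nd argument.
function childContext(parentContext, props, mapPropsToEventData) {
  const inherited = (parentContext && parentContext._eventData) || {};
  return {
    // Assumption: fields from the inner container win on key collisions.
    _eventData: { ...inherited, ...mapPropsToEventData(props) },
  };
}

// A page-level container, then a CourseList container nested inside it.
const pageContext = childContext({}, { page: 'home' }, (p) => ({ page: p.page }));
const listContext = childContext(
  pageContext,
  { courseListId: 'popular' },
  (p) => ({ courseListId: p.courseListId })
);
// listContext._eventData now holds both page and courseListId,
// while pageContext is left untouched.
```

Because each container only adds to what it inherits, a component deep in the tree sees the accumulated data of every tracked container above it without any of it passing through props.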

The other benefit of this implementation is that if we remove the createTrackedContainer code, nothing will break.

Now we need to build a set of tracked components that people can use to build their applications.

There is no magic in the tracked components. Here is a simplified version of our button component for tracking click actions:

class TrackedButton extends React.Component {
  static propTypes = {
    trackingName: React.PropTypes.string.isRequired,
    // This allows people to directly record additional fields
    // without using the createTrackedContainer function.
    // This is very handy for the direct parent component to pass
    // down data to this button.
    data: React.PropTypes.object,
    onClick: React.PropTypes.func,
    children: React.PropTypes.node,
  }

  static contextTypes = {
    _eventData: React.PropTypes.object
  }

  trackClick = (e) => {
    tracker.track(
      'click' + this.props.trackingName,
      // the following code will send the current context to
      // our backend
      {
        context: {...this.context._eventData},
        data: {...this.props.data}
      }
    );
    if (this.props.onClick) {
      this.props.onClick(e);
    }
  }

  render() {
    return (
      <button
        onClick={this.trackClick}
      >
        {this.props.children}
      </button>
    );
  }
}
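Stripped of React, the click handler's behavior can be modeled as a plain function, which makes the order of operations easy to check. This is only a sketch: `makeTrackedHandler`, `fakeTracker`, and the sample values are hypothetical, and the two-argument `track(name, payload)` call simply mirrors the component above.

```javascript
// Hypothetical, React-free model of TrackedButton's trackClick.
function makeTrackedHandler(tracker, trackingName, eventData, data, onClick) {
  return function trackClick(e) {
    // Record the click first, as the component above does.
    tracker.track('click' + trackingName, {
      context: { ...eventData }, // spreading undefined yields {}
      data: { ...data },
    });
    // Then defer to the caller's own handler, if one was supplied.
    if (onClick) {
      onClick(e);
    }
  };
}

// A stand-in tracker that just remembers what it was asked to record.
const recorded = [];
const fakeTracker = {
  track: (name, payload) => { recorded.push({ name, payload }); },
};

let clicked = false;
const handler = makeTrackedHandler(
  fakeTracker,
  'EnrollButton',
  { courseListId: 'popular' }, // would come from this.context._eventData
  { courseId: 'course-1' },    // would come from this.props.data
  () => { clicked = true; }
);
handler({ type: 'click' });
```

The product engineer's onClick still runs exactly as it would on a plain button, so tracking is added around the handler rather than written inside it.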

These tracked components are maintained by the Analytics team, and we ask our engineers to implement their view code using them.

We think this will work well with our React-based learning platform, for a few reasons:

  • Product engineers don’t need to worry about tracking: the only thing they need to do is use our tracked components. The interface is almost the same as that of the React HTML elements, with trackingName and data as the two additional fields.
  • This implementation is quite adaptable. Our product is changing all the time, and the tracking is not coupled to our render functions at all. If a component is moved to a new place, its context changes automatically.
  • Analysts can potentially work with this API directly without much JavaScript experience. This means that if analysts want to record additional information or modify an existing tracking function, they don’t need to depend on a product engineer. (We are still experimenting with this.)

We are slowly migrating our tracking code to this new paradigm, and we are excited to see whether it will help us change the dynamic between the product and analytics teams when it comes to tracking user behavior on our website.
