Scaling Plum Guide: How we handle tracking and analytics

Tiago Morais
Published in Plum Guide
5 min read · Jun 18, 2019

The heart of good decision making is information.

At The Plum Guide, data is a big driver for what we do. Understanding our users is paramount to iterating on and improving our product. It’s also extremely important for marketing and growth, so gathering accurate data from our users is critical to both the business and the product.

It is not uncommon for a company to use multiple analytics tools across several channels. Google Analytics, KissMetrics, and Indicative are some of the tools that we use to measure and get valuable insights into user behaviour, funnels, and conversions.

While many expensive off-the-shelf tools promise to solve these problems without any custom code, the reality is that, for accurate and tailored analytics, the best approach we found was to keep granular control over how, where, and which user interactions are recorded.

As we add more features, the number of user interactions to record increases. Combined with more code and more engineers working on it, this makes it difficult to keep track of which events are fired, when, and where, and to keep them consistent.

As our team scales, the way we handle analytics in our codebase also has to scale.

🐣 Analytics — The early days

The quick and manual approach

In the early days, third party scripts were simply added to the codebase, and events were tracked both automatically and programmatically for each individual channel, wherever in the code they were required.

Over time, these calls littered the code, led to performance issues, became hard to manage, and added global third party script dependencies throughout the entire codebase, making the code harder to test and prone to errors and exceptions.

🐥 Abstracting event tracking

A better, but still manual, approach

The first step we took was to add an abstraction layer for our analytics that could serve any third party vendor we wanted.

We started by simply proxying every tracking call into our own module. The module would then make the third party vendor calls as required. This kept all our analytics code isolated in a single module, which is easy to mock for testing.

This is how the setup looked, roughly:
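A minimal sketch of what such a proxy module could look like, in TypeScript. The event names, the routing table, and the `ChannelAdapter` interface are illustrative assumptions, not the production code. In the browser each adapter would wrap a vendor global (for example Google Analytics' `ga` function or the Kissmetrics `_kmq` queue); here the adapters are injected, so the module runs and can be mocked without any third party script loaded.

```typescript
// Hypothetical event catalogue: every trackable event is declared in one place.
type AnalyticsEventName = "search_performed" | "listing_viewed" | "booking_requested";

// Properties sent along with an event.
type EventProps = { [key: string]: string | number | boolean };

// A channel adapter hides one vendor's API behind a common signature.
interface ChannelAdapter {
  channel: string;
  send(eventName: AnalyticsEventName, props: EventProps): void;
}

// Which channels each event goes to (illustrative routing, not real configuration).
const routing: Record<AnalyticsEventName, string[]> = {
  search_performed: ["googleAnalytics"],
  listing_viewed: ["googleAnalytics", "kissmetrics"],
  booking_requested: ["googleAnalytics", "kissmetrics", "indicative"],
};

// Build the single entry point that the rest of the codebase calls.
function createAnalytics(adapters: ChannelAdapter[]) {
  return {
    track(eventName: AnalyticsEventName, props: EventProps = {}): void {
      const targets = routing[eventName];
      adapters
        .filter((adapter) => targets.indexOf(adapter.channel) !== -1)
        .forEach((adapter) => {
          try {
            adapter.send(eventName, props);
          } catch (error) {
            // A failing vendor script must never break the user's flow.
            console.warn("analytics: " + adapter.channel + " failed", error);
          }
        });
    },
  };
}

// In tests, a recording adapter replaces the real vendor calls.
const sent: string[] = [];
const recordingAdapter = (channel: string): ChannelAdapter => ({
  channel,
  send: (eventName) => {
    sent.push(channel + ":" + eventName);
  },
});

const analytics = createAnalytics([
  recordingAdapter("googleAnalytics"),
  recordingAdapter("kissmetrics"),
]);
analytics.track("listing_viewed", { listingId: 42 });
```

Because the vendors sit behind adapters, swapping or removing a channel touches only the adapter list and the routing table, never the feature code.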

To use it, all we had to do was:
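A call site could look like the following sketch. The `track` signature, the event names, and the `onBookingRequested` handler are hypothetical; the stand-in `track` below only appends to an in-memory list so that the example runs on its own.

```typescript
// Stand-in for the analytics module's single entry point (hypothetical signature).
type TrackedEventName = "booking_requested" | "listing_viewed";

const recordedEvents: { eventName: TrackedEventName; props: object }[] = [];

function track(eventName: TrackedEventName, props: object = {}): void {
  recordedEvents.push({ eventName, props });
}

// Feature code never touches a vendor script; it only names the event.
function onBookingRequested(listingId: number, nights: number): void {
  track("booking_requested", { listingId, nights });
}

onBookingRequested(42, 3);

// track("bookng_requested") would not compile: the typo is not a known event name.
```

Restricting event names to a union type is what lets the compiler and the editor's "find references" answer where each event is fired.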

This small abstraction, coupled with TypeScript, allowed us to easily find where and how many times each event was used and which channel it was sent to.

Abstracting analytics was a good start, but it still required a lot of developer time to maintain, along with lots of input from the marketing and product teams.

It also did not solve all of our problems: the bottleneck moved from managing the code to managing vendor scripts.

As we scaled away from a monolithic project into several smaller projects, it became increasingly hard to manage multiple third party analytics vendors across several projects and environments, even when sharing common analytics code and modules.

Look at it, so tidy!

🐓 Abstracting vendors

Like before. But better.

Even with the code abstraction in place, third party scripts were getting harder and harder to manage, especially with a growing team working on separate repos.

Luckily, there are tools that help greatly with this problem. For us, this tool was Google Tag Manager (GTM).

GTM allows developers to simply push events to GTM’s dataLayer, and to manage third party scripts separately from the codebase.
The only third party script we installed in our code was GTM’s. We configure the rest from GTM’s dashboard.

Similar to our code-level analytics abstraction, GTM acts as a proxy, receiving our events in its dataLayer and sending them to whatever channels we have configured.
This allowed us to manage each tag (third party channel) at will, in staging or production, without changing any code at all. GTM handles all that for us.
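On the code side, this reduces to pushing plain objects onto `dataLayer`, the array that the GTM container snippet reads from. A minimal sketch, assuming a hypothetical `pushToDataLayer` helper and event name. In the browser the host object would be `window`; it is passed in explicitly here so that the example also runs outside a browser.

```typescript
// Shape of a dataLayer entry: GTM triggers key off the `event` field.
interface DataLayerEntry {
  event: string;
  [key: string]: unknown;
}

// The object that owns `dataLayer` (`window` in the browser).
interface DataLayerHost {
  dataLayer?: DataLayerEntry[];
}

function pushToDataLayer(host: DataLayerHost, entry: DataLayerEntry): void {
  // Create the array if the GTM snippet has not done so yet; GTM processes
  // entries that were queued before the container finished loading.
  host.dataLayer = host.dataLayer || [];
  host.dataLayer.push(entry);
}

// A plain object stands in for `window` in this example.
const fakeWindow: DataLayerHost = {};
pushToDataLayer(fakeWindow, { event: "listing_viewed", listingId: 42 });
```

Nothing in this code names a vendor: which tags fire for `listing_viewed` is decided entirely in the GTM dashboard.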

Hand drawn for more realism

🦅 A proactive, declarative approach

We got wings now

The next step in improving how we do analytics was to move from an imperative, reactive approach to a more proactive, declarative one.

Instead of requiring a developer to add or change specific source files with calls to the analytics module when analytics requirements changed, we inverted the flow.

When a new feature is added, the developers are responsible for adding events to any major user or programmatic action that occurs across the feature they are working on. This is usually done with the help of the Product Owner.

We do this in two ways:

Auto tracking with data-attributes

By adding data-attributes to our code, we can configure GTM once to pick these up and push them to the dataLayer itself.

This setup required a little configuration on GTM, but it is only set up once and works for every project.
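As an illustration, assume hypothetical `data-track-event` and `data-track-label` attribute names (the real ones are whatever the GTM triggers are configured to look for). The sketch below shows the markup side and the mapping GTM performs. GTM does the second part through configuration, not application code; `toDataLayerEntry` only mimics it to make the flow concrete.

```typescript
// Attributes a component spreads onto a clickable element, e.g. in JSX:
//   <button {...trackingAttributes("booking_cta_clicked", "listing page")}>Book</button>
function trackingAttributes(eventName: string, label: string): { [attr: string]: string } {
  return {
    "data-track-event": eventName,
    "data-track-label": label,
  };
}

// What the one-off GTM configuration conceptually does on click:
// read the attributes from the clicked element and emit a dataLayer entry.
function toDataLayerEntry(
  attrs: { [attr: string]: string }
): { event: string; label: string } | null {
  const eventName = attrs["data-track-event"];
  if (!eventName) {
    return null; // Elements without the attribute are not tracked.
  }
  return { event: eventName, label: attrs["data-track-label"] || "" };
}

const ctaAttributes = trackingAttributes("booking_cta_clicked", "listing page");
const ctaEntry = toDataLayerEntry(ctaAttributes);
```

The component declares what the interaction is called; whether and where it is reported stays a GTM decision.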

Programmatic with GTM’s dataLayer
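For actions that are not a simple click, such as a form submission succeeding, the event is pushed from code. A minimal sketch, assuming hypothetical event names and payloads; the queue stands in for `window.dataLayer` and is passed in so that the example runs without a browser.

```typescript
// Hypothetical catalogue of programmatic events, as a discriminated union:
// each event name fixes the payload that must accompany it.
type ProgrammaticEvent =
  | { event: "search_completed"; query: string; resultCount: number }
  | { event: "enquiry_submitted"; listingId: number };

// Push a typed event; the compiler rejects unknown names or wrong payloads.
function pushEvent(queue: ProgrammaticEvent[], entry: ProgrammaticEvent): void {
  queue.push(entry);
}

// Fire the event only after the action has actually succeeded.
function submitEnquiry(
  queue: ProgrammaticEvent[],
  listingId: number,
  send: (id: number) => boolean
): boolean {
  const succeeded = send(listingId);
  if (succeeded) {
    pushEvent(queue, { event: "enquiry_submitted", listingId });
  }
  return succeeded;
}

const dataLayerQueue: ProgrammaticEvent[] = [];
submitEnquiry(dataLayerQueue, 42, () => true); // success: event is pushed
submitEnquiry(dataLayerQueue, 43, () => false); // failure: nothing is pushed
```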

GTM will pick up events pushed to the dataLayer from code and fire the appropriate tags, which in turn send the events to the third party channels.

The major benefit of this more proactive model is that events are set up by default, at development time, and if any stakeholder wants to track them, they can simply configure GTM to do so.

This model also helped increase developer ownership and made our team more conscious and aware of the importance of user behaviour flows.

In the early days of this approach, developers would often forget to add these app-wide events, and would have to add them when a new analytics requirement came in. With time, though, they became adept at adding all the necessary events by default.

Each time we develop a feature, we treat tracking and measuring as first-class citizens, so our specs include our KPI metrics as well as the events needed to realise them.

🚀 We’re just getting started

Plum recently closed its Series B round, which is very exciting news for all of us. We’re continuing to grow fast, and so is our engineering team.

We’re working on very interesting problems, and data-driven decisions are at the core of what we want to do.

If this is something you find interesting, we’re hiring!

Tiago Morais
Plum Guide

Tech lead for @plumguide. Software that scales, for people. Mainly in reactjs and .netcore.