Data-Driven Legal Intelligence: Doctrine’s Analytics Milestones

Anais Limpalaer
Published in Inside Doctrine
7 min read · Jan 2, 2024

Why do we collect product usage data at Doctrine?

Providing legal intelligence demands more than just innovation; it requires a nuanced understanding of user behavior, system performance, and the impact of our product. Product analytics tools play a strategic role in advancing our mission. At Doctrine, listening to our customers is one of our core values. Combined with conducting customer interviews, delving into the product’s data is an important way of listening to our users.

In this blog post, we’ll dive into the technical intricacies of our GDPR-compliant analytics journey at Doctrine. After examining our event format specification, we’ll talk about three key iterations we have gone through to enhance and stabilise our product usage data processing.

First, we’ll guide you through the initial phase where a centralized tracking function was key to our data collection. Next, we’ll delve into a more refined approach with the adoption of TypeScript and domain-specific tracking functions, enhancing both precision and scalability. Finally, we’ll unveil our latest iteration, in which a Tracking class ensures that all emitted events adhere to a strict set of rules.

Formatting specification

In practice, user interactions with doctrine.fr are sent to third-party tools used by Product teams at Doctrine on a daily basis. This data is pseudonymised, meaning one cannot directly identify users from the collected data: we do not send any directly identifying information and use HTTPS/TLS v1.2 encryption.

This data offers valuable insights into how our users engage with our product, for example:

  • Among visitors of our Legislation pages, how many are coming from the Search page? How many are coming from an external source like Google or LinkedIn?
  • Which actions are performed most often on a Decision page?
  • Has my new feature increased clicks on summary items on Enterprise pages?

At Doctrine, we present product usage data in the form of events. Following the popular object-action framework, each event at Doctrine is the combination of:

  • a domain — a type of page on our website (for example Decision, Enterprise or Search);
  • an action — how the user interacted with that domain (for example SummaryItemClicked);

Each event emitted by doctrine.fr must have:

  • a name, which is simply the domain and the action put together with a space character in-between (e.g. Decision SummaryItemClicked).
  • some properties that give more context around the performed action, like the decision’s ID or the type of summary item that was clicked.
{
  eventName: 'Decision SummaryItemClicked',
  properties: {
    decisionId: 'DECISION_123',
    summaryItemTitle: 'Motifs de la décision'
  }
}
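
The naming rule above can also be expressed at the type level. The following is a minimal sketch, assuming a TypeScript version that supports template literal types (4.1 or later). The `Domain` and `Action` unions, the `TrackingEvent` interface and the `buildEventName` helper are hypothetical: they only list the examples mentioned in this post and are not Doctrine's actual code.

```typescript
// Hypothetical unions: only the domains and actions named in this post.
type Domain = 'Decision' | 'Enterprise' | 'Search';
type Action = 'SummaryItemClicked' | 'SearchInputFocused';

// An event name is the domain and the action joined by a single space.
type EventName = `${Domain} ${Action}`;

interface TrackingEvent {
  eventName: EventName;
  properties: Record<string, unknown>;
}

// Hypothetical helper that builds a well-formed event name.
// The assertion tells the compiler the result matches the template type.
function buildEventName(domain: Domain, action: Action): EventName {
  return `${domain} ${action}` as EventName;
}

const clicked: TrackingEvent = {
  eventName: buildEventName('Decision', 'SummaryItemClicked'),
  properties: {
    decisionId: 'DECISION_123',
    summaryItemTitle: 'Motifs de la décision',
  },
};
```

With a type like `EventName`, a name that does not follow the "domain, space, action" shape is rejected by the compiler instead of reaching a dashboard.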

As of October 2023, we track a total of 474 unique event types, and Product Managers have created hundreds of dashboards. One of them, for example, shows the most frequent titles for summary item clicks on Decision pages.

Iteration 1: Centralized tracking function

From the beginning until 2022, the official way to collect product usage data at Doctrine was to call a track function with the event name and its properties as arguments:

// DecisionPageSummary.tsx

const onItemClick = (itemTitle) => {
  track('Decision SummaryItemClicked', {
    decisionId: 'DECISION_123',
    summaryItemTitle: itemTitle,
    pageName: 'Decision',
  });
};

The event’s name is then validated server-side against a whitelist, and properties are sanitized.
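
The server-side code is not shown in this post, so the following is only a sketch of what allowlist validation and property sanitization could look like. `ALLOWED_EVENT_NAMES`, `sanitizeProperties` and `validateEvent` are hypothetical names, and the sanitization rule (keep only string, number and boolean values) is an assumption chosen for illustration.

```typescript
// Hypothetical allowlist; the real list is not shown in this post.
const ALLOWED_EVENT_NAMES: ReadonlySet<string> = new Set([
  'Decision SummaryItemClicked',
]);

type Primitive = string | number | boolean;

// Assumed rule: keep only primitive values and drop everything else.
function sanitizeProperties(
  properties: Record<string, unknown>,
): Record<string, Primitive> {
  const clean: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(properties)) {
    if (
      typeof value === 'string' ||
      typeof value === 'number' ||
      typeof value === 'boolean'
    ) {
      clean[key] = value;
    }
  }
  return clean;
}

// Returns the sanitized event, or null when the name is not allowed.
function validateEvent(
  eventName: string,
  properties: Record<string, unknown>,
): { eventName: string; properties: Record<string, Primitive> } | null {
  if (!ALLOWED_EVENT_NAMES.has(eventName)) {
    return null;
  }
  return { eventName, properties: sanitizeProperties(properties) };
}
```

A check like this only runs once the event reaches the server; it gives no feedback to the developer writing the call.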

👍 Pros:

  • Straightforward. This method was effective for some time, and did not require much work, but it became less effective as we added more and more domains.
  • Tracking logic is abstracted and centralized.

👎 Cons:

  • Implicit naming conventions. The convention of “Domain + Action” can be understood from the rest of the codebase, but it is not enforced.
  • No type-checking. I could remove some (or all) of the properties or introduce a typo in the event name and nothing would happen… until product managers started noticing discrepancies in their dashboards.

This first iteration caused many irregularities. Occasionally, event names would unexpectedly change following the introduction of a typo in the codebase, or certain properties would disappear after a new release, among other issues. This prompted us to seek a more effective solution.

Iteration 2: TypeScript to the rescue

For this second iteration, we wanted to introduce shared properties for all events of a domain, and type-checking for properties of an event.

The track function itself has not changed in this iteration, but it is now wrapped in a domain-scoped function for each domain. This domain-scoped function does two things:

  • it adds typings for the properties of each event;
  • it adds default properties to all events emitted on a given page (like the ID of the page’s decision).

// DecisionContext.tsx
import track from 'utils/tracking';

type DecisionEvent = {
  eventName: 'Decision ButtonClicked';
  properties: { ... };
} | {
  eventName: 'Decision SummaryItemClicked';
  properties: {
    summaryItemTitle: string;
  };
};

const decisionAnalytics = ({ eventName, properties }: DecisionEvent) => {
  track(eventName, {
    decisionId: 'DECISION_123', // Domain-specific properties
    pageName: 'Decision',
    ...properties, // Event-specific properties
  });
};

// DecisionPageSummary.tsx

const { decisionAnalytics } = useDecisionContext();

const onItemClick = (itemTitle: string) => {
  // The wrapper takes a single object holding the name and the properties
  decisionAnalytics({
    eventName: 'Decision SummaryItemClicked',
    properties: { summaryItemTitle: itemTitle },
  });
};
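
To make the type-checking concrete, here is a small self-contained sketch of the same pattern. It assumes a stand-in `track` that records calls in an array instead of sending them anywhere. `makeDecisionAnalytics` is a hypothetical factory, and the second event of the union, `'Decision SearchInputFocused'`, is an assumed combination used only to give the union two members.

```typescript
// Stand-in for the real track function: records calls for inspection.
const sent: Array<{ name: string; properties: Record<string, unknown> }> = [];
function track(name: string, properties: Record<string, unknown>): void {
  sent.push({ name, properties });
}

// Discriminated union: eventName selects the allowed shape of properties.
type DecisionEvent =
  | {
      eventName: 'Decision SummaryItemClicked';
      properties: { summaryItemTitle: string };
    }
  | {
      eventName: 'Decision SearchInputFocused';
      properties: Record<string, never>;
    };

// Hypothetical factory: the decision ID is passed in rather than hardcoded.
function makeDecisionAnalytics(decisionId: string) {
  return ({ eventName, properties }: DecisionEvent): void => {
    track(eventName, { decisionId, pageName: 'Decision', ...properties });
  };
}

const decisionAnalytics = makeDecisionAnalytics('DECISION_123');

decisionAnalytics({
  eventName: 'Decision SummaryItemClicked',
  properties: { summaryItemTitle: 'Motifs de la décision' },
});

// Both of these calls would be rejected by the compiler:
// decisionAnalytics({ eventName: 'Decision SumaryItemClicked', properties: {} });
// decisionAnalytics({ eventName: 'Decision SummaryItemClicked', properties: {} });
```

The first commented-out call has a typo in the event name, and the second omits a required property; both are the kinds of mistakes that went unnoticed in the first iteration.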

👍 Pros:

  • Type-checking for event properties
  • Domain-wide properties are automatically added for all events

👎 Cons:

  • Implicit naming conventions. The convention of “Domain + Action” can be understood from the rest of the codebase, but it is not explicitly documented.
  • Cumbersome type listing. The list of all possible event types can get harder to maintain with time, as more and more events get added.
  • Inconsistent action naming across domains. Most actions — like clicking on a summary item — exist on multiple domains. Ideally, these actions would bear the same action name and properties across domains.

While this approach significantly reduced irregularities, we observed that similar actions across various domains, such as clicking a summary item, were associated with different event names and properties. This made it hard for our product managers to perform cross-domain analysis, and led us to work on our final iteration.

Iteration 3: The Matrix

This final tracking implementation considers that events at Doctrine live under the same “matrix”. This means that all events are the result of combining a domain and an action.

Event properties are the result of merging the global domain properties with the shared action properties.

Let’s break down the code:

First, we have a generic Tracking class. This is where we define all of our actions.

class Tracking<P extends Record<string, unknown>> {
  constructor(
    public domainName: string, // Domain name ('Decision', 'Enterprise', etc.)
    protected domainProperties: P // Global domain properties (decisionId, etc.)
  ) {}

  // Centralized tracking logic.
  protected track(
    action: string,
    actionProperties?: Record<string, unknown>,
  ) {
    sendEventToServer({
      eventName: `${this.domainName} ${action}`, // Event name is merged
      data: {
        ...actionProperties,
        ...this.domainProperties, // Data is merged
      },
    });
  }

  // Tracking method names are the action names in camelCase
  public summaryItemClicked(
    // Required action properties are declared here
    actionProperties: { summaryItemTitle: string }
  ) {
    this.track('SummaryItemClicked', actionProperties);
  }

  // Some events require no additional properties
  public searchInputFocused() {
    this.track('SearchInputFocused');
  }
}
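
One detail worth noting in `track` is the order of the two spreads. In an object literal, later spreads overwrite earlier keys, so with the order shown above a domain property wins if an action property uses the same key. The sketch below isolates that behaviour; `mergeProperties` is a hypothetical helper and not part of the class.

```typescript
// Hypothetical helper mirroring the merge in track():
// action properties are spread first, domain properties second.
function mergeProperties(
  domainProperties: Record<string, unknown>,
  actionProperties: Record<string, unknown> = {},
): Record<string, unknown> {
  return { ...actionProperties, ...domainProperties };
}

const merged = mergeProperties(
  { decisionId: 'DECISION_123' },
  { decisionId: 'OTHER', summaryItemTitle: 'Motifs de la décision' },
);
// merged.decisionId is 'DECISION_123': the domain value overrides the action value.
// merged.summaryItemTitle is kept because only the action defines it.
```

Swapping the two spreads would give action properties precedence instead.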

We can then initialize a domain tracking class like so:

// DecisionPage.tsx

const decisionTracking = new Tracking(
  'Decision',
  { decisionId: 'DECISION_123' }
);

/**
 * Sends this to the server:
 * {
 *   eventName: 'Decision SummaryItemClicked',
 *   data: { decisionId: 'DECISION_123', summaryItemTitle: 'Motifs de la décision' },
 * }
 */
decisionTracking.summaryItemClicked({
  summaryItemTitle: 'Motifs de la décision'
});

👍 Pros:

  • Standardization: events across 30+ domains are now standardized. It is much easier to conduct product analysis across domains.
  • No more typos: event names are no longer hardcoded when we add tracking to a feature.
  • Enhanced data accuracy: With the use of strict conventions and type-checking, the chances of incorrect data collection have been significantly reduced, ensuring the accuracy of the collected product usage data.
  • IDE autocompletion: tracking actions is now easier thanks to automatic code completion.
  • Shared components can abstract event tracking: instead of having an onClickCallback prop on the <SummaryItem/> component, we can now pass the domain tracking class to the component, and the component can call tracking.summaryItemClicked(). This is a great way to make sure that the action is tracked on every page.
  • Improved scalability: The new tracking implementation allows for easier addition of new domains and actions: for every new domain, an extensive collection of actions is already available for use. Likewise, every new action is instantly made available for all domains.
  • Versatile system: we have been able to handle a number of edge cases by improving or extending the Tracking class: legacy event names with a different naming convention, specific events that cannot be shared across domains, and overlaps between action and domain properties.
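
The shared-component point can be sketched without any UI framework. In the example below, `SummaryItemTracking`, `createSummaryItem`, `fakeTracking` and the `sent` array are all hypothetical, and the item titles are placeholders. A real <SummaryItem/> component would receive the tracking instance as a prop instead.

```typescript
// The only capability the shared component needs from a tracking instance.
interface SummaryItemTracking {
  summaryItemClicked(actionProperties: { summaryItemTitle: string }): void;
}

// Hypothetical stand-in for a shared component: it returns a click handler
// that reports the click itself, so pages cannot forget to track it.
function createSummaryItem(title: string, tracking: SummaryItemTracking) {
  return {
    title,
    onClick: (): void => {
      tracking.summaryItemClicked({ summaryItemTitle: title });
    },
  };
}

// Minimal fake tracking instances that record the emitted event names.
const sent: string[] = [];
const fakeTracking = (domainName: string): SummaryItemTracking => ({
  summaryItemClicked: () => {
    sent.push(`${domainName} SummaryItemClicked`);
  },
});

// The same component code is reused on two different domains.
createSummaryItem('First item', fakeTracking('Decision')).onClick();
createSummaryItem('Second item', fakeTracking('Enterprise')).onClick();
// sent now holds one SummaryItemClicked event name per domain.
```

Because the component depends only on the `summaryItemClicked` method, any domain's tracking instance can be passed in, and the action name stays identical across domains.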

👎 Cons:

  • Page-based implementation: this tracking system works best for page-based domains (Decision, Enterprise, etc.), for which the tracking class is initialized once and used multiple times. For application-wide domains, like User for example, we would need a separate tracking instance on the page.

Although the initial implementation cost is higher, the versatility and framework-agnostic nature of the system have proven beneficial.

One of the core values of Doctrine is “Release early, release often, and listen to your customers.” This extends to our product analytics framework as well. Going forward, the tracking implementation will continue to evolve to meet the needs of Doctrine and its users, as it expands and reaches a more global audience.
