Product usage analytics at Pipedrive

Kadri Lenk
Published in Pipedrive R&D Blog
10 min read · Nov 23, 2021

How Pipedrive set up its product analytics and usage tracking process

Intro

As a product data analyst, I work closely with PMs (product managers) to help them make better decisions. My daily tasks revolve around digging into product usage data to answer questions about user behavior and suggesting new opportunities for growth and improvement.

Last year, our team of analysts faced a challenge: the number of PMs was growing fast, leading to a disproportionate ratio between analysts and PMs. This created a situation where analysts only had time to collect usage data and answer PMs’ ad-hoc questions, rather than dive deep into the data and do predictive analytics, something we had been meaning to do (and are also very good at).

In this article, we’ll take a look at how we have set up Pipedrive’s product analytics and usage tracking process so that all PMs’ data needs are covered, allowing analysts to take on more challenging and high-impact tasks.

What is product usage analytics?

First, let’s talk about what product usage analytics means.

Product usage analytics is the process of analyzing product data to understand how customers use your product. It gives you insights into how customers really behave when using your product and why they take the specific actions they do.

Some key questions you can answer with product usage analytics include:

  • What’s the overall product performance? How many active users do I have?
  • What’s the user retention rate, and what drives it?
  • Which setup and activation funnels work best? In which steps do users drop off the most?
  • How engaged are the users? How do power users differ from the rest?

The ultimate goal of product usage analytics is to drive effective, evidence-based product decisions.

Of course, product usage data on its own is not enough to get a full picture of how the product is performing. Combining user behavior knowledge with financials, customer feedback and other insights is essential. Nevertheless, you cannot expect to have a high-performing product without knowing how it’s being used.

Product tracking

Before analyzing any product data, you must first collect it. This is where product tracking (also called product instrumentation) comes into play. Product tracking means adding a small piece of tracking code to your product, which logs specific events that users trigger. Once the event occurs, the tracking code captures the data of the event and the characteristics of the user who triggered it.
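To make this more concrete, here’s a minimal sketch of what an explicit tracking call could look like, using Segment’s Python library. The event name, properties and write key below are made-up examples for illustration, not Pipedrive’s actual tracking code (which lives in our web and mobile clients).

```python
# A minimal sketch of an explicit tracking call using Segment's Python
# library (analytics-python). The event name and properties below are
# hypothetical examples, not Pipedrive's actual tracking schema.
import analytics

analytics.write_key = "YOUR_SEGMENT_WRITE_KEY"  # placeholder

def track_deal_added(user_id: str, pipeline_id: int, source: str) -> None:
    """Log a 'deal.added' event with its properties for the given user."""
    analytics.track(
        user_id,
        "deal.added",                     # event name (object.action style)
        {
            "pipeline_id": pipeline_id,   # characteristics of the event
            "source": source,             # e.g. "web" or "mobile"
        },
    )

# Example usage: a user adds a deal from the web app
track_deal_added("user_123", pipeline_id=1, source="web")
```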

Explicit vs. implicit tracking

At Pipedrive, we use the explicit tracking approach.

Explicit tracking occurs when you manually define and implement the events you’d like to track. For example, if you develop a new feature and want to capture its usage data, engineers need to add the tracking code to collect it. Though explicit tracking does take more time, it holds a clear advantage by allowing you to know exactly what’s being tracked, making analyzing and governing your data much easier.

Implicit tracking, on the other hand, occurs when all user interactions within your application are tracked automatically. Engineers only need to set up the system once, and you don’t need a tracking plan for new events. Though this process takes less time, it comes at the cost of messy data during the analysis phase, since you have less control over what’s being tracked and how.

Pipedrive’s process

Product data analysts used to be responsible for the product tracking process. As such, we would need to think about what to track and how to do it, validate the implementation and create reports and dashboards. As the number of PMs asking us to implement product tracking kept growing, we realized we had to change the process, as it was not scalable and left us unhappy.

Below is a diagram that illustrates Pipedrive’s current product tracking process, along with its main roles and responsibilities.

Product usage tracking process at Pipedrive

As you can see in the diagram, most of the heavy lifting is now done by the PM since they own both the feature and its data. As they are also responsible for measuring the success of their features, being closely involved in the usage tracking process enables them to know what data is being collected and how to analyze it later.

Analysts mostly act as consultants during this process by supporting the PM when necessary. They will also help measure success when more advanced analytics are needed. The engineer’s main responsibility is to implement and deploy the tracking solution.

Let’s look closer into each of the steps:

1. Define success metrics

The first question the PM should always ask themselves is “How do I define success?” Every new feature, however big or small, should solve some problems and provide value for the user. Hence, in this first step, the PM defines:

  • Key metrics to be measured to understand if the product development has been successful
  • Key steps in the feature/product setup and activation funnel
  • Anything else to be learned

The analyst’s role in this step is to give guidance (when necessary) on setting up the right metrics and targets and getting baseline numbers.

2. Define tracking solution

After figuring out the success metrics, it’s time to ask the next question: “What data do we need to measure success?” The answer to this question is crucial to defining the tracking plan, including:

  • All the new tracking events and their properties
  • Any changes in existing events
  • Any new user properties

Sometimes, in order to measure success, PMs may need data beyond tracking events, as tracking only gives data on actions performed, not on the overall state. For instance, a success metric like “X% of users have discovered the Y feature” is easily measurable with tracking data, since it shows which users have triggered the discovery event. But if a metric is something like “X% of all customers have two out of four features activated at any given time,” you may need to combine the activation and deactivation events to get the current state. In that case, it’s better to store the current state in a table in the data warehouse and measure the metric from there.
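To illustrate the difference, here’s a rough sketch of how a state-based metric like the one above could be derived from activation and deactivation events. The event names, columns and numbers are made up for the example.

```python
# A rough sketch of deriving a current-state metric ("share of companies
# with at least two features activated") from activation and deactivation
# events. Event names and columns are hypothetical.
import pandas as pd

events = pd.DataFrame(
    {
        "company_id": [1, 1, 2, 2, 2, 3],
        "event": [
            "feature_a.activated", "feature_b.activated",
            "feature_a.activated", "feature_a.deactivated",
            "feature_c.activated", "feature_b.activated",
        ],
        "timestamp": pd.to_datetime(
            ["2021-10-01", "2021-10-02", "2021-10-01",
             "2021-10-05", "2021-10-06", "2021-10-03"]
        ),
    }
)

# Split "feature_a.activated" into the feature and the action
events[["feature", "action"]] = events["event"].str.rsplit(".", n=1, expand=True)

# Keep only the latest event per company and feature: that is the current state
latest = (
    events.sort_values("timestamp")
    .groupby(["company_id", "feature"], as_index=False)
    .last()
)
active = latest[latest["action"] == "activated"]

# Share of companies with at least two features currently activated
features_per_company = active.groupby("company_id")["feature"].nunique()
share = (features_per_company >= 2).sum() / events["company_id"].nunique()
print(f"{share:.0%} of companies have 2+ features activated")  # 33% here
```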

3. Implementation, validation and deployment

This step is pretty similar to the development process. An engineer implements the tracking as described in the plan and deploys it (learn more on Pipedrive’s deployment process). The PM’s role is to ensure that the implementation has been completed successfully and report back to the engineer if anything needs adjusting. Once everything is set, the PM updates the tracking data taxonomy (see taxonomy section below).

4. Measure success

Once the tracking plan has been implemented, and the data is pouring into the product analytics tool, it’s time to measure success!

At Pipedrive, we use Amplitude as our product analytics tool. It’s built specifically for product usage analysis and makes it much easier to understand user behavior compared with traditional BI tools like Tableau or Power BI. Using Amplitude doesn’t require advanced analytical skills, so every PM, regardless of their level of data literacy, can do their own analysis and measure success without depending on analysts.

Of course, Amplitude isn’t all-powerful and cannot measure every metric. This is where the analyst steps in to help the PM. Depending on what’s required, the analyst can delve deeper into the data, combine event data with other data sources kept in the data warehouse (financials, for example) or set up a dashboard outside Amplitude (in Tableau, for example).

Data flow

Below is a diagram that illustrates the flow of tracking data from Pipedrive’s product (both web and mobile apps) to Amplitude, our product analytics tool.

Tracking data flow from product to analytics tool

When either of the sources triggers an event, it is sent to Segment, which routes the data to the prescribed destinations. The most important destination in the context of product usage analytics is Amplitude, but we can easily route data to other tools as well, for example for marketing or customer communication purposes. Finally, we send the tracking data to our data warehouse so that we don’t have to rely on Amplitude to access it and can use it in different analyses, from combining it with business metrics to predicting customer churn.
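As a rough sketch of what this looks like from the producer’s side, a single Segment track call fans out to whatever destinations are configured, and routing can be narrowed per call with Segment’s integrations option. The destination names below are illustrative.

```python
# A minimal sketch of how one track call fans out to several destinations
# via Segment. The 'integrations' argument is Segment's per-call routing
# override; destination names here are illustrative.
import analytics

analytics.write_key = "YOUR_SEGMENT_WRITE_KEY"  # placeholder, as before

# By default the event goes to every destination configured in Segment
# (Amplitude, marketing tools, the data warehouse sync, ...)
analytics.track("user_123", "email.sent", {"template": "follow_up"})

# If a specific event should only reach the analytics destination,
# the routing can be narrowed for that call:
analytics.track(
    "user_123",
    "email.sent",
    {"template": "follow_up"},
    integrations={"All": False, "Amplitude": True},
)
```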

Governance

Data collection can be easy (as it is for us at Pipedrive), but maintaining its quality can be a challenge. Here are some key practices we follow at Pipedrive to keep data quality under control.

Naming convention

Events are planned and implemented by many people, meaning things can get out of hand pretty quickly. A PM who designed an event named tooltip.opened may remember where the event is triggered, but normally nobody else would. Things get even more complicated when a different PM adds another tooltip.

The main goal of the event naming convention is to make the tracking data discoverable and understandable by everyone. Event names should be clear and descriptive and denote what happens when the event is triggered. Otherwise, you can’t tell which event corresponds with which user action.

At Pipedrive, we use the object.action syntax, for example deal.added, email.sent, activity.marked_done, activity_list.opened. At the end of the day, it doesn’t matter what the specific naming convention is: you could use spaces to separate the words or write the entire event name in camel case. The main idea is to have rules and follow them.

A crucial aspect to keep in mind when coming up with event names is to always think from the user experience perspective, that is, what the user experiences when taking a particular action in Pipedrive. For example, when a user clicks the “Save” button in the “add new deal” dialogue, their experience is probably that of adding a new deal to their pipeline. I recommend avoiding technical and interface-specific terms like modal, popup, component, header, etc., not only because these words are not very descriptive but also because the user interface can (and will) change over time, and a feature that uses modals today may not do so in the future.
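To show what enforcing such a convention could look like, here’s a small sketch of a check for the object.action pattern. The exact rules (lowercase, snake_case, a single dot) are my reading of the examples above, not an actual Pipedrive linter.

```python
# A small sketch of validating event names against an object.action
# convention (lowercase snake_case, exactly one dot), based on the
# examples above; this is not Pipedrive's actual tooling.
import re

EVENT_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$")

def is_valid_event_name(name: str) -> bool:
    """Return True if the event name follows the object.action convention."""
    return bool(EVENT_NAME_PATTERN.match(name))

assert is_valid_event_name("deal.added")
assert is_valid_event_name("activity.marked_done")
assert not is_valid_event_name("ToolTipOpened")       # no object.action split
assert not is_valid_event_name("modal.opened.again")  # more than one dot
```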

Taxonomy

A well-organized data taxonomy is a prerequisite for insightful analyses, so plenty of product analytics tools provide taxonomy management. At Pipedrive, we keep the taxonomy in Amplitude in combination with Segment.

At the very minimum, the taxonomy should include the names of all events, their properties and user properties. You can significantly improve it by adding descriptions for all the events and properties. To get the most out of your data, try to define the data type (string, integer, boolean, etc.) of each property, provide the definitive list of expected values (if applicable) and mark whether the property is mandatory (that is, whether it must be present on the event).

Example of an event described in the taxonomy

Keeping your taxonomy up to date allows you to track down any data quality issues. You can easily detect unplanned events that flow into your analytics tool, properties that don’t match the rules you’ve determined (wrong data type, a mandatory property is missing, etc.) and more. This should give you a good starting point, from which you can continue investigating the root cause of quality issues.

Example of an unplanned event in the taxonomy

One common best practice for tracking new events is describing them in the taxonomy before they are implemented and triggered. This way, you won’t get false alerts during your quality checks and can compare incoming data to the tracking plan.
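Here’s a sketch of what a taxonomy entry and the corresponding quality check could look like in code. The schema format and the example event are hypothetical; our actual taxonomy lives in Amplitude and Segment.

```python
# A sketch of a taxonomy entry and a simple quality check against it.
# The schema format and the example event are hypothetical; Pipedrive's
# actual taxonomy is maintained in Amplitude and Segment.
TAXONOMY = {
    "deal.added": {
        "description": "User added a new deal to a pipeline.",
        "properties": {
            "pipeline_id": {"type": int, "required": True},
            "source": {"type": str, "required": True,
                       "allowed": ["web", "mobile"]},
            "label": {"type": str, "required": False},
        },
    },
}

def check_event(name: str, properties: dict) -> list:
    """Return a list of data quality issues for an incoming event."""
    issues = []
    schema = TAXONOMY.get(name)
    if schema is None:
        return [f"unplanned event: {name}"]
    for prop, rules in schema["properties"].items():
        if prop not in properties:
            if rules["required"]:
                issues.append(f"missing mandatory property: {prop}")
            continue
        value = properties[prop]
        if not isinstance(value, rules["type"]):
            issues.append(f"wrong data type for {prop}")
        if "allowed" in rules and value not in rules["allowed"]:
            issues.append(f"unexpected value for {prop}: {value}")
    return issues

print(check_event("deal.added", {"pipeline_id": "1", "source": "desktop"}))
# -> ['wrong data type for pipeline_id', 'unexpected value for source: desktop']
print(check_event("tooltip.opened", {}))
# -> ['unplanned event: tooltip.opened']
```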

Monitoring

Every now and then, a tracking event goes off the rails, whether due to code refactoring, feature updates or changes in dependencies. To detect such bugs as soon as possible, it’s best to set up a monitoring system for event volumes that triggers alerts when anomalies occur.

At Pipedrive, we rely on Amplitude for anomaly detection. Amplitude’s technique is built on top of an open-source, time series forecasting tool called Prophet. Prophet works best with time series that have strong seasonal effects and several seasons of historical data. It’s also good at handling missing data, shifts in trends and outliers.
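To give a feel for this kind of check, here’s a rough sketch of fitting Prophet to daily event volumes and flagging days where the actual volume falls outside the forecast interval. It only mimics the idea; it’s not Amplitude’s actual implementation, and the input file is hypothetical.

```python
# A rough sketch of anomaly detection on daily event volumes with Prophet,
# mimicking the idea described above (not Amplitude's implementation).
# 'event_volumes.csv' is a hypothetical file with columns: date, count.
import pandas as pd
from prophet import Prophet

df = pd.read_csv("event_volumes.csv")
history = pd.DataFrame({"ds": pd.to_datetime(df["date"]), "y": df["count"]})

# Fit on everything except the most recent days we want to check
train, recent = history.iloc[:-4], history.iloc[-4:]
model = Prophet()  # weekly/yearly seasonality is handled automatically
model.fit(train)

# Forecast the recent days and compare actual volumes to the interval
future = model.make_future_dataframe(periods=4)
forecast = model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]

check = recent.merge(forecast, on="ds")
anomalies = check[(check["y"] < check["yhat_lower"]) |
                  (check["y"] > check["yhat_upper"])]
print(anomalies[["ds", "y", "yhat_lower", "yhat_upper"]])
```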

Below is an example of an anomaly where the actual data volume hasn’t met the forecasted interval for the past four days. Based on this, we can determine whether any action is required: could the drop in volume be a result of sunsetting the feature that triggers the event, or does it point to something else that requires investigation?

Example of an event volume anomaly

Conclusions

The process described in this article has been in use by Pipedrive for about a year.

Naturally, it took us quite a lot of effort to shift the main responsibility for the process from analysts to PMs. The transition has been challenging, as people are very different from one another: while some PMs quickly picked up the new process, some needed more support, and some still do.

Overall, I believe this situation is a win-win. The PMs have become much more independent as they can answer a lot of their data questions themselves. At the same time, we, as analysts, have more time to work on projects where we can tap into our full analytics potential and show off what we are really capable of doing.

Interested in working at Pipedrive?

We’re currently hiring for several positions across different countries and cities.

Take a look and see if something suits you

Positions include:

  • Junior Developer
  • Senior Business Analyst
  • Software Engineer in DevOps Tooling
  • Infrastructure Engineer
  • And several more
