The evolution of a data-driven startup and the FOMO tracking

Not all these trackings will be useful. But how can you know that in advance?

Knowing when it is time to rely on Google Analytics® for basic business metrics, and when it no longer is, can be a maturity challenge for your company.

Notice: this is what I’ve learned over the past years. It is neither the only nor the best way to solve this problem, but it is how I did it. I've been using Google Analytics for many years and still do, depending on business needs, timeline, and complexity. It is an awesome tool, simple yet very powerful for early-stage companies, even in the free version.

Recently, a couple of entrepreneurs asked me how they could build their startups as data-driven companies from day one.

More specifically, they wanted to know everything about what tools they should use and which frameworks they should implement to make sure “everything gets tracked the right way.” Have you ever felt this way?

So I had to get back with a straightforward question: what problem are you trying to solve right now that you believe having “everything tracked” would fix?

The answers varied a bit, but none of them were related to “increase X” or to providing a better solution for “Y.” After a few more questions, it became clear to me that I was facing a new kind of FOMO: the fear of missing out on data, in this case on metrics (“regardless of what they are, hopefully, I will discover it in the future”). Tracking everything would be the first (and simplest) answer to the common question of "what success metrics should I be tracking in my product?".

That said, I hope these words bring some relief, especially to those who are just beginning their journey with a new product or business: it is OK not to be perfect from the beginning. In the data field especially, there is always room for incremental improvement.

That’s enough theory; let’s jump into something more practical: can I start with a bunch of Google Analytics trackings on my site (or Firebase Analytics, in the case of mobile applications)? I would say yes, as long as it is a conscious decision (weighing pros and cons) and there is a long-term plan to address future issues (to be solved in the future, not now; the point is to solve one problem at a time).

In practical terms, I would focus on these three aspects (some I experienced first-hand, and others I had to learn the hard way):

  • Organize your tracking using a Google Tag Manager container. This keeps all your tracking code centralized, making it easier to reproduce in another tool.
  • Define a simple naming convention and make sure everyone follows it (see an example below). As your tracking scales, this will (hopefully) make it simple for new folks to onboard.
  • Keep two lists somewhere (a spreadsheet works, perhaps with one tab per list): (1) a list of answered and unanswered questions that you believe data tracking may help you address, and (2) a list of all tracking terms, along with their business definitions and other relevant information (example: exclusive to web version).

For a realistic view of a “simple” naming convention, I would use this comparison and five rules of thumb (a few of which I learned after starting to use the schema registry from Snowplow Analytics®):

       button_click Vs. userClickedButton Vs. clicked_button

1. Lowercase characters only;

2. No special characters (like accents), and always in the same language (English is recommended);

3. Use an underscore if a space character is needed;

4. Always put the noun before the verb, and always use the same verb tense (past tense may be recommended for some use cases);

5. Avoid repeated terms (like beginning everything with user_*).
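
Most of these rules can be checked mechanically before a new event name reaches the tag manager. The following is a minimal sketch in Python, not a definitive implementation: the `check_event_name` helper and the list of discouraged prefixes are hypothetical, and allowing digits in names is an assumption. Rule 4 (noun before verb, consistent tense) still needs a human reviewer and is not covered.

```python
import re

# Rules 1-3: lowercase ASCII letters and digits, words joined by single underscores.
# Allowing digits is an assumption; the rules above do not mention them.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")

# Rule 5: prefixes considered redundant (hypothetical list, adjust to your product).
DISCOURAGED_PREFIXES = ("user_",)


def check_event_name(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if name != name.lower():
        problems.append("rule 1: use lowercase characters only")
    if not name.isascii():
        problems.append("rule 2: no accents or other special characters")
    if " " in name or "-" in name:
        problems.append("rule 3: use an underscore where a space is needed")
    # Catch anything the specific checks missed (e.g. double underscores).
    if not problems and not NAME_PATTERN.match(name):
        problems.append("rules 1-3: only lowercase letters, digits and single underscores")
    if name.startswith(DISCOURAGED_PREFIXES):
        problems.append("rule 5: avoid repeated prefixes such as user_")
    return problems


if __name__ == "__main__":
    for candidate in ("button_click", "userClickedButton", "clicked_button"):
        print(candidate, check_event_name(candidate))
```

Note that `clicked_button` passes this check even though it breaks rule 4, which is why the convention still has to be documented and reviewed by people.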

With these quick tips, you can make the following plan:

When you decide it is time to tune up your analytics solutions, you will already have two critical foundations of your data engine: (1) unified tracking management and (2) an essential data catalog. When things get bigger, it will be clear why it is always better to keep things simple.

In practice, you will be able to abstract your whole tracking logic layer (for both web and mobile clients) and move it to the stack of your choice, and also build a data repository to be used and updated by all teams. A simple template to guide your documentation is to create the following tabs in a spreadsheet:

  • Getting started: the information class, the objective of the document, how it works (in general terms), and who is responsible for it (probably the DPO contact);
  • Glossary: the most used terms and their definitions (considering platform-specific issues, such as the session definition, which may vary depending on many factors);
  • Naming conventions: feel free to start with the ones I suggested in the previous section, including examples whenever possible;
  • Data sources: for each tracking platform (for instance, Google Analytics), a full description of the parameters, attributes, data types, business meaning, source, and a column stating whether each one is deprecated.
The framework suggested for an initial version of a data catalog
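
To make the "Data sources" tab more concrete, here is a small sketch in Python of what one row could look like and how rows could be exported as CSV for import into a spreadsheet. The `DataSourceEntry` class, its field names, and the sample row are illustrative assumptions based on the columns listed above, not a fixed schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DataSourceEntry:
    """One row of the 'Data sources' tab; the field names are illustrative."""
    platform: str          # e.g. "Google Analytics"
    parameter: str         # tracked event or parameter name
    data_type: str         # e.g. "string", "integer"
    business_meaning: str  # plain-language definition
    source: str            # where the value originates, e.g. "web" or "mobile"
    deprecated: bool = False


def to_csv(entries: list[DataSourceEntry]) -> str:
    """Serialize entries to CSV text that can be imported into a spreadsheet tab."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(DataSourceEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        DataSourceEntry("Google Analytics", "button_click", "string",
                        "Visitor clicked the main call-to-action button", "web"),
    ]
    print(to_csv(rows))
```

Keeping the catalog in a plain, typed structure like this is one way to make it easier to move from a spreadsheet to a dedicated tool later on.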

Once you move your analytics to another stack (in my experience, Snowplow Analytics®), that stack may provide more robust solutions to this data accessibility problem (in its own panel, as with Snowplow Data Structures), or you can look for other tools to help address it (such as Apache Atlas and/or AWS Lake Formation®).

On the other hand, if you don’t follow those three simple rules, keep in mind that the following pitfalls may occur:

1. Taxonomy: absence of a catalog and lack of uniform nomenclature;

2. Duplication: the same information tracked more than once (with diverging definitions), in the same tracking tool or across different ones;

3. Absence of essential attributes (that may change over time) in the event tracking.
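
With even a small catalog in place, these pitfalls can be spotted with a few lines of code. The sketch below is in Python and only illustrates the idea: the `CATALOG` contents, the event names, and both helper functions are hypothetical, and a real catalog would be loaded from your spreadsheet or tracking tool.

```python
from collections import defaultdict

# Hypothetical catalog: event name -> (business definition, required attributes).
CATALOG = {
    "button_click": ("Visitor clicked a button", {"button_id", "page"}),
    "cta_click": ("Visitor clicked a button", {"button_id"}),
    "form_submit": ("Visitor submitted a form", {"form_id"}),
}


def find_duplicate_definitions(catalog):
    """Pitfall 2: group event names that share the same business definition."""
    by_definition = defaultdict(list)
    for name, (definition, _required) in catalog.items():
        by_definition[definition.strip().lower()].append(name)
    return [sorted(names) for names in by_definition.values() if len(names) > 1]


def find_missing_attributes(event_name, payload, catalog):
    """Pitfalls 1 and 3: flag uncataloged events and absent required attributes."""
    if event_name not in catalog:
        return {"<event not in catalog>"}
    _definition, required = catalog[event_name]
    return required - set(payload)


if __name__ == "__main__":
    print(find_duplicate_definitions(CATALOG))
    print(find_missing_attributes("button_click", {"button_id": "buy"}, CATALOG))
```

Matching definitions by exact text is a deliberately naive choice here; in practice, duplicates tend to hide behind slightly different wording and still need a human review.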

In the next post, I will talk more about transitioning from a simple Google Analytics solution to something more robust (whether keeping GA or not), and about when it is and is not the time to do so. In the meantime, feel free to share your own experiences on this subject. Stay tuned for more information and happy coding :)
