Trading the Quill for the Press: The Case for Automated Tagging

Seth Stuck
Published in FOX TECH
Nov 20, 2023 · 8 min read

Apropos of being a philosophy major working at a media and news organization (FOX), I’d like to start this article on tagging with a metaphor about the printing press…

In the mid-15th century, Johannes Gutenberg’s printing press revolutionized the dissemination of information. Within decades of its invention, the number of books in Europe skyrocketed from a few handwritten manuscripts to millions of printed volumes. The cost of books plummeted, and literacy spread rapidly, democratizing knowledge and triggering seismic shifts in society.

The printing press was not immediately embraced by all; critics lamented the perceived loss of quality control and artistry found in hand-crafted manuscripts. Nonetheless, the vast benefits of scale and accessibility were undeniable and the world was changed forever. The transformative power of the printing press lay not just in its ability to inexpensively replicate, but in its capacity to democratize knowledge.

I believe digital product teams are facing a similar sea change in the form of automated tagging.

MidJourney-created image of what one might imagine scribes looked like protesting the printing presses.

The Echoes of Historical Reluctance

History doesn’t repeat, it just rhymes… to paraphrase Mark Twain.

This is especially true when it comes to technological advancements. While there may be pockets of resistance (either explicitly on the merits, or implicitly as it relates to priorities or investments) when it comes to automation, I believe that — as with the printing press — the scalability of media products represents the perfect opportunity to innovate and evolve.

Modern Day Manuscripts: The Manual Tagging Conundrum

Take the tagging of behavioral events in digital products (websites, mobile apps, living room apps, etc.), for example. Tagging plays a pivotal role in measuring user engagement with features and content, which Product Research & Analytics efforts require in order to provide valuable insights to Product teams seeking to enhance their users’ experience.

Just as scribes once painstakingly copied texts by hand, today’s developers (or, if you’re lucky, tagging specialists) often spend countless hours manually tagging websites and applications, ensuring that each click event and element of data is captured appropriately. This labor-intensive process, while thorough, is not scalable, especially for the large media brand Product teams under the FOX umbrella.

The Inefficiency of Manual Labor

In an era where speed, agility, and efficiency are crucial for success, sticking to manual processes is equivalent to insisting on handwriting books in Gutenberg’s time. The sheer volume of content and product changes produced daily necessitates a more efficient system. Moreover, manual tagging, like the scribe’s work, is susceptible to human error: missed tags, conflicting priorities, and inconsistency in terminology and taxonomy. That volume of updates can (and inevitably does) result in errors and gaps in the data.

Embracing the Printing Press of Our Time: Automation

Heap, a contemporary “printing press” in this analogy, has offered FOX a solution to this bottleneck. By automating the tagging process, we ensure all click events are instantly captured — no manual tagging necessary. Concerns regarding the quality of this “firehose” of data, much like the concerns of the printing press naysayers, are far outweighed by the broader benefits of scalability, flexibility, and the freedom for our Product teams to focus on what they do best: building and enhancing products and product features for a more enjoyable, tailored, and seamless consumer experience.
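To make the contrast concrete, here is a language-agnostic sketch (in Python, not Heap’s actual SDK — the element names and event labels are invented for illustration) of the difference between manual tagging, where only pre-instrumented events exist, and auto-capture, where every click is recorded and events can be defined retroactively:

```python
# Conceptual sketch only: manual tagging vs. auto-capture.
# Raw clicks as a capture layer might see them (hypothetical elements/pages).
clicks = [
    {"element": "button#play", "page": "/watch"},
    {"element": "a.share", "page": "/watch"},
    {"element": "button#subscribe", "page": "/account"},
]

# Manual tagging: developers shipped explicit track() calls for a fixed set
# of elements. Anything not tagged up front is simply never recorded.
MANUAL_TAGS = {"button#play": "Video Played"}
manually_tracked = [
    {"event": MANUAL_TAGS[c["element"]], **c}
    for c in clicks
    if c["element"] in MANUAL_TAGS
]

# Auto-capture: every click is recorded with generic properties.
auto_captured = [dict(c) for c in clicks]

def define_event(captured, element, name):
    """Retroactively label auto-captured clicks as a named event."""
    return [{"event": name, **c} for c in captured if c["element"] == element]

# The share click was never manually tagged, but under auto-capture it is
# still answerable later -- with no code change or release cycle.
share_events = define_event(auto_captured, "a.share", "Content Shared")

print(len(manually_tracked))  # 1: only the pre-tagged click survived
print(len(share_events))      # 1: recovered from auto-capture after the fact
```

The key design difference: manual tagging decides *at build time* which questions the data can answer, while auto-capture defers that decision to analysis time.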

The Future is Automated

Just as the printing press revolutionized the dissemination of knowledge, automating our processes (starting with tagging) has the potential to revolutionize how we operate as a Product team. While it’s essential to understand traditional tagging methods — and to acknowledge that these more command-and-control methods may well still have a critical role to play for “source of truth” business reporting — it’s equally crucial to recognize that such methods are irreconcilable with agile, lean product development. The shift towards automation doesn’t negate the importance of quality; it simply redefines how we achieve sufficient quality within the context of a given use case.

So… how do we distinguish which tagging use cases are for product development and which are for business reporting? How do we reconcile multiple tagging sources potentially living on the same site/application/event at the same time? Doesn’t this fly in the face of our overarching desire to make this process simpler, cheaper, and faster?

In short: No.

While it might seem counterproductive to have multiple tagging systems in place, the objective and application of each system warrant its existence.

Distinguishing Tagging Use Cases

In our rapidly evolving digital landscape, the emergence of auto-tagging should be a watershed moment for Product Analytics. Meanwhile, traditional tagging methods still have a role to play for other analytics and data use cases. Auto-tagging, while transformative, isn’t a one-size-fits-all solution. Different use cases often call for tailored data sources and methodologies. It’s not about choosing one approach over the other or undermining a “single source of truth”: It’s about harnessing the right tool for the right task and not “judging a fish by its ability to climb a tree,” so to speak.

From a Product perspective, each type of behavioral data supports one of the following four categories: (from L to R) BI Reporting, Product Optimization, Product Functionality, Other Analytics Capabilities. Source: FOX Tech

Types Of Behavioral Data Defined

  • BI Reporting: Business Intelligence relies on descriptive and aggregate data to produce reports that inform business understanding of trends over time using KPIs that remain fairly consistent. Tagging through such tools as Adobe and Segment for these use cases is common.
  • Product Optimization: The objective of Product Optimization is to quickly assess which designs or features can enhance user engagement and value generation. The metrics used are often ephemeral, aiming to isolate and enhance the impact of specific features — less a metric that would go into an executive report and more a gauge of specific click activity particular to a design treatment the Product team is testing. Speed and flexibility are at a premium here. Heap and Datazoom are examples of auto-tagging tools that can thread this needle. The typical time frame for analyses or reports using these types of data is more event-specific, revolving around product launches or specific changes introduced.
  • Product Functionality: The focus here is to ensure that products deliver the desired functionality using behavioral data. For actual data products, it’s advisable to use first-party data, ensuring tight ownership and minimizing privacy risks. Examples of product features that would use these types of data include targeted ad delivery, content personalization, recommendations, and push notifications.
  • Other Analytics Capabilities: This is a wide umbrella covering various behavioral “tags” that power several business functions, including Marketing. The essence of these tags is to enable event-specific measurements and, in some cases, retargeting on different platforms. These might not be as regulated or exhaustive but serve crucial business purposes.

Focusing on Product Optimization & Analytics Use Cases

Naturally, as a leader in the Product Research & Analytics space, my focus is on the Product Optimization use cases for behavioral data. And for these, I like to use another metaphor:

Product Analytics is more like polling than vote-counting

The goal of product analytics is to improve the confidence in, and quality of, Product decisions and actions. Like polling, we’re looking for signals to inform the “campaign” *before* the “vote.” Even when looking backwards to assess the impact of choices already made, the driving purpose is and should be to inform and ultimately improve future decisions before “the die is cast.”

“Vote-counting” is still very important! In this metaphor, it’s like executive and financial reporting. It’s obviously critical to get the count exactly right in those contexts… but there’s a meaningful difference between these functions, and they should not be conflated.

So, when it comes to tagging and data capture — the data have to be accurate and representative — and the focus is on being able to rapidly power differential insights. For example, users did metric X more in test variant B, or metric X increased Y% after we launched feature Z. For that, “polling data” more than suffices — and that’s what auto-captured clickstream data is in this case.

Will auto-tagged events perfectly match the count of manually tagged events from different systems? Probably never. But the variance is almost always explainable and rarely consequential when it comes to making relative product impact assessments.
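To illustrate why count variance rarely matters for relative assessments, here is a worked example with hypothetical numbers (the counts and the ~5% undercount are invented for illustration): two tagging systems disagree on absolute event counts, yet produce effectively the same differential read.

```python
# Hypothetical click counts for the same A/B test, per 10,000 sessions.
manual = {"A": 1_000, "B": 1_200}  # manually tagged "source of truth" system
auto = {"A": 952, "B": 1_141}      # auto-captured; roughly a 5% undercount

def lift(counts):
    """Relative lift of variant B over variant A."""
    return (counts["B"] - counts["A"]) / counts["A"]

# The absolute counts differ by dozens of events, but the product decision
# ("did B beat A, and by roughly how much?") comes out the same either way.
print(f"manual lift: {lift(manual):.1%}")  # 20.0%
print(f"auto lift:   {lift(auto):.1%}")    # 19.9%
```

Because an undercount tends to affect both variants similarly, it largely cancels out of the ratio — which is exactly why “polling data” suffices for differential insights even when the “vote count” would not match.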

Value-Add

Let’s assume you have no objection to the notion of auto-tagging, but don’t think the juice is worth the squeeze. Consider the following estimates of the time investment involved in tagging in-code, traditionally, versus auto-tagging:

Comparative outlines of the time it takes to tag events. Source: FOX Tech

By reducing the number of steps, time, and people required to support tagging, we realize an “opportunity gain” whereby development resources, otherwise mired in traditional tag-support work, can be freed up to focus on more value-additive feature development. Additionally, the flexibility and accessibility of auto-tagged data will ensure Product teams have fewer external dependencies and delays between them and the data they need to optimize their products.

Opportunity gain aside, a 64% reduction in tagging labor represents significant savings. Consider the following equation:

(Total Tags) x (Labor to Support Each Tag) x (Hourly Labor Cost) = (Tagging Overhead)

1,000 tags x 12.5 hours per tag x $145* per hour = $1,812,500

1,000 tags x 4.5 hours per tag x $145 per hour = $652,500

$1,812,500 − $652,500 = $1,160,000 savings per 1,000 tags created

*Est. pro-rated avg. blended (Dev + all else) rate for Tagging stories
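The arithmetic above, spelled out as a small sketch. The hours per tag and blended hourly rate are the article’s own estimates; only the formula itself is general.

```python
def tagging_overhead(tags: int, hours_per_tag: float, hourly_rate: float) -> float:
    """(Total Tags) x (Labor to Support Each Tag) x (Hourly Labor Cost)."""
    return tags * hours_per_tag * hourly_rate

# Estimates from the article: 12.5 hrs/tag traditional, 4.5 hrs/tag automated,
# $145/hr blended rate, per 1,000 tags.
manual_cost = tagging_overhead(1_000, 12.5, 145)  # traditional in-code tagging
auto_cost = tagging_overhead(1_000, 4.5, 145)     # auto-tagging

print(f"${manual_cost:,.0f}")              # $1,812,500
print(f"${auto_cost:,.0f}")                # $652,500
print(f"${manual_cost - auto_cost:,.0f}")  # $1,160,000 saved per 1,000 tags
print(f"{1 - 4.5 / 12.5:.0%}")             # 64% reduction in tagging labor
```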

The compounding impact of these savings is immense for an enterprise like FOX Tech, with multiple brands that each have multiple digital platforms carrying 50–200 tags apiece.

Moving from the Manual Tagging Dark Ages into Product Analytics Enlightenment

Just as the printing press democratized information and learning (to such an extent that it served as a catalyst for the Renaissance and the Enlightenment), so too can unchaining product analytics insights from laborious manual tagging — leading to scaled, agile, democratized Product team learning and innovation.

So, the choice is clear: cling to our quills or embrace the press. We at FOX Tech believe in the power of the press!

ABOUT FOX TECH

Make Your Mark Here.

At FOX, we pride ourselves on shaking things up and making things happen. We’re a community of builders, operators, and innovators, and each and every day we experiment, collaborate, and co-create to develop the next world of news, sports & entertainment streaming technology.

While being one of the most well-known brands in the world, we provide our employees with the culture of a start-up — fast-paced, non-hierarchical, full of smart ideas & innovation and, most importantly, the knowledge that each member of the team is making a difference in defining what’s next for FOX Tech. Simply put, we love to do great work, with great people.

Learn more and join our team: https://www.foxcareers.com/
