A Marketer’s Guide to Data: The Six Forms

Dr. Jason Davis
Simon Data
Published Nov 29, 2016

Marketers increasingly rely on data to guide critical decisions at every level. Testing messaging, evaluating ad spend, identifying the highest-intent moments in the customer lifecycle, and everything in between ultimately depend on your ability to synthesize, reason about, and act upon various forms of data. But what are those forms, and how do they affect marketers’ jobs? Here’s what you need to know.

The forms of data available today are more varied than ever: big data, small data, database data, analytics data, spreadsheets, and so on. Each of these forms has its own strengths, weaknesses, and eccentricities that affect how it’s best used in a marketing context.

The six types of data most relevant to marketers are:

1. JavaScript & analytics data. If you’ve ever used Google Analytics, you’ve used data that’s been collected via JavaScript. Analytics data is pervasive on the web for good reason: it’s relatively easy to implement and monitor, and it can scale to billions of events per month or more.

However, analytics data is a write-once medium: once you view a webpage, search for an item on Amazon, or buy something from Etsy, the data is recorded at that moment and can’t be rewritten or modified.

So why does this matter? Imagine you bought a pair of shoes online that didn’t fit. You send an email to customer support, they ship you a return box, and then you make the return. The process is reasonably straightforward on your end, but none of these activities can be easily recorded via JavaScript tags, which can lead to data gaps.

Analytics data suffers from the following shortcomings:

  • Limited in collection: Its collection capacity is largely limited to the web and mobile devices, meaning offline and/or customer support workflows are hard to capture.
  • Limited in measurement: Analytics data by nature is a time-oriented medium. It doesn’t lend itself well to data that’s not timestamped, such as your gender, your address, or your favorite color.
  • Limited accuracy: Because analytics data is write-once, it’s difficult to make necessary corrections. Revenue recognized from shoes shouldn’t include returned items, but analytics data typically can’t track returns.
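
To make the write-once limitation concrete, here is a minimal Python sketch. The event names, fields, and amounts are hypothetical and not taken from any particular analytics tool. It only illustrates how an append-only purchase log can overstate revenue when a return happens outside the tracked flow.

```python
# Hypothetical, simplified analytics log: events are appended once, never edited.
analytics_events = [
    {"event": "purchase", "order_id": "A1", "amount": 80.0},
    {"event": "purchase", "order_id": "A2", "amount": 120.0},
]

# Hypothetical order records as a database might hold them: status can change later.
orders = {
    "A1": {"amount": 80.0, "status": "completed"},
    "A2": {"amount": 120.0, "status": "completed"},
}


def analytics_revenue(events):
    """Sum purchase amounts exactly as they were recorded at event time."""
    return sum(e["amount"] for e in events if e["event"] == "purchase")


def database_revenue(order_table):
    """Sum amounts only for orders that have not been returned."""
    return sum(o["amount"] for o in order_table.values() if o["status"] != "returned")


# The customer returns order A2 through customer support. The database row is
# updated, but no analytics tag fires, so the event log stays as it was.
orders["A2"]["status"] = "returned"

print(analytics_revenue(analytics_events))  # 200.0 (still includes the returned shoes)
print(database_revenue(orders))             # 80.0
```

The gap between the two numbers is the returned order that the analytics log never heard about.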

2. Database data. Most internet businesses use their database as the authoritative source of truth. This means that the database contains the most accurate data for all parts of the business.

Databases come in various forms. Transactional databases (called OLTP databases) are used to run your live website and store data as customers buy things. Analytical databases (OLAP databases) aggregate and transform this data for the express purpose of analysis.

Mature organizations will use their OLAP database to drive business insight tools such as Tableau. These systems often require significant resources to correctly transform data from production systems into forms that can be queried for analysis. For example, someone has to decide which transactions should count toward recognized revenue.
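
As a rough illustration of that transformation step, the sketch below uses Python’s built-in sqlite3 module as a stand-in for both kinds of database. The table layout, the status values, and the rule that only completed orders count as recognized revenue are all assumptions made for the example; every business defines these for itself.

```python
import sqlite3

# In-memory database standing in for a transactional (OLTP) orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id TEXT PRIMARY KEY, order_month TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("A1", "2016-10", 80.0, "completed"),
        ("A2", "2016-10", 120.0, "returned"),
        ("A3", "2016-11", 50.0, "completed"),
        ("A4", "2016-11", 30.0, "cancelled"),
    ],
)


def recognized_revenue_by_month(connection):
    """Aggregate orders into a monthly summary, the kind of table an
    analytical (OLAP) system would serve to a reporting tool.
    Assumption: only 'completed' orders count as recognized revenue."""
    rows = connection.execute(
        "SELECT order_month, SUM(amount) FROM orders "
        "WHERE status = 'completed' "
        "GROUP BY order_month ORDER BY order_month"
    ).fetchall()
    return dict(rows)


print(recognized_revenue_by_month(conn))  # {'2016-10': 80.0, '2016-11': 50.0}
```

The interesting part is the WHERE clause: it encodes a business decision, which is why this step tends to need careful review and is not a purely mechanical copy.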

While database data is highly accurate, most modern databases are limited in the types and volumes of data they can handle. Even larger-scale OLAP databases such as Amazon Redshift, Vertica, or Teradata can be difficult to scale to petabyte-sized data. They work best at medium scale and have limited support for unstructured data such as text or images.

3. Hadoop & unstructured data. Whenever you hear about Hadoop, just think “big data”. Hadoop is a data storage and processing environment that enables analysis of large volumes of data. Larger organizations often run Hadoop clusters that span thousands or even tens of thousands of machines.

Hadoop is a great place to store “all your data”. This includes customer product reviews, article comments, user-submitted images, and more. These unstructured sources can often hold treasure troves of insights and trends.

The downside of Hadoop is that it’s hard to use. While there have been some recent technology advances, Hadoop remains primarily a tool for data engineers or a holding pen for data before it goes into an analytical database.

4. Siloed data. As the SaaS world evolves, increasing numbers of internet companies rely on third-party applications such as Salesforce, Zendesk, or SurveyMonkey to collect and hold core customer data. Implementing these services is usually easy, and many companies incorporate them deeply into various functions.

The data these services hold is often siloed. It’s not easy to compare Zendesk customer complaints to revenue and customer lifetime value. Nor is it easy to break down customer satisfaction scores collected through SurveyMonkey alongside product usage data.

While these discrete data sources can be useful to marketers on an individual basis, the goal with siloed data should almost always be to un-silo it by joining it into the rest of your core data. You’ll gain significantly better visibility into your customers, which is what every savvy marketer needs.
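
Un-siloing usually comes down to joining records on a shared customer identifier. The following is a minimal sketch with made-up data: the ticket export, the revenue figures, and the `customer_id` key are hypothetical and do not reflect the actual export format of Zendesk or any other service.

```python
from collections import Counter

# Hypothetical export from a support tool: one row per ticket.
support_tickets = [
    {"customer_id": "c1", "subject": "Wrong size"},
    {"customer_id": "c1", "subject": "Late delivery"},
    {"customer_id": "c3", "subject": "Refund request"},
]

# Hypothetical lifetime revenue per customer, taken from the core database.
lifetime_revenue = {"c1": 450.0, "c2": 120.0, "c3": 60.0}


def join_tickets_to_revenue(tickets, revenue):
    """Attach each customer's ticket count to their lifetime revenue.
    Customers with no tickets get a count of 0."""
    ticket_counts = Counter(t["customer_id"] for t in tickets)
    return [
        {
            "customer_id": cid,
            "lifetime_revenue": amount,
            "ticket_count": ticket_counts[cid],
        }
        for cid, amount in sorted(revenue.items())
    ]


for row in join_tickets_to_revenue(support_tickets, lifetime_revenue):
    print(row)
```

In practice the hard part is often agreeing on the shared key (email address, account ID, and so on), since each service tends to identify customers in its own way.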

5. Excel data. Excel is both a fantastic and frustrating tool for data analysis. It’s used extensively by CFOs, analysts, marketers, and pretty much anyone else in a role that requires so-called “last-mile” analysis.

Excel’s major drawback comes from the challenge of getting data out of it for use in other tools, not to mention the difficulty of trying to plug Excel files into live data. The state of the art here is generally saving the file as a CSV and then importing it by hand into its final destination. As most end-marketers know, this is brutally inefficient.
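
One modest improvement over importing by hand is to script the CSV step. The sketch below uses Python’s standard csv module; the column names and sample rows are hypothetical, and an in-memory string stands in for a file saved from Excel.

```python
import csv
import io

# Stand-in for a spreadsheet saved as CSV; the columns are hypothetical.
exported_csv = (
    "email,segment,discount_pct\n"
    "Ann@Example.com ,vip,20\n"
    "bob@example.com,new,10\n"
)


def load_campaign_rows(csv_file):
    """Read rows from an open CSV file and normalize them so they are
    ready to load into another tool."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append(
            {
                "email": row["email"].strip().lower(),  # tidy hand-typed values
                "segment": row["segment"],
                "discount_pct": int(row["discount_pct"]),
            }
        )
    return rows


rows = load_campaign_rows(io.StringIO(exported_csv))
print(rows[0])  # {'email': 'ann@example.com', 'segment': 'vip', 'discount_pct': 20}
```

With a real export you would pass an opened file in place of the in-memory string, for example `open(path, newline="", encoding="utf-8-sig")`, where the `utf-8-sig` encoding tolerates the byte-order mark that some spreadsheet exports add.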

6. Missing data. Data doesn’t cease to exist just because you’re not effectively capturing it. If you haven’t explicitly configured your web analytics to capture a button click on your website, those button clicks will probably be lost forever. Similarly, most brick-and-mortar shops don’t count the number of visitors who enter their store on a daily basis.

Still, these button clicks and in-person customers are quite real, and they have equally real implications for the businesses in question. While capturing 100% of customer or company data is impossible, vigilance is what matters here. Maintaining awareness of your systems’ limitations equips you with tools to identify and fix the gaps in your data. That way you can steadily improve the foundations on which you make business decisions.
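
One simple way to stay vigilant is to compare what one system recorded against another that should have seen the same activity. The sketch below assumes, purely for illustration, that both the database and the analytics tool expose a set of order IDs; the IDs themselves are made up.

```python
# Hypothetical order IDs from the database (treated as the source of truth).
database_order_ids = {"A1", "A2", "A3", "A4"}

# Hypothetical order IDs that appeared in analytics purchase events.
analytics_order_ids = {"A1", "A3"}


def find_tracking_gaps(db_ids, tracked_ids):
    """Return the orders analytics never saw, plus the share it did capture."""
    missing = sorted(db_ids - tracked_ids)
    coverage = 1 - len(missing) / len(db_ids) if db_ids else 1.0
    return missing, coverage


missing, coverage = find_tracking_gaps(database_order_ids, analytics_order_ids)
print(missing)   # ['A2', 'A4']
print(coverage)  # 0.5
```

A check like this will not recover data that was never captured, but it does show where the gaps are and how large they have become.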

Key takeaways:

In a marketing context, understanding customer behaviors across product usage, customer support, and top-level revenue generated for the business can be essential for coordinating email and other retention efforts. And since most marketers don’t have the luxury of a 24/7 on-call data scientist, it’s important for them to learn about these data types and their limitations.

All too often, people reason about critical business decisions using only the data that’s easily available, but that seldom gives them the entire picture. Arriving at the correct decision, and acting on it, requires knowledge of all relevant data collected by your business, as well as ways to close the gaps with data that hasn’t yet been captured.
