A mental model for segmenting 1P, 2P, 3P and external data

Ashish Singal
Jan 4, 2022


Data gets described through many frames of reference and with many competing terms. The data stack and its vocabulary are evolving so quickly that it’s sometimes hard to keep up.

Today, most companies in most industries focus primarily on their own data. Historically, this has been 1P data: data that they’ve collected within their own systems and databases.

Increasingly, with the explosion of SaaS tools and, in parallel, the rise of the cloud-native data warehouse and modern data stack, firms have started pulling in their data from 3P tools. I call this 2nd party data, because it is the firm’s data but it resides in an external system.

Finally, there’s true 3P data: data that is not specific to the firm ingesting it. Traditionally, some industries, especially capital markets, have relied primarily on 3P external data, but more firms in other industries are adopting 3P data strategies as well.

I’ve spent a lot of time thinking about how to segment data by its distance from the firm consuming it, and that thinking frames my own mental model.

Here’s my take —

1P vs 2P vs 3P

1st party: this is data that a company collects and stores in its own data systems. Take, for example, a company that develops a website backed by a database with a users table.

2nd party: this is data that is specific to a company and generated by the company’s activities, but hosted in a 3P service, most often by a SaaS company. Take, for example, a company’s Salesforce instance or Google Ads data.

3rd party: also called “external data.” This is data that is external to the company but still used in the company’s activities. Take, for example, stock market data or weather data.

The key difference between 2nd party and 3rd party data is whether the values in the cells are the same for every company. For Google Ads data, what company A gets is different from what company B gets, so it is 2P data. Stock market data is the same no matter who you are, so it is 3P data.
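As a rough sketch, the rule above can be written as a small decision function. The names here (`DataSource`, `collected_in_own_systems`, `specific_to_company`, `classify`) are hypothetical and exist only for illustration; they reduce the mental model to two yes/no questions and are not a standard taxonomy or library.

```python
from dataclasses import dataclass
from enum import Enum


class Party(Enum):
    FIRST = "1P"
    SECOND = "2P"
    THIRD = "3P"


@dataclass(frozen=True)
class DataSource:
    name: str
    # Assumption: True when the company collects the data in its own systems.
    collected_in_own_systems: bool
    # Assumption: True when the values differ from one company to the next.
    specific_to_company: bool


def classify(source: DataSource) -> Party:
    """Apply the 1P / 2P / 3P rule described in the text."""
    if source.collected_in_own_systems:
        return Party.FIRST
    if source.specific_to_company:
        # The company's own data, but hosted in an external (SaaS) system.
        return Party.SECOND
    # Same values no matter which company is asking.
    return Party.THIRD


sources = [
    DataSource("users table", collected_in_own_systems=True, specific_to_company=True),
    DataSource("Google Ads", collected_in_own_systems=False, specific_to_company=True),
    DataSource("stock market data", collected_in_own_systems=False, specific_to_company=False),
]

labels = {s.name: classify(s).value for s in sources}
print(labels)
```

The only question separating 2P from 3P in this sketch is `specific_to_company`, which mirrors the "are the values the same for every company" test.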

Most discussions about data at companies focus on 1P and 2P data, because this is what drives most of the insights that companies care about. Over the next 5 years, though, I think that 3P (external) data will play an increasingly important role.

Types of 3P / external data

Let’s dive a bit deeper into 3P data and segment it further.

Partner data — data that is shared privately by partners of the firm. For example, a brokerage may collect listings from partner brokerages.

Commercial data — data that is commercially available and bought and sold. For example, a firm purchasing stock market data feeds from an exchange.

Alternative data deserves a mention here. The term is used in the investment vertical to denote data used for investment decisions that falls outside the pricing, reference and fundamental data that has traditionally been purchased. It is usually commercial in nature, but not always. For example, a credit card processor might sell (anonymized) transaction data to hedge funds.

Open (or public) data — data that is in the public domain. For example, data from the Census Bureau or the SEC.

There’s a fourth type of data here that will soon require analysis as a completely separate category:

Decentralized data: data that is completely decentralized and produced by decentralized applications. It is available in the public domain but has no central authority over the data. For blockchains, it is also called “on chain” data. For example, Bitcoin blockchain transaction data.
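One way to make the four external types concrete is to tag each source with a few attributes and derive the type from them. This is only a sketch under stated assumptions: the field names, the order of the checks, and the choice to treat "partner" as the fallback for privately shared, unsold data are all mine for illustration. "Alternative" is modeled as a separate flag because it cuts across types.

```python
from dataclasses import dataclass
from enum import Enum


class ExternalType(Enum):
    PARTNER = "partner"
    COMMERCIAL = "commercial"
    OPEN = "open"
    DECENTRALIZED = "decentralized"


@dataclass(frozen=True)
class ExternalSource:
    name: str
    has_central_authority: bool  # False for on-chain data
    public_domain: bool          # freely available to anyone
    sold_commercially: bool      # bought and sold as a product
    alternative: bool = False    # investment-vertical label, not a type


def external_type(source: ExternalSource) -> ExternalType:
    """Derive the external data type; check order is an assumption."""
    if not source.has_central_authority:
        return ExternalType.DECENTRALIZED
    if source.public_domain:
        return ExternalType.OPEN
    if source.sold_commercially:
        return ExternalType.COMMERCIAL
    # Not public and not sold: assume it is shared privately by a partner.
    return ExternalType.PARTNER


examples = [
    ExternalSource("partner brokerage listings", True, False, False),
    ExternalSource("exchange data feed", True, False, True),
    ExternalSource("Census Bureau data", True, True, False),
    ExternalSource("Bitcoin transactions", False, True, False),
    ExternalSource("card transaction data", True, False, True, alternative=True),
]

types = {e.name: external_type(e).value for e in examples}
print(types)
```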

I’m a strong believer that 3P data in general, and decentralized data in particular, will grow significantly in importance over the next decade.
