How to Monetize Data Networks: Focus on Acquisition and Usage
Data networks can use five out of six possible monetization models, depending on the way they acquire data from users and the way that data is consumed.
Data networks are unique within the world of network effects. Most network types create value by allowing participants to interact with each other in some way. Data networks, however, do not connect participants directly. Instead, they crowdsource data from participants to improve the product for all of them. This has a direct impact on the way they monetize. For one, it automatically invalidates one of the monetization models used by other networks — interaction taxes (or commissions). Since there are no direct interactions between participants, they cannot be taxed. So data networks are left with five of the six monetization models I have previously listed.
As I explained in the case of interaction networks and marketplaces, the choice of monetization model depends on the relationship between participants. This is because monetization (or capturing value) should be aligned with the product’s primary value proposition (creating value). But on a data network, users are only connected by crowdsourced information — the information they feed into the product and the information they consume. As a result, the only possible determinants of monetization are the method of data collection (data acquisition) and the nature of product usage (data consumption).
First, let’s take a look at data acquisition. As I have previously explained, there are two broad ways in which data networks acquire data from users:
- Active crowdsourcing, where users have to actively and directly engage with a product for data to be collected, e.g. Tripadvisor or Waze. In other words, the quality of the product directly depends on user engagement.
- Passive crowdsourcing, where data is automatically acquired from all product adopters irrespective of direct engagement, e.g. Mapbox or XANT. In other words, the quality of the product purely depends on user acquisition, not user engagement.
Of course, these approaches are not binary. Some products like Moovit and Truecaller use a combination of both methods for different datasets and, hence, fall somewhere between these two extremes.
Next, data consumption can happen in two broad ways:
- Some data networks like Tripadvisor and Waze need to be used actively and intentionally by end-users to get value from them.
- Others, like Mapbox, are passively used and tend to be embedded in third-party products or workflows. While they may power some capabilities, they don’t need to be used intentionally by end-users to extract value from them.
Again, these two facets are not binary. Many products fall somewhere between these two extremes — for example, those that primarily provide alerts, e.g. Moovit.
Now let’s take a look at how data acquisition and consumption affect each potential monetization model.
Advertising
Advertising requires a large number of highly engaged users, which is only possible with data networks that are actively used. It is also most effective when engagement is self-perpetuating, i.e. engagement leads to more engagement. This only occurs when users need to actively feed data into the product to improve it for all users. In other words, advertising is most effective when a data network collects data via active crowdsourcing.
Tripadvisor is a great example of this. Users obviously need to engage with Tripadvisor to get value from it. In addition, this engagement is only possible when numerous contributors actively write new reviews. This results in a high volume of active engagement — and valuable advertising real estate. Waze is another example as it requires active engagement which is created by contributors reporting traffic incidents. This allows it to monetize the resulting engagement with location-based advertising.
Premium Network Tier
As I have explained before, premium network tiers onboard users with a free product and monetize with an optional, paid tier that offers enhanced interaction opportunities. But as we know, there is no way for users to directly interact with each other on a data network. Interactions are limited to feeding data into the product and consuming it. So in this case, a premium network tier would offer enhanced access to the data collected by the network. In other words, the free tier restricts data consumption opportunities until users upgrade to a paid tier.
As a result, it is difficult to use this model with data networks that need to be used actively to realize their value proposition (e.g. Waze). Actively used data networks drive engagement through data availability, so restricting that availability would also handicap their value proposition. For example, Waze would see a significant drop in engagement (and conversion) if it limited data access by the type of incident reported. Premium network tiers are therefore a better fit for passively used data networks.
Note that premium network tiers only restrict data consumption, not data acquisition. All users — on both free and paid plans — can contribute to the data network.
Grammarly is a good example of a premium network tier in action. It is an AI-enabled writing checker and collects data from all Grammarly users (paid and free) to inform its writing suggestions. However, its free tier only checks writing for grammar, spelling, and conciseness. Readability, vocabulary, and genre-specific suggestions are only available to paid users. Grammarly is typically used as a browser extension or mobile keyboard replacement and automatically checks writing in the background. As a result, limiting data access does not affect engagement opportunities with the product.
Mapillary is another good example. It crowdsources street-level images from users, combines them with open-source maps, and extracts details like objects to create 3-D maps. Its map features are available to all users for free. However, imagery and derived map data are only available on its paid tier. Since its data is typically embedded in other products, data restrictions do not limit engagement.
Keep in mind that this type of monetization requires premium tiers to offer enhanced data access. It does not include premium plans that simply offer additional features (e.g. Truecaller) or increased seat/user limits (e.g. Mapbox).
Paywalled Data Network
A paywalled data network requires users to pay before accessing or sharing any data with the rest of the network. This creates conditions that conflict with active crowdsourcing. As I have previously explained, active crowdsourcing requires a meaningful number of contributors to kickstart a data network effect. Since contributors tend to be a small minority of users, data networks need to maximize the top of the funnel (overall user adoption) to increase the value of the product. Putting the entire network behind a paywall interferes with this goal. As a result, paywalls can only be used in conjunction with passive crowdsourcing, i.e. with products that automatically collect data from all customers.
In addition, it can be very difficult to combine active usage with the kind of passive crowdsourcing seen on paywalled data networks. Paying customers are often unwilling to share their activity data (e.g. sales intelligence or contacts) directly with others. They are paying to use your product after all, and privacy is likely to be a purchase consideration. Passively crowdsourced data tends to be more palatable when it is used to inform a recommendation algorithm (e.g. XANT), improve the capabilities of an embedded product (e.g. Mapbox), or inform automated alerts (e.g. Nexar). Each of these use cases describes varying degrees of passive usage. This is another requirement for using a paywall.
XANT (previously Insidesales.com) is one example. It is an AI-based sales engagement product sold on an annual subscription. XANT requires customers to grant access to their contact database and emails when they sign up. From this point on, it automatically collects data on all interactions between salespeople and their prospects/customers to inform its recommendation algorithm. These recommendations help salespeople target the right accounts and contacts to improve sales productivity.
Nexar is another example, except that a device purchase — a dashcam — acts as a paywall. The dashcam then auto-detects road incidents/hazards and shares them with other Nexar users in the area. Since data is collected automatically, the presence of the paywall does not strain liquidity.
Note that freemium pricing purely based on the number of seats or end-users (e.g. Mapbox) can also be considered a paywall, rather than a premium network tier. This is because limiting the number of users in a free plan also restricts data acquisition until a customer upgrades to a paid plan.
Complementary Products
Complementary products are best described as paid add-ons that enhance the core value proposition of a data network (without varying data access). Put simply, they are standalone features or capabilities that increase the value a customer gets from the product. Since this does not improve data quality or access, it is difficult to implement in products that “work in the background”. Users have to actively use the product to get value from these add-ons. And, as we saw with advertising, this is most effective when engagement begets more engagement, i.e. it is a better fit for data networks relying on active crowdsourcing. That said, these requirements are less stringent than they were for advertising. This makes complementary products a good alternative when engagement isn’t high enough to create valuable advertising real estate.
Transit tickets sold on public transport apps like Transit and Moovit are obvious examples of complementary products. Their usage patterns fall somewhere between active and passive — certainly not enough to monetize with ads. Instead, allowing users to buy transit tickets in their apps enhances their value proposition while being aligned with their data acquisition and consumption patterns.
Some of Truecaller’s premium features — premium badges, premium support, call recording, advanced spam blocking, and removing ads — can also be described as complementary products.
Derived Products
Finally, derived products are the last and most flexible monetization model used by data networks. A derived product leverages the interactions and engagement on a network to produce an asset that can be monetized directly. On data networks, this amounts to productizing data generated by free users into a standalone product for third parties. This is the hallmark of a derived product — paying customers are different from users generating the data. That said, the users generating data still need to consume it for a feedback loop (and data network effect) to be present.
Since monetization is completely detached from the users who make up the data network, data acquisition and product usage patterns have no impact on whether this model can be used.
Moovit’s urban mobility analytics is a good example of a derived product. Moovit collects anonymized location data from its users and also has a community of users who update data about public transport links. It combines these datasets to calculate real-time congestion, arrival times, etc. This is then packaged as an intelligence offering sold to city authorities, public transport agencies, etc. Nexar’s Citystream and Mapillary’s data subscriptions for cities are similar offerings.
Along the same lines, ZoomInfo monetizes passively acquired data from its Community Edition product with a paid-for sales intelligence database. And Estimize monetizes its crowdsourced stock ratings by selling data access to quantitative hedge funds and other institutional investors.
The Data Monetization Map
The monetization map below summarizes the interplay between data acquisition, data consumption, and each individual monetization model.
As we can see, there isn’t much overlap between these monetization models, save for two exceptions. First, advertising and complementary products can sometimes co-exist — and are usually presented to users as alternatives. For example, Truecaller monetizes with advertising, but also has an ad-free option available for a subscription (complementary product). Second, derived products are independent of data acquisition and consumption patterns, and so can be combined with any monetization model except for advertising. This is because the data used for targeting is a competitive advantage to attract advertisers — monetizing that data can conflict with the goal of maximizing advertising revenue.
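To make the fit between each model and the two dimensions concrete, here is a minimal Python sketch of the rules described in the sections above. It is a simplification under stated assumptions: acquisition and consumption are treated as binary, even though many products sit between the two extremes, and the names `Mode` and `candidate_models` are hypothetical, introduced only for illustration. Treat it as a reading aid, not a definitive decision procedure.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    """Hypothetical binary label for how data is acquired or consumed."""
    ACTIVE = "active"
    PASSIVE = "passive"


def candidate_models(acquisition: Mode, consumption: Mode) -> List[str]:
    """Return the monetization models that fit a data network.

    Assumption: both dimensions are binary. Real products (e.g. Moovit)
    can fall between the extremes, which this sketch ignores.
    """
    models: List[str] = []

    # Advertising and complementary products both need active usage and
    # work best when active crowdsourcing keeps engagement self-perpetuating.
    if acquisition is Mode.ACTIVE and consumption is Mode.ACTIVE:
        models += ["advertising", "complementary products"]

    if consumption is Mode.PASSIVE:
        # Restricting data access does not hurt engagement when the
        # product works in the background.
        models.append("premium network tier")
        # A paywall additionally needs data to be collected
        # automatically from every paying customer.
        if acquisition is Mode.PASSIVE:
            models.append("paywalled data network")

    # Derived products are sold to third parties, so they do not depend
    # on either dimension.
    models.append("derived products")
    return models


if __name__ == "__main__":
    for acq in Mode:
        for use in Mode:
            print(acq.value, use.value, candidate_models(acq, use))
```

Note that the function lists candidates, not combinations. For example, advertising and derived products both appear for an actively crowdsourced, actively used network, but combining them can conflict with the goal of maximizing advertising revenue.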
To wrap up, data networks can choose between five out of six possible monetization models, depending on the method of data acquisition and consumption. The only way to change these constraints, and your monetization options, is by layering new types of network effects. For example, Tripadvisor and Mapillary have layered on marketplaces to unlock a commission-based revenue stream. Similarly, Truecaller layered on an interaction network by enabling contact requests, which allowed it to monetize with a premium network tier as well, i.e. offering enhanced interaction opportunities for a fee.