Market data pricing: part 2 of many

Published in Proof Reading, May 6, 2019

Previously on market data pricing, we laid out the many input variables that can affect the market data bills that broker-dealers pay to stock exchanges to consume quote and trade data. In this episode, we will lay out the common high-level structure and principles of fees across exchange families, along with some of the differences between policies at NYSE, Nasdaq, and CBOE and how those policies have evolved over time.

The three major exchange families in this domain are kind of like the major health insurance conglomerates in the US: the details of the billing across exchanges are different, and their intricacies are each the aggregate result of many layers of incremental changes over a drawn-out history, but they all employ similar tricks and high-level constructs to result in a high bill, no matter where you go.

Our goal today is to orient ourselves in the high-level concepts driving a market data bill, and prepare ourselves to fully dissect the details. There are different billing regimes for real-time data access vs. historical data, so for the next few episodes of our market data series, we’ll focus solely on the more expensive real-time data landscape.

Anatomy of a market data bill

All three exchange families charge real-time market data fees that break down into three main types:

1. flat fees for internal and external distribution

For any given data product, there is typically a flat fee that depends on whether the recipient is re-distributing the data externally or not. In the Nasdaq family, the flat fee also depends on whether the recipient is getting the raw data format output by Nasdaq, or something pre-processed by an intermediary. The flat fee portions for various data products are often several thousand dollars a month (ranging even above $10,000 in some cases), but can also be several hundred dollars or even $0 for some product/exchange combinations. Top of book, last sale, and trade feeds typically have lower flat fees than depth of book feeds, and some exchanges, such as NYSE National and IEX, charge no fees at all.

2. non-display fees driven by usage or infrastructure

For some data products, there is a non-display fee that is driven by categorization of the use cases of the data, or in the case of Nasdaq products, driven by the infrastructure that processes the data and performs computations on it. Non-display fees have a very wide range, from $0 on many products, to a few thousand dollars a month on several products, to as much as $100,000 a month at the highest end. Admittedly, the use cases near that point become a little silly: does anyone really need to operate three separate dark pools?

3. user fees that scale by people and/or accounts

There are also “display fees” for many products that are charged based on human beings looking at the data. The basic principle here is simple: the more eyeballs, and/or the more account credentials granting access to those eyeballs, the higher the fee. These fees multiply with scale, typically in the simple form of (Cost per unit) × (Number of units). They start rather small if there are only a small number of humans (e.g. mere tens of dollars per person), but the multiplication by units can cause them to add up for big corporations, or for small corporations who re-distribute data to a large external user base. At the scale of, say, hundreds of employees, these fees start to exert a toll comparable to the flat distribution fees or non-display fees.
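To make the three-part anatomy concrete, here is a minimal sketch of how the pieces might add up for a single data product. The `ProductFees` class, the `monthly_bill` function, and every dollar figure below are hypothetical placeholders picked to fall inside the rough ranges mentioned above; they are not taken from any exchange's actual price list.

```python
from dataclasses import dataclass


@dataclass
class ProductFees:
    """Monthly fee inputs for one data product (all values hypothetical)."""
    flat_fee: float          # flat distribution fee, USD per month
    non_display_fee: float   # non-display fee, USD per month
    per_user_fee: float      # display fee per counted unit, USD per month


def monthly_bill(fees: ProductFees, units: int) -> float:
    """Flat fee + non-display fee + (cost per unit) x (number of units)."""
    if units < 0:
        raise ValueError("units must be non-negative")
    return fees.flat_fee + fees.non_display_fee + fees.per_user_fee * units


# Placeholder rates chosen only to show the shape of the curve.
example = ProductFees(flat_fee=5_000.0, non_display_fee=2_000.0, per_user_fee=50.0)

for headcount in (5, 50, 500):
    user_part = example.per_user_fee * headcount
    total = monthly_bill(example, headcount)
    print(f"{headcount:>4} users: user fees ${user_part:>9,.2f}, total ${total:>9,.2f}")
```

With these made-up rates, the user fees are a rounding error at 5 people and larger than the flat and non-display fees combined at 500, which is the scaling effect described above.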

Implications of Market Data Fee Structures

It’s worth pausing for a moment to consider the effects of fee structures that include both significant fixed costs and significant scaling costs.

High barriers to entry

Flat fee structures can create a high barrier to entry for small, new entrants in the financial ecosystem. The flat fees for things like “internal distribution” and for vanilla non-display use cases like “agency trading” mean that an upstart broker consuming data products directly from the exchanges will rack up a high monthly fee from the first day of trading and continuing in perpetuity, even if its market share and number of employees/machines/etc. stay quite small.

We were most shocked to discover that this remains true for the SIPs (the Securities Information Processors). The SIPs were first established by regulatory action in the 1970s, and they were intended to represent a sufficient and reasonably accessible source of consolidated price data. As things stand today, there are two SIPs: one that is produced by NYSE for Tape A and Tape B stocks, and one that is produced by Nasdaq for Tape C stocks. The SIPs provide top of book and last sale information from each exchange. The flat fees an agency broker-dealer pays to consume both of these “public” data feeds from NYSE and Nasdaq in real time add up to more than $10,000 a month. That might seem inconsequential for big firms, but it is a hefty fixed cost for a startup to add to its other expenses. Big Wall Street firms often talk about the SIPs in a light that paints them as cheap but inferior to the proprietary depth of book market data products that the exchanges offer. But it’s important to note that “cheap” here is a very relative concept, and more than $10,000 a month is not a number to be ignored when thinking about how competition is enabled or disabled in the broker layer of the financial system.

Value extraction

Fees that scale rather than staying flat (e.g. the display fees and the Nasdaq non-display fees that grow with the number of servers) can have a different kind of effect on a market ecosystem. Scaling fee structures are rather common in technology services like cloud computing, where the cost of providing the service (e.g. the maintenance and energy costs of servers) scales naturally with usage. But it’s important to note here that the costs of extra hardware, energy, human management etc. for supporting more users past the point of data ingestion are borne by the company receiving the data, not by the exchanges providing the data. So the scaling nature of usage fees due to the exchanges is not a natural reflection of scaling costs.

From the perspective of an exchange charging such a fee, the scaling behavior may be a way of extracting higher fees from customers who are larger and can (presumably) afford to pay more. It also allows exchanges to profit off of the scaling-based business models that technology vendors may employ [think of a leech that is nourished by a thriving host]. And it serves to protect some market data revenue from consolidation effects: if two firms who were separately paying market data fees merge, the new bill for the now larger single firm will include only one instance of each flat fee (where before there were two), but the usage fees will be the sum of the prior usage fees. Well, not really, because some unlucky souls and/or machines were likely jettisoned as a result of the merger, but you get the idea.
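The merger arithmetic can be sketched in a few lines. This is a simplified illustration under loud assumptions: both firms subscribe to the same products and so pay the same flat fees, nobody is let go after the merger, and the dollar amounts are invented for the example.

```python
def combined_bills_before(flat: float, usage_a: float, usage_b: float) -> float:
    """Two separate firms: each pays the flat fees plus its own usage fees."""
    return (flat + usage_a) + (flat + usage_b)


def bill_after_merger(flat: float, usage_a: float, usage_b: float) -> float:
    """One merged firm: a single set of flat fees, but usage fees simply add."""
    return flat + (usage_a + usage_b)


# Hypothetical monthly figures in USD.
flat, usage_a, usage_b = 10_000.0, 3_000.0, 4_000.0

before = combined_bills_before(flat, usage_a, usage_b)
after = bill_after_merger(flat, usage_a, usage_b)

print(f"before merger: ${before:,.2f}")
print(f"after merger:  ${after:,.2f}")
print(f"revenue lost by the exchange: ${before - after:,.2f}")
```

Under these assumptions the exchange loses exactly one set of flat fees and keeps every dollar of usage fees, so the larger the usage-based share of the bill, the less a wave of consolidation costs it.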

In combination, the flat fees and scaling fees that make up a real-time market data bill form a steep barrier to entry for new brokers, while somewhat limiting the incentives for existing brokers to merge. Not surprisingly, the fee structure overall reinforces the status quo, and does little to encourage evolution in the broker-dealer landscape.

Points of divergence between exchange families and inflection points in time

In trying to extract human-retainable meaning while wandering through the idiosyncratic tangle of market data products and policies currently offered by the exchanges, it helps to keep historical context in mind. Market data products, the systems used to access and leverage them, and their pricing policies are co-adaptive and reactive to each other. The scaling user fees are easiest to think about in the context of the Bloomberg terminal, which was designed with the goal of weaponizing market data in the hands of a human trader. As Bloomberg put it in his autobiography:

“Back then, most Wall Streeters didn’t understand the language of general-purpose computers. It wasn’t intuitive. … We built our own compact, low-priced workstations so we could give the reliability that a single-purpose, single-user machine provides. (PCs and mainframes have to do everything with everybody. By comparison, we, with our own ‘closed,’ custom-built hardware and software, could focus on a single task with perfect machine compatibility.) We designed our own color-coded, easy-to-use, small keyboard for the limited space our customers had in front of them.”

[Aside: I have to say that while reading this portion of Bloomberg’s book, I was struck by how much we have in common. It seems we both share a disdain for unmotivated complexity and generality in engineering design, as well as a deep-seated tendency towards an excessive use of hyphens.]

In the historical context of individual terminal machines used by individual traders, market data was perhaps originally viewed mainly as a tool to assist a human trader, performing largely the same tasks in a similar way as before, but with more ease and better data inputs. For Bloomberg to charge by the terminal certainly makes sense, and exchange user fees parasitically attach themselves to this intuition.

Units for “Users”:

This context also sheds light on how we got to the annoyingly muddled state of terminology in today’s market data policies, where words like “device,” “user,” and “subscriber” are used somewhat carelessly across exchange families in related but inconsistent ways. If we imagine ourselves in an earlier period of technology, say 1990 (wow, I feel old!), it would make sense to use words like “device” and “user” and “subscriber” interchangeably, as it was more typical for one human person to have one device for viewing stock data, and to access that device with one set of credentials. But nowadays, we have lots of diverging choices for how to count “display access.” Do we count the number of humans (which is what CBOE does)? Do we count the number of account credentials issued to the humans (which is what Nasdaq does)? (Nasdaq goes even a bit further into the weeds on this and considers user account capabilities for simultaneous accesses.) Do we leave our terms like “users” brazenly under-defined and refer any questions to “your NYSE Market Data Administration Account Manager” (which is what NYSE does)? [Aside: shout out to any elementary students who put down “NYSE Market Data Administration Account Manager” when asked what they want to be when they grow up! Way to dream pragmatically, kids! Also, “your NYSE Market Data Administration Account Manager” would make a great card for a Wall Street-themed Cards Against Humanity expansion pack.]
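A toy example shows how the choice of unit changes the count for the very same set of people. The `Access` record, the names, and the credentials below are all made up, and the two counting functions are deliberate simplifications of the policies characterized above (for instance, they ignore the simultaneous-access wrinkle).

```python
from typing import List, NamedTuple


class Access(NamedTuple):
    """One grant of display access (hypothetical record layout)."""
    person: str
    credential: str


records: List[Access] = [
    Access("alice", "alice-desk"),
    Access("alice", "alice-mobile"),  # same human, second set of credentials
    Access("bob", "bob-desk"),
]


def count_humans(accesses: List[Access]) -> int:
    """Count distinct people, regardless of how many logins each one holds."""
    return len({a.person for a in accesses})


def count_credentials(accesses: List[Access]) -> int:
    """Count distinct account credentials, regardless of who holds them."""
    return len({a.credential for a in accesses})


print("billable units if counting humans:     ", count_humans(records))
print("billable units if counting credentials:", count_credentials(records))
```

The two functions disagree as soon as one person holds more than one login, which is exactly where the "one person, one device, one set of credentials" intuition from 1990 breaks down.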

But of course, flexibility in the definitions of units alone has not proven sufficient to keep up with technological change in the US equities market. A new type of algorithmic trading emerged, powered not by human traders pressing buttons in reaction to market data they could view on a screen, but by powerful computers churning through vast amounts of data and generating orders on microsecond time scales, with no human in the loop. Naturally, the exchanges weren’t inclined to let the bulk of their market data revenue vanish into this algorithmic trading blind spot: enter non-display fees, which are a remarkably recent innovation in billing.

Non-display breakdown

For CBOE and Nasdaq exchanges, non-display fees are only incurred on the depth of book products. This is pretty targeted at algorithmic trading, as human traders are not likely to flash a full depth of book feed on a screen and make quick sense of it. For Nasdaq exchanges, the non-display fees depend upon the number of servers processing the data. For CBOE exchanges, the non-display fee is higher if a broker-dealer is using the data to run one or more of its own trading platforms, and lower if a broker-dealer is using the data only for other purposes (e.g. routing agency and/or proprietary trades). For NYSE exchanges, non-display fees apply to most products, and accumulate for each category of data use, where proprietary trading, agency trading, and execution platforms all count as separate categories.
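The three non-display approaches can be caricatured as three small functions. This is only a sketch of the shapes described above. The function names, the linear per-server rate, the single per-category rate, and all parameters are assumptions made for illustration; the real fee schedules are more intricate, and no actual prices appear here.

```python
from enum import Enum
from typing import AbstractSet


class UseCategory(Enum):
    """NYSE-style categories of non-display use named in the text."""
    PROPRIETARY_TRADING = "proprietary trading"
    AGENCY_TRADING = "agency trading"
    EXECUTION_PLATFORM = "execution platform"


def nasdaq_style_fee(servers: int, fee_per_server: float) -> float:
    """Fee grows with the number of servers (linear growth is an assumption)."""
    return servers * fee_per_server


def cboe_style_fee(runs_own_platform: bool, platform_fee: float, other_fee: float) -> float:
    """Higher fee if the firm runs its own trading platform, lower otherwise."""
    return platform_fee if runs_own_platform else other_fee


def nyse_style_fee(uses: AbstractSet[UseCategory], fee_per_category: float) -> float:
    """Fee accumulates once per category of use (one shared rate is an assumption)."""
    return len(uses) * fee_per_category


# Hypothetical firm: 4 servers, no platform of its own, trading both
# on an agency and a proprietary basis. All rates are placeholders.
print(nasdaq_style_fee(servers=4, fee_per_server=500.0))
print(cboe_style_fee(runs_own_platform=False, platform_fee=5_000.0, other_fee=2_000.0))
print(nyse_style_fee({UseCategory.AGENCY_TRADING, UseCategory.PROPRIETARY_TRADING}, 3_000.0))
```

The point of the comparison is that the same firm is measured along a different axis by each family: machine count, business model, or list of use cases.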

As documented in a recent SIFMA report (https://www.sifma.org/wp-content/uploads/2019/01/Expand-and-SIFMA-An-Analysis-of-Market-Data-Fees-08-2018.pdf), the NYSE family began introducing non-display and re-distribution fees across its market data products in 2013. In recent times, the main exchange families have continually updated their fees, building a billing framework piece by piece and product by product that is designed to keep pace with technological innovation and extract fees from whatever infrastructures are adopted by their market data customers. Interestingly, the Nasdaq approach of counting physical servers may already be in danger of becoming outdated, as it’s not clear how best to interpret or adapt this policy to the likely future of cloud computing in the US equities space.

The history of market data products and their pricing regimes is rich and strangely fascinating, and we will come back to it a bit later in this series when we discuss the regulatory framework for scrutinizing market data fees. But for our more immediate purpose of reaching a human-comprehensible understanding of market data products and pricing, this is enough history and high-level context for now. Our next post will dig into a few flagship data products as examples and get our hands dirty in the details of exactly how much they cost under a reasonable range of business cases.