Introduction to Ticker Plant and Distribution in the Cloud

FactSet · Jul 4, 2023 · 5 min read

FactSet’s ticker plant cloud journey started like many in the financial data space. We began with the distribution architecture, leveraging the cloud’s vast global network of datacenters to deliver data to clients as closely and efficiently as possible. The next problem we wanted to solve was reducing latency between clients and their local exchanges.

Traditionally, to take in exchange feed data, one would put a point of presence (PoP) near the exchange and deliver the data from there through a distribution network to the client. In areas with a high density of markets, like the U.S. and Europe, this means setting up a few PoPs in major financial centers, and nearby markets can flow through those. For example, ping times from London to Frankfurt are under 20 ms. However, in areas like Asia, where the markets are spread out, the distances, and therefore the latencies, are large. One can no longer rely on just a few PoPs to meet latency requirements: the ping time between Tokyo and Hong Kong is closer to 50 ms, and Tokyo to Mumbai is over 120 ms. This quickly becomes costly and difficult to maintain. Leveraging cloud regions allowed us to quickly get closer to these markets and shorten the distance the data travels to local clients. After a few migrations we concluded that feed handling in the cloud would work for smaller feeds, but how would it work for a larger feed?

Which feed?

Most large feeds are multicast, but the cloud is not currently set up for multicast ingestion at scale. There are two standard ways around this: tediously partitioning the data and using virtual routers to get it piecemeal into the cloud, or taking the data on premises and sending it up to the cloud via TCP. We went with the latter: ingest and convert to TCP on premises, and leave the more complex logic and computation to the cloud.
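As a simplified illustration of this bridge pattern (not our actual implementation, and omitting the arbitration and conflation described later), the on-premises side joins a multicast group, reads datagrams, and forwards them length-prefixed over a single TCP connection to a cloud endpoint. The group, port, and endpoint below are placeholders:

```python
import socket
import struct

# Placeholder addresses -- not the actual exchange or FactSet endpoints.
MCAST_GROUP = "239.1.1.1"      # multicast group carrying one exchange line
MCAST_PORT = 30001
CLOUD_ENDPOINT = ("cloud-ingest.example.com", 9000)

def bridge_multicast_to_tcp():
    # Join the multicast group on the local interface.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    udp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    udp.bind(("", MCAST_PORT))
    mreq = struct.pack("4sl", socket.inet_aton(MCAST_GROUP), socket.INADDR_ANY)
    udp.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    # Open a TCP connection to the cloud-side feed handler.
    tcp = socket.create_connection(CLOUD_ENDPOINT)

    while True:
        packet, _ = udp.recvfrom(65535)
        # Length-prefix each datagram so the cloud side can reframe the stream.
        tcp.sendall(struct.pack("!I", len(packet)) + packet)

if __name__ == "__main__":
    bridge_multicast_to_tcp()
```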

With many feeds to choose from, one jumps out as worth approaching: U.S. Options. The U.S. Options feed from the Options Price Reporting Authority (OPRA) is one of the largest in the world and the largest of our workloads; in aggregate it can have microbursts over 30 Gbps and is spec’d for 100 Gbps. The data comes in on 48 multicast lines from the exchange and is periodically rebalanced to better distribute traffic. These rebalances occur just a few times a year, and between them the distribution of data across the multicast lines can become quite imbalanced.

Leading up to the rebalancing on March 27, 2023, data on one of the lines peaks at over 100 Mbps; after the rebalancing it drops to about 40–50 Mbps.

Zooming in on the Friday before the rebalancing, March 24th, we see peaks over 100 Mbps. The following Monday, after the rebalancing, the same line runs at noticeably lower rates.

The ability to rightsize on a per-line basis and grow as needed is an attractive aspect of the cloud. On premises, there were two ways to approach the issue: run larger servers with multiple lines on each, or run many smaller servers with one line each. The first approach meant spending a lot of operational time moving lines around so that each server carried a reasonably balanced aggregate load. The second meant many servers were over-provisioned, along with the general maintenance burden that comes with running more servers. By using the cloud, we can put one line on each instance and then resize that instance as its data grows, eliminating many hours of manual work or overspending compared to on premises.

The setup

The code was already fairly modular because we needed to move it around frequently on premises, so we spent most of our time configuring it and writing Infrastructure as Code (IaC) to get it running in the cloud, rather than changing the code itself. Once that was done, the setup was simple: take the data on premises over multicast so we could arbitrate and work around the limits of the cloud, do some basic conflation like we do today, and send it up to the cloud for the more CPU-intensive processing.
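For illustration only (this is a sketch, not our production IaC, and the AMI, subnet, and line names are hypothetical), the core idea is one instance per OPRA line with a per-line instance type that can be changed and re-applied as a line grows. Expressed imperatively in Python with boto3:

```python
import boto3

# Hypothetical per-line sizing map: one instance per OPRA multicast line,
# sized to that line's observed traffic.
LINE_INSTANCE_TYPES = {
    "opra-line-01": "c5n.xlarge",
    "opra-line-02": "c5n.large",
    "opra-line-03": "c5n.2xlarge",
    # ... one entry per line
}

AMI_ID = "ami-0123456789abcdef0"        # placeholder feed-handler image
SUBNET_ID = "subnet-0123456789abcdef0"  # placeholder subnet

ec2 = boto3.client("ec2")

for line, instance_type in LINE_INSTANCE_TYPES.items():
    # Launch (or, in real IaC, declare) one instance per line.
    ec2.run_instances(
        ImageId=AMI_ID,
        InstanceType=instance_type,
        SubnetId=SUBNET_ID,
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": line}],
        }],
    )
```

Resizing a line then comes down to editing its entry in the map and re-applying, rather than physically shuffling processes between servers.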

We chose the c5n series, which is compute and network optimized. We started with all the hosts at the “smallest” size, the c5n.large, checked the capacity charts to see which lines and instances were bottlenecked and on which metric, and kept resizing. In the end, we settled on three sizes. On prem, this process could have taken months, with different hardware ordered, vetted, and installed; we finished it all in a few weeks. We set up alerting so that as lines grew, we knew well in advance when to move from one size to the next for the following day. We immediately saw increased stability. While we had moved to the cloud, we left our on-premises setup as a fallback. Within a month, we started seeing our capacity alarms fire on premises, causing the usual “process shuffle.” In the cloud, we did not see this. When an alarm did trip in the cloud, we changed the instance size for the following day and were done.
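As a rough sketch of what such a capacity check can look like (not our actual alerting, and with a hypothetical size ladder and thresholds), one can pull each line’s peak NetworkIn from CloudWatch and suggest the next size up when a line outgrows its headroom:

```python
from datetime import datetime, timedelta, timezone

import boto3

# Hypothetical size ladder and per-size headroom limits
# (bytes of NetworkIn per 5-minute period); real values come from testing.
SIZE_LADDER = ["c5n.large", "c5n.xlarge", "c5n.2xlarge"]
THRESHOLDS = {
    "c5n.large": 50_000_000_000,
    "c5n.xlarge": 100_000_000_000,
}

cloudwatch = boto3.client("cloudwatch")

def peak_network_in(instance_id: str) -> float:
    """Peak NetworkIn (bytes per 5-minute period) over the last day."""
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="NetworkIn",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=now - timedelta(days=1),
        EndTime=now,
        Period=300,
        Statistics=["Maximum"],
    )
    return max((p["Maximum"] for p in stats["Datapoints"]), default=0.0)

def next_size_if_needed(instance_id: str, current_size: str) -> str | None:
    """Return the next size up if the line has outgrown its headroom."""
    limit = THRESHOLDS.get(current_size)
    if limit is not None and peak_network_in(instance_id) > limit:
        return SIZE_LADDER[SIZE_LADDER.index(current_size) + 1]
    return None
```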

Where do we go from here?

We are anticipating the announced doubling of OPRA lines at the end of July, and by being in the cloud we are ready for it. We can run our instances with reasonable sizing and rapidly adjust and rightsize for the new setup once we familiarize ourselves with it. Aside from the feed handling on prem, there is no pressure to get the cloud sizing for the processing exactly right on day one: we will start big, move down as needed, and adjust on the fly, showing the flexibility of the cloud in action.

Author: Anant Singh (Principal Software Architect)

Editors: Gregory Levinsky (Marketing Content Specialist) & Josh Gaddy (VP, Director, Developer Advocacy)
