Building an Exchange

Jim Greco
5 min read · Dec 16, 2014

Part 1: Communication Protocols

Direct Match is launching the first all-to-all trading venue for fixed income. Our mission is to level the playing field for the buy-side in the US Treasury market and give them the same access to the market that the sell-side has always enjoyed. As the CTO I’m committed to making sure we extend this vision all the way down to how bytes get shifted around the system.

The Principles of an Exchange

There are six core principles of our exchange that guide the technical architecture decisions we’ll make:

  • Fairness — Subscribers should be able to receive market data at the same time and no single subscriber can unfairly dominate the exchange’s resources
  • Simplicity — The rules of engagement should be easy to understand and shouldn’t give an advantage to parties with different levels of sophistication
  • Recovery — Software breaks (a lot), but you can never lose a single trade
  • Latency — Customers are managing risk across many different venues and we shouldn’t introduce arbitrary delays into their trading
  • Determinism — It should be simple to explain to subscribers how and why all events on the system happen
  • Scalability — Exchanges should be able to scale to tens of thousands of subscribers with minimal marginal impact

Distributing Workloads

The workload of a trading system is typically distributed across multiple machines (scaling to thousands in the largest installations). Since we want to optimize every last nanosecond, it makes a lot of sense to begin with the lowest level that we have control over: how all these machines talk to each other [1].

Communication over TCP

The way most computers talk to each other is point-to-point. It’s how you got to this blog. Your web browser initiated a TCP/IP connection with Medium’s server and Medium sent you back the HTML, CSS, JS, and images over the same connection. This works well for a web page because you’re not really competing against other participants for the web page’s resources (they get the same content whether you come in a millisecond before or 5 seconds later).

In an exchange, though, there is competition for scarce resources. A person leaving an order is only willing to trade a specific size, so it makes a big difference whether you come in 1 millisecond after someone else who wants the same resource. To demonstrate this point further, let’s come up with a hypothetical trading system where everyone is connected directly into the matching engine through TCP.

(i) Elite Trading sends a buy order (ii) Matching engine sends market data to participants in sequence (iii) Done-not-done and The B-Team send sell orders to match

If your communication protocol is based on TCP/IP, then the matching engine has to send a message individually to each connected participant every time there is an update in the market data. An order, trade, or other message is generated, then copied and pushed to The B-Team; the message is copied again and pushed to Punting-It; and finally Done-not-done receives another copy of the market data. All of this happens sequentially (and no, threads aren’t the answer), which violates our Fairness and Latency principles. If you’re Done-not-done, you are at a significant disadvantage to The B-Team in sending your sell order. And if you’re the 1000th subscriber to receive the market data, then you probably don’t think of the trading venue as very fair.
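The sequential fan-out described above can be sketched in a few lines of Python. This is an illustration only, not how any real matching engine is written: socket.socketpair() stands in for an established TCP connection to each participant, and the message text and the sequential_fanout helper are made up for the example.

```python
import socket


def sequential_fanout(message, subscribers):
    """Send one market-data message to each subscriber connection in turn.

    Returns the subscriber names in the order they were served.
    """
    served = []
    for name, conn in subscribers:
        # Each copy is handed to the kernel before the next one starts.
        conn.sendall(message)
        served.append(name)
    return served


# Hypothetical participants, in the order the engine happens to hold them.
names = ["The B-Team", "Punting-It", "Done-not-done"]

# One connected stream-socket pair per participant (a stand-in for TCP).
pairs = {name: socket.socketpair() for name in names}
engine_side = [(name, pairs[name][0]) for name in names]

order = sequential_fanout(b"NEW BUY ORDER", engine_side)
received = {name: pairs[name][1].recv(64) for name in names}

for engine_end, participant_end in pairs.values():
    engine_end.close()
    participant_end.close()

print(order)
```

Every participant gets the same bytes, but the order in which they are served is fixed by the engine's loop, so whoever sits last in the list always hears about the order last.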

Furthermore, this approach scales terribly and violates our Scalability principle. For one participant there is one message in and one message out. For ten participants there is one message in and ten messages out. For thousands of participants you end up reinventing the crazy distribution hacks that you find in one too many publish-subscribe systems.
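To put rough numbers on the scaling problem, here is a back-of-the-envelope model in Python. The fixed per-send cost of 2 microseconds is purely an assumption chosen for illustration, as are the function names; real costs depend on hardware, kernel, and message size.

```python
def sends_per_update(n_subscribers, one_to_many):
    """Number of send operations the engine performs per market-data update."""
    return 1 if one_to_many else n_subscribers


def unicast_delivery_offsets_us(n_subscribers, per_send_us):
    """Offset in microseconds at which each subscriber's copy leaves the engine,
    assuming copies go out one after another at a fixed cost per send."""
    return [(i + 1) * per_send_us for i in range(n_subscribers)]


PER_SEND_US = 2  # hypothetical cost of a single send, in microseconds

for n in (1, 10, 1000):
    offsets = unicast_delivery_offsets_us(n, PER_SEND_US)
    print(n, sends_per_update(n, one_to_many=False), offsets[-1])
```

Under this assumption the 1000th subscriber's copy leaves 2000 microseconds after the update, while the first subscriber's copy leaves after 2. A one-to-many transport needs a single send no matter how many subscribers are listening.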

Communication over UDP

TCP is definitely not going to work as a way for our servers to talk to each other. Ideally, there would be another protocol that lets you scale the number of subscribers with minimal impact on fairness and latency. That protocol, of course, is UDP, which allows one computer to “broadcast” a single packet to all other listening computers simultaneously.

(i) Elite Trading sends a buy order (ii) Matching engine sends market data to participants in parallel (iii) Done-not-done and The B-Team send sell orders to match

Now, in our hypothetical exchange, the different participants on the platform all receive the market data and trades at the same time. The exchange has introduced no arbitrary delays, and it is entirely in the participants’ hands how quickly they respond.

It should be no surprise then that all servers in the Direct Match trading system communicate over a redundant UDP Multicast [2] bus. Participants on the bus (which are not subscribers directly, but proxies using standard financial protocols) all receive market data at the same time and have equal priority for interacting with the matching engine. The simplicity of this setup allows us to bring new servers online with little additional configuration and few scaling issues.
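For readers who have not used multicast before, here is a minimal sketch of a publisher and receiver built on Python's standard socket module. It is not Direct Match's implementation: the group address, port, and helper names are hypothetical, and a production bus would add redundancy and the application-level reliability discussed below.

```python
import socket
import struct

GROUP = "239.1.1.1"  # hypothetical multicast group address
PORT = 5007          # hypothetical port


def membership_request(group, interface="0.0.0.0"):
    """Build the ip_mreq structure used to join a multicast group."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def make_sender(ttl=1):
    """UDP socket for publishing; a TTL of 1 keeps datagrams on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def make_receiver(group=GROUP, port=PORT):
    """UDP socket that joins the group and receives datagrams sent to it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def publish(sock, payload, group=GROUP, port=PORT):
    """One sendto() call, regardless of how many receivers have joined."""
    return sock.sendto(payload, (group, port))
```

On a network where multicast is enabled, each proxy would call make_receiver() once and then loop on recvfrom(), while the publisher calls publish(make_sender(), b"...") once per update. Adding another receiver requires no change on the sending side.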

Next Time

UDP is also a very minimal protocol, so everything TCP does for you (handshaking, packet sequencing, reordering of packets, re-sending dropped packets, and congestion control) is pushed up to the application level. This is great because it allows us to really cut down on latency in a controlled environment, but it has a lot of implications for how we design our messaging protocol. I’ll get into this next time when we start to dive into the messaging architecture.
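As a small taste of what "pushed up to the application level" means, here is one common way to detect lost or reordered datagrams: prefix every message with a sequence number and have the receiver track what it expected next. The 8-byte header layout and the GapDetector class are assumptions made for this sketch, not the messaging protocol this series will go on to describe.

```python
import struct

HEADER = struct.Struct(">Q")  # hypothetical header: 8-byte big-endian sequence number


def encode(seq, payload):
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


def decode(datagram):
    """Split a datagram into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:]


class GapDetector:
    """Tracks the next expected sequence number and any numbers skipped so far."""

    def __init__(self, first_expected=1):
        self.expected = first_expected
        self.missing = set()

    def on_datagram(self, datagram):
        """Return (seq, payload, status); status is 'ok', 'gap', 'late' or 'duplicate'."""
        seq, payload = decode(datagram)
        if seq == self.expected:
            self.expected += 1
            return seq, payload, "ok"
        if seq > self.expected:
            # Everything between expected and seq has not arrived yet.
            self.missing.update(range(self.expected, seq))
            self.expected = seq + 1
            return seq, payload, "gap"
        if seq in self.missing:
            self.missing.discard(seq)
            return seq, payload, "late"
        return seq, payload, "duplicate"


detector = GapDetector()
statuses = [detector.on_datagram(encode(s, b"tick"))[2] for s in (1, 2, 4, 3, 3)]
print(statuses)
```

Feeding in sequence numbers 1, 2, 4, 3, 3 yields ok, ok, gap, late, duplicate. What the receiver does on a gap (request a resend, wait, or fail over) is exactly the kind of design decision TCP would otherwise have made for you.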

Jim Greco

Wine collector, trading technologist, market structure enthusiast, and recovering rates trader.