Proof Engineering: The Algorithmic Trading Platform
How we built an institutional-grade algorithmic trading platform in the cloud
This is a semi-technical post about how we built an institutional-grade algorithmic trading platform from scratch in the cloud. As much as we want to, we cannot possibly include all of the details in a single post, so this is more of a high-level post, and future posts will talk about the individual topics in more technical depth.
Why Build It?
A High-Performance Trading System in the Cloud
The Equities Ecosystem
Cloud Selection (Our Choice: AWS)
Extranet (Our Choice: TNS)
Market Data (Our Choice: Exegy)
The Tech Stack
High Performance Java
Cloud Setup and InfoSec
The OMS and the Algo Engine
Infrastructure and DevOps
Why Build It?
“Finance is a bloated industry” is the first line on our website. Part of the premise of Proof is that we know this industry really well, we know where the excesses and conflicts of interest lie, and we should be able to methodically build a leaner version of an equities agency broker that employs all of the tools and tricks that it needs, but none that it doesn’t. Our intention is to start very simple and add complexity (and thereby, cost) only as it is proven to be needed through rigorous research and/or analysis (e.g. in which scenarios does depth-of-book market data improve trading performance? OK, now let’s prove that answer rigorously!). To do this, we needed to be the architects of our minimalist vision in a very direct way.
If we had licensed or partnered with an existing or off-the-shelf platform, no matter how flexible, the solutions we could dream up would have been constrained by an external system that we didn’t fully understand.
A High-Performance Trading System in the Cloud
Any industry insider knows about the technology arms race that is pervasive in the industry. Some participants spend tens, if not hundreds, of millions of dollars a year on fancy infrastructure to gain an edge of nanoseconds over other participants. This may make sense for some firms on the street, with a specific business model and trading strategy, but for a vast majority of the participants, and certainly for agency brokers like us, we do not think it makes any sense. It is wasteful at best, and harmful at worst (who do you think ultimately pays for these platforms?).
We decided to go the other way completely — not only did we reject the arms race, but we set out to demonstrate that if we’re smart about our order placement logic, and if we use existing investor-friendly tools available on the street effectively (e.g. D-Limit/D-Peg at our previous company IEX, or M-ELO at Nasdaq, to name a couple), we can achieve superior outcomes for our institutional customers with a lean platform in the cloud.
What about latency in the cloud?
This is a common question that we get, and Dan has written a whole post on it: “Does low latency matter on the sell-side?” The answer to that question, in our opinion, is “it depends on what you’re doing” and “probably not”.
This does not mean, however, that we do not care about latency within our system. We have built a true high-performance distributed system where most operations inside the system complete within tens-to-hundreds of microseconds. At the same time, yes, we do not mind the handful of milliseconds that it takes for us to communicate with the street (receive market data and send orders).
Here are some reasons why we don’t think we need co-located servers and the fastest lines:
- Most of the arms race is in reactive trading. As an agency trading platform, we can design our algorithms so that we don’t necessarily need to react to events. Trading on behalf of our clients, our goals include getting the best price without leaking too much information, but we are not looking to harvest rebates.
- If somehow the tick-to-trade time did matter, we believe that getting colo space, the 40Gbps lines, and the fastest prop feeds, would still not help. It is nearly impossible for agency brokers to compete with the fastest participants on the street, simply because of the nature of our trading activity (risk checks, for one!). These latency-sensitive scenarios tend to be winner-take-all, and being just good is not good enough. There is no second-prize award.
- Most trading strategies can be divided into the macro-strategy (time horizon of minutes or hours) and the micro-tactics (time horizon of microseconds or milliseconds). The macro part (the “algo”) is not latency-sensitive and is where the high-level trajectory of the order is computed, including order schedules, market impact estimates, etc. Using this intelligence, the algo decides when and how much of the order should be sliced, and which tactics should be used to execute those slices. Some of these tactics are indeed latency-sensitive, and for those, we intend to use existing facilities on the street that we trust (e.g. the IEX router routinely captures ~100% of displayed liquidity when sweeping the street; we know this, because we built it at IEX!).
The Equities Ecosystem
The Proof Trading System runs inside a private network in the AWS cloud, but much of the equities ecosystem is still deployed in proprietary data centers in New Jersey. We interface with various entities in the equities ecosystem, including the trading venues (Exchanges and Dark Pools), our real-time market data provider, and the Execution Management Systems (EMS) that our clients use.
Below is a schematic diagram of the Proof Trading System and the ecosystem it is embedded in. We will talk about some of these components below in this post, with other posts to follow with technical details of the rest.
Cloud Selection (Our Choice: AWS)
Back in late 2019, we did a quick bake-off between the three major cloud vendors: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
We had a few different criteria upon which we would select our cloud provider:
- Performance: not just raw performance, but consistency; we wanted our VMs to feel as close to bare-metal as possible. Our ranking: AWS, then GCP, then Azure.
- Usability: administrative console and command-line interface (scriptability). Our ranking: GCP, then AWS, then Azure (which was just… No. Azure lost big on this point.).
- Pricing: they’re basically all similar. Compute resources are a bit cheaper on AWS, but network (egress) is more expensive. This ended up not being a deciding factor in the end.
For performance, the primary benchmark consisted of reading a disk-based data store containing 100 million records (88 bytes each) and streaming the records to a remote client using TCP or UDP. We tried to keep the machine types and other parameters as consistent as possible (e.g. 16 vCPU, local disks, non-dedicated machines). Below are some of the results:
AWS
- Replication across servers in the same availability zone with attached disks: 6.3M msgs/sec (did not matter if machines were in Cluster or Partition Placement Group)
- Replication across servers in different availability zones with attached disks: 3.3M msgs/sec
- EBS vs attached storage: EBS storage could keep up in terms of throughput with attached NVMe storage, but NVMe was more consistent and lower latency
- UDP (unicast) worked out of the box (slightly slower than TCP)
- Ping time between boxes: ~50μs for same availability zone, and ~400μs across availability zones. Using an Echo benchmark, where two applications send messages back and forth, we were able to replicate this ping time as the RTL (round trip latency) for messages between two servers over TCP as well as UDP.
- We could easily pin threads or interrupts to specific cores (including HT peer cores), with predictable behaviors as we would expect from a physical machine.
GCP
- Replication benchmark performance using TCP: 6.3M msgs/sec
- UDP didn’t work out of the box
- Noticed throttling on the boxes: rates would degrade systematically over the duration of the test
- Ping time between boxes: ~160μs
- When pinning threads to specific cores, the benchmark did not complete
Azure
- Benchmark performance using TCP: started at 6.5M msgs/sec but before halfway, fell to 3M msgs/sec
- UDP worked out of the box, with roughly the same performance as TCP
- Ping time between boxes: inconsistent, 600μs to 2ms
- After running the first test, the machine became unstable and had to be restarted
- During the 2nd run, we had a 135s pause in the middle of the test (unacceptable!)
- Of all 3, Azure was the worst experience when trying to get help from tech support. It was clear that they are set up for enterprise customers, not start-ups.
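For a rough sense of scale, the 6.3M msgs/sec figure can be converted to bandwidth. This is back-of-the-envelope arithmetic only: it assumes each message carries exactly one 88-byte record and ignores protocol framing and TCP/IP header overhead.

```latex
\begin{align*}
\text{payload rate} &= 6.3\times10^{6}\ \tfrac{\text{msgs}}{\text{s}} \times 88\ \tfrac{\text{bytes}}{\text{msg}}
  \approx 5.5\times10^{8}\ \tfrac{\text{bytes}}{\text{s}} \approx 4.4\ \text{Gbit/s} \\
\text{time to stream } 10^{8} \text{ records} &= \frac{10^{8}\ \text{msgs}}{6.3\times10^{6}\ \text{msgs/s}} \approx 15.9\ \text{s}
\end{align*}
```

Under the same assumptions, the 3.3M msgs/sec measured across availability zones works out to roughly 2.3 Gbit/s and about 30 seconds for the full data set.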
Extranet (Our Choice: TNS)
As alluded to above, traditional finance companies have yet to adopt the cloud for their main business operations. If we were a regular old broker with a rented cage full of server racks and network equipment in an Equinix data center in Secaucus, and we wanted to connect to every equity trading venue in the US and tens of clients, we would be looking at managing hundreds of individual cross-connects coming into our system. And with that come the operational headaches (and costs) of managing BGP sessions with counterparties and owning public IPs and a host of other issues.
Once we selected the AWS cloud, we set out to find an Extranet partner who could abstract out all of these connectivity requirements for us. We didn’t quite know what the final solution would look like, and after talking to at least five different telecom/extranet providers who are the leaders in Equities connectivity, we realized that no one did; this just hadn’t been done before.
The most commonly proposed solution was for us to rent half a rack in a data center with a couple of routers, and interface them with one of the extranet providers for most of our connectivity. Another proposal was to use the Equinix Cloud Exchange (now Equinix Fabric?) or a company like Megaport in some form. We could not get comfortable with these solutions either for being too much of a hassle or being too expensive. Ultimately, we found a partner in TNS who was willing to go the extra mile and design an elegant and cost-effective solution for us.
TNS engineers worked with us to create a pair of redundant/diverse cross-connects between TNS and AWS dedicated to Proof, terminating on one of the multi-tenant switches on the TNS network. Once the requisite AWS Direct Connect configuration was complete (which deserves its own blog post!), we had a /27 block of (private) IPs in our AWS environment that could talk to the TNS network, and in turn, almost any other entity on the street. We were able to quickly set up connectivity to Exegy (our market data provider), Credit Suisse (our DMA provider), and IEX.
Market Data (Our Choice: Exegy)
Our system is almost entirely built from scratch, but we did want to use a vendor for the market data feed handlers. We’ve seen how nuanced this work can be, and it requires extreme care and a lot of maintenance/testing to be performant and accurate.
We had two main requirements for our market data provider:
- Legit low latency provider: They have to be a professional market data vendor that processes market data in single-digit microseconds with extreme reliability, and that is used on the street for order routing. This is in contrast to a ton of web-based real-time vendors in existence today that stream market data using WebSockets/SSE or even using cloud-hosted Kafka instances. From what we understand, the latency on these ranges from tens of milliseconds to multiple seconds under load, which is not suitable for trading.
- Hosted solution: We needed this to be a hosted solution because we couldn’t install any market-data appliances, or specialized network cards, or even run any ticker-plant feed handler software. In fact, given that all of the common market data feeds are UDP-multicast based, we couldn’t extend those feeds to the cloud at all.
We are aware that there are at least a few solutions that meet these needs, but without doing much of a bake-off, we selected Exegy because of our prior experience with them. We know them to be fast and reliable, and they haven’t disappointed.
We have measured the latency for SIP quotes to be in the 3.3–4ms range consistently, which according to online sources, is perfectly reasonable, and probably hard to beat. Once a quote is received over the wire into our AWS environment, the NBBO can be computed and disseminated to our algo trading engines in ~50μs in most cases.
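To make the NBBO step concrete, here is a minimal sketch of deriving a national best bid and offer from per-venue top-of-book quotes. It is an illustration, not the production implementation: the class names (VenueQuote, NbboCalculator), the use of integer cents for prices, and the choice to aggregate size across venues tied at the best price are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical top-of-book quote from one venue; prices are in integer cents. */
final class VenueQuote {
    final long bidPrice, bidSize, askPrice, askSize;

    VenueQuote(long bidPrice, long bidSize, long askPrice, long askSize) {
        this.bidPrice = bidPrice;
        this.bidSize = bidSize;
        this.askPrice = askPrice;
        this.askSize = askSize;
    }
}

/** Keeps the latest quote per venue and recomputes the best bid/offer on each update. */
final class NbboCalculator {
    // A standard HashMap is used for brevity; the result does not depend on iteration order.
    private final Map<String, VenueQuote> latestByVenue = new HashMap<>();

    long bestBid, bestBidSize, bestAsk, bestAskSize;

    void onQuote(String venue, VenueQuote quote) {
        latestByVenue.put(venue, quote);
        recompute();
    }

    private void recompute() {
        bestBid = 0;
        bestBidSize = 0;
        bestAsk = Long.MAX_VALUE; // stays at MAX_VALUE while no venue has an offer
        bestAskSize = 0;
        for (VenueQuote q : latestByVenue.values()) {
            if (q.bidSize > 0) {
                if (q.bidPrice > bestBid) {          // strictly better bid replaces the best
                    bestBid = q.bidPrice;
                    bestBidSize = q.bidSize;
                } else if (q.bidPrice == bestBid) {  // tie: aggregate size at the best price
                    bestBidSize += q.bidSize;
                }
            }
            if (q.askSize > 0) {
                if (q.askPrice < bestAsk) {          // strictly better offer replaces the best
                    bestAsk = q.askPrice;
                    bestAskSize = q.askSize;
                } else if (q.askPrice == bestAsk) {
                    bestAskSize += q.askSize;
                }
            }
        }
    }
}
```

A full rescan per quote is fine for a sketch with a handful of venues; a latency-sensitive implementation would more likely update the best prices incrementally and avoid boxed collections.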
We’re using the “sequenced stream” architecture, which many folks in finance are intimately familiar with. For those who are not, this is a very popular architectural pattern for distributed systems that has its origins going back to Lamport scalar clocks (1978). In the more modern version, it is combined with the Event Sourcing pattern to create a distributed system where all nodes can have a perfectly synchronized state by virtue of having processed identical inputs in identical order.
Here’s a quick description of how this works. Every input into the system is assigned a globally unique monotonic sequence number and timestamp by a central component known as a sequencer. This sequenced stream of events is disseminated to the nodes/applications in the system, which only operate on these sequenced inputs, and never on any other external inputs that have not been sequenced. Any outputs from the applications must also first be sequenced before they can be consumed by other applications or the external world. Since all nodes in the distributed system are presented with the exact same sequence of events, it is relatively straightforward for them to arrive at the same logical state after each event, without incurring any overhead or issues related to inter-node communication.
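The flow described above can be sketched in a few lines. This is a toy, single-process illustration under stated assumptions, not the real message bus: the names (SequencedEvent, Sequencer, PositionApp), the text payload format ("BUY 100"), and the in-memory list standing in for the durable stream are all hypothetical. The timestamp is passed in by the caller so that the example itself stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** An input that has been stamped by the sequencer. */
final class SequencedEvent {
    final long seq;
    final long timestampNanos;
    final String payload;

    SequencedEvent(long seq, long timestampNanos, String payload) {
        this.seq = seq;
        this.timestampNanos = timestampNanos;
        this.payload = payload;
    }
}

/** Central component: assigns a gapless, monotonically increasing sequence number. */
final class Sequencer {
    private long nextSeq = 1;
    private final List<SequencedEvent> stream = new ArrayList<>(); // stands in for the persisted stream

    SequencedEvent sequence(String payload, long timestampNanos) {
        SequencedEvent event = new SequencedEvent(nextSeq++, timestampNanos, payload);
        stream.add(event);
        return event;
    }

    List<SequencedEvent> stream() {
        return stream;
    }
}

/** A deterministic application: its state depends only on the sequenced events it has seen. */
final class PositionApp {
    private long expectedSeq = 1;
    long position;
    long lastEventTimeNanos; // "now" comes from the stream, never from the local clock

    void onEvent(SequencedEvent event) {
        if (event.seq != expectedSeq) {
            throw new IllegalStateException("gap or reorder: expected " + expectedSeq + " got " + event.seq);
        }
        expectedSeq++;
        lastEventTimeNanos = event.timestampNanos;
        String[] parts = event.payload.split(" ");
        long quantity = Long.parseLong(parts[1]);
        if (parts[0].equals("BUY")) {
            position += quantity;
        } else if (parts[0].equals("SELL")) {
            position -= quantity;
        }
    }
}
```

Feeding the same stream to any number of PositionApp instances, now or in a later replay, leaves each of them with the same position and the same notion of time, which is the property the rest of this section relies on.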
At first glance, the use of a central sequencer component may seem strangely limiting, until you realize that middleware with a central broker has been popular for decades now. You may wonder if this is the same as any topic-based message bus out there (or even Kafka). The primary difference is that topic- or channel-based middleware does not maintain the relative ordering of messages across topics or channels. Think of the sequencer as an extremely fast single-topic broker with persistence and exactly-once delivery semantics. We would love to compare this design to other solutions in another post dedicated to this topic, but for now, we’ll just mention that this architecture can be extremely scalable, resilient, and performant (think millions of messages per second with a low-single-digit microsecond latency through the sequencer, even on a cloud VM). Kafka does get within earshot of being able to support this design, but falls short in raw performance.
Here are some of the benefits of this architecture:
- Perfect synchronization: Every node in the distributed system receives an identical stream of inputs in identical order. If the nodes of the distributed system are written to be completely deterministic and not rely on any external inputs (not even local time!), it is possible to achieve perfect synchronization (consensus) among an unlimited number of nodes. This benefit alone is worth all of the troubles; in contrast to other traditional designs, the different applications in the system are never out of sync and never need reconciliation.
- Perfect observability: If the sequenced stream is persisted to a durable medium reliably, which it is in our case, we can achieve perfect observability. Imagine a situation where you observe some unexpected behavior in the system. The usual mode of debugging may be to check the logs and the database and conduct a forensic exercise to piece together the cause of the behavior. In our system, we can just retrieve the sequenced stream file and replay it through the same code that is deployed in production, but inside a debugger session. We can see the exact flow of logic and even the individual variable values, exactly as they were in production. We don’t have to spend hours attempting to reproduce a race condition; we have perfect reproducibility each time.
This enhanced observability also extends to aspects such as performance monitoring. The apps as well as the sequencer add enough telemetry to the sequenced messages to be able to precisely locate bottlenecks and queueing in the system.
- Perfect auditability: With perfect observability comes perfect auditability. We never have to guess or piece together what the state of the market was at the time an order was sent, or how that state came to be. We have definitive answers to such questions.
- Perfectly streamlined processing: Since all of the system inputs and outputs are recorded on the sequenced stream, it is trivial to delegate housekeeping tasks like logging and database insertions to separate non-critical apps. Critical path processing of orders and market data is streamlined to the point that an individual application can process events within microseconds. And all this while maintaining perfect observability, as outlined above.
Read our detailed description of the home-grown message bus that powers this architecture here: https://medium.com/p/proof-engineering-the-message-bus-a7cc84e1104b
The Tech Stack
- Operating System: Amazon Linux 2
- Programming languages: Java (SE of course, nobody in their right mind still uses EE), Python, TypeScript
- DevOps: Jira, Bitbucket, Confluence/Notion
- Databases: SingleStore (fka MemSQL) for trading system, RedisDB for UX, OneTick for historical research
- Clustering: None for the trading system, AWS ECS for UX
- Operational Tools: AWS CloudFormation, Ansible, Jenkins, DataDog
- UX Technologies: Node.js, React/Redux, AG Grid
High Performance Java
The core trading system is primarily written using Java. Given the performance needs of trading systems, there is (used to be?) a constant discussion of C++ vs Java in finance. At this point though, the argument has mostly been settled in favor of Java — we’ll just say that Java can get very close to C++ in performance once the JIT compiler kicks in, and it is like a hundred times easier/safer to write code in Java. I think a more intriguing language for some of the system-level pieces is Rust, but we simply aren’t proficient enough in Rust to use it as our primary language.
Having said that, we don’t really use Java, at least not as it was intended. We don’t use many of the signature Java facilities, such as the memory management, or the foundation classes (e.g. collections). Here are some of the things that we do differently:
- All code is single-threaded. Java has perfectly usable concurrency primitives, but keeping things single-threaded is not only safer (no synchronization bugs), but also much more performant (a simple thread-context-switch resulting from a call to Object.wait() can cost as much as 20ms in Linux). If we must use multiple threads for some reason, we use a Disruptor-based ring buffer to pass messages across threads.
- Avoid garbage collection. Java is infamous for unpredictable stop-the-world garbage collections. The best way to avoid GC is to not create garbage in the first place. This topic could fill a book, but the primary ways to do that are: (a) Do not create new objects in the critical path of processing. Create all the objects you’ll need upfront and cache them in object pools. (b) Do not use Java strings. Java strings are immutable objects that are a common source of garbage. We use pooled custom strings that are based on java.lang.StringBuilder. (c) Do not use standard Java collections. More on this below. (d) Be careful about boxing/unboxing of primitive types, which can happen when using standard collections or during logging. (e) Consider using off-heap memory buffers where appropriate (we use some of the utilities available in chronicle-core).
- Avoid standard Java collections. Most standard Java collections use a companion Node object that is created and destroyed as items are added/removed. Also, every iteration through these collections creates a new Iterator object, which contributes to garbage. Lastly, when used with primitive data types (e.g. a map from a primitive key to an Object), garbage will be produced with almost every operation due to boxing/unboxing. When possible, we use collections from agrona and fastutil (and rarely, guava).
- Write deterministic code. We’ve alluded to determinism above, but it deserves elaboration, as this is key to making the system work. By deterministic code, we mean that the code should produce the exact same output each time it is presented with a given sequenced stream, down to even the timestamps. This is easier said than done, because it means that the code may not use constructs such as external threads, or timers, or even the local system clock. The very passage of time must be derived from timestamps seen on the sequenced stream. And it gets weirder from there: did you know that the iteration order of some collections (e.g. java.util.HashMap) can be non-deterministic because it relies on the hashCode of the entry keys, which for classes that do not override hashCode can differ from one run to the next?!
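As an illustration of the "create everything upfront" idea from the garbage-collection item above, here is a minimal object pool sketch. It is not the production code: OrderEvent and OrderEventPool are hypothetical names, the fixed capacity and fail-fast behavior on exhaustion are assumptions, and the pool is deliberately not thread-safe because it is meant to be used from a single thread.

```java
/** Hypothetical mutable message that is reused instead of being reallocated. */
final class OrderEvent {
    long orderId;
    long quantity;
    // A StringBuilder stands in for a pooled, mutable string; setLength(0) keeps its capacity.
    final StringBuilder symbol = new StringBuilder(16);

    void clear() {
        orderId = 0;
        quantity = 0;
        symbol.setLength(0);
    }
}

/** Fixed-size, single-threaded pool: all objects are allocated once, at startup. */
final class OrderEventPool {
    private final OrderEvent[] free;
    private int available;

    OrderEventPool(int capacity) {
        free = new OrderEvent[capacity];
        for (int i = 0; i < capacity; i++) {
            free[i] = new OrderEvent();
        }
        available = capacity;
    }

    /** Hands out a pre-allocated object; no allocation happens on this path. */
    OrderEvent acquire() {
        if (available == 0) {
            throw new IllegalStateException("pool exhausted; size it for the worst case");
        }
        return free[--available];
    }

    /** Resets the object and returns it to the pool for reuse. */
    void release(OrderEvent event) {
        if (available == free.length) {
            throw new IllegalStateException("released more objects than were acquired");
        }
        event.clear();
        free[available++] = event;
    }
}
```

The trade-off is that the caller takes on manual lifecycle management: forgetting to release an object drains the pool, and using one after release reads data that belongs to someone else.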
Cloud Setup and InfoSec
We’ll quickly touch upon how our system is laid out in the cloud, as there is some trepidation about security in the cloud. Having seen how data center networks are designed, and having first-hand experience with Reg SCI, we can confidently say that cloud-based systems can be as secure as their on-prem counterparts. (Or perhaps the flip side is more true: on-prem systems can have the same vulnerabilities as cloud-based systems, or worse. The devil is in the details.)
Here are some of the details of how our system is laid out in production:
- Our entire system is deployed in private subnets, which means all hosts in the system use private IP space. There are no public IP addresses exposed to the internet.
- The system is accessed, whether for administrative purposes or for UX purposes, over a VPN connection. We have a completely offline machine with our Root CA, that is used to sign the VPN client certificates.
- Our system follows the NIST reference architecture, which prescribes having a management network separate from our production network. We use 3 separate VPCs (Virtual Private Clouds) in AWS — one for our trading system, one for our web/UX system, one for the management network.
- Access to the trading system or the web system servers is through a pair of jump hosts (aka control hosts, or management hosts), access to which is tightly controlled, and the list of authorized keys is refreshed every 12 hours to remove any inadvertent grants of access.
- Access to every individual server is protected using AWS Security Groups, NACLs, and a distinct set of authorized SSH keys, also refreshed every 12 hours.
- We use the principle of least privilege, not just for users, but even for the apps. Even within our private network, we limit the communication between servers to specific TCP/UDP ports (e.g. we disallow SSH between hosts).
The OMS and the Algo Engine
So far, we’ve talked about things from an architecture/system level. Let’s talk about the OMS and Algo Engine from an application perspective.
We decided early on that fundamentally, the OMS and Algo Engine are the same component, except that the actions that they take on the orders are different. Both the OMS and the Algo Engine require order state management functions. Aside from that, an OMS may take simpler actions on its orders such as forwarding them to an algo engine and relaying any fills back, while an Algo Engine may execute elaborate trading strategies. However, qualitatively, they’re doing the same kind of work, and so we built these components to use the same code.
We think of the Algo Engine as a container for trading strategies, whether they are “OMS Strategies” or they are algorithms like VWAP/TWAP/IS. The purpose of the algo container is to provide facilities or services to the strategy for doing what it needs to do.
Here is the basic description of these services:
- OMS: All algos need extensive OMS features such as accepting and validating orders, amendments, and cancel requests, as well as sending out child orders, amendments, and cancel requests. In addition, when fills are received from the venue, they need to be relayed back upstream.
- Market Data: This algo container allows the strategy to subscribe to and receive market data from the rest of the system. In our system, a strategy can subscribe to market data for any reference security, not just the security of the order.
- Risk Checks / Validation Engine: The algo container ensures compliance with certain invariants at all times. For example, at no time may the open child orders for a strategy add up to more quantity than the parent order quantity. Similarly, the algo container will ensure that no child orders or amendment requests violate the parent limit.
- Static Data / Algo Ref Data: The algo container will facilitate loading and access to static data such as security master, venue configurations, destination preferences, symbol statistics, volume curves, and any trading models. The system is set up to allow the strategy to load any delimited file as ref data, without any code changes to the core system.
- Timer Service: This doesn’t sound like anything worth writing about, but we mention it because it is an important algo “trigger”. In our system, a strategy can request an unlimited number of wakeup calls, each accompanied by a token payload that serves as a reminder for why the wakeup was requested. These wakeups/timers fire in accordance with time as observed on the sequenced stream (in a deterministic fashion).
- Child Order Placement: The algo container provides facilities to help the strategy with venue and destination selection (which are two separate concerns). The container keeps track of which venues are accessible via which destinations and whether a venue or destination is down, and it can even round-robin across connections when sending to a particular venue.
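The Timer Service item is worth a small illustration, because it shows how time can be driven by the sequenced stream rather than the wall clock. The sketch below is an assumption-laden toy, not the container's actual API: StreamTimerService, schedule, and onStreamTime are hypothetical names, tokens are plain strings, and wakeups that share a fire time are delivered in the order they were requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Timers that fire based on timestamps observed on the stream, never the local clock. */
final class StreamTimerService {
    private static final class Wakeup {
        final long fireAtNanos;
        final long requestOrder; // tie-breaker so equal fire times are deterministic
        final String token;

        Wakeup(long fireAtNanos, long requestOrder, String token) {
            this.fireAtNanos = fireAtNanos;
            this.requestOrder = requestOrder;
            this.token = token;
        }
    }

    private final PriorityQueue<Wakeup> pending = new PriorityQueue<>((a, b) ->
            a.fireAtNanos != b.fireAtNanos
                    ? Long.compare(a.fireAtNanos, b.fireAtNanos)
                    : Long.compare(a.requestOrder, b.requestOrder));
    private long nextRequestOrder;
    private long streamTimeNanos;

    /** A strategy asks to be woken at a given stream time; the token says why. */
    void schedule(long fireAtNanos, String token) {
        pending.add(new Wakeup(fireAtNanos, nextRequestOrder++, token));
    }

    /** Called with the timestamp of each sequenced event; returns the tokens that are now due. */
    List<String> onStreamTime(long timestampNanos) {
        streamTimeNanos = Math.max(streamTimeNanos, timestampNanos);
        List<String> due = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().fireAtNanos <= streamTimeNanos) {
            due.add(pending.poll().token);
        }
        return due;
    }
}
```

Because wakeups only fire when an event with a late-enough timestamp arrives, replaying the same stream produces the same wakeups at the same points. This sketch allocates freely for readability; a garbage-free version would pool the Wakeup objects and reuse the result list.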
Financial Information eXchange or FIX protocol is the native language of most equities trading systems. Even when systems like ours use an internal format (binary or otherwise), they are typically based on and incorporate aspects of FIX.
FIX is a text-based delimited key-value-pair request-response protocol. It is not quite human-readable, unless you’ve had the misfortune of having to spend hundreds of hours staring at FIX logs (has that ever happened to you? it happened to me). FIX is a multi-layer protocol — not only does it define the encoding format, it also defines a session-layer communication protocol, and an application-layer for working with orders. Over the last two and a half decades, numerous versions have been released, but FIX 4.2 (released in 2001) is the most popular one in equities.
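For readers who have not stared at FIX logs, here is a minimal sketch of the encoding layer: splitting a message into tag=value fields on the SOH (0x01) delimiter and computing the trailing CheckSum (tag 10), which is the sum of all preceding bytes modulo 256, written as three digits. FixSketch is a hypothetical helper written for this post, not part of QuickFIX/J, and it ignores repeating groups (a map keeps only the last value per tag), session handling, and validation.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class FixSketch {
    /** FIX fields are separated by the ASCII SOH character (0x01). */
    static final char SOH = '\u0001';

    /** Splits "tag=value<SOH>tag=value<SOH>..." into an insertion-ordered map. */
    static Map<Integer, String> parse(String message) {
        Map<Integer, String> fields = new LinkedHashMap<>();
        int start = 0;
        while (start < message.length()) {
            int end = message.indexOf(SOH, start);
            if (end < 0) {
                end = message.length(); // tolerate a missing final delimiter
            }
            int eq = message.indexOf('=', start);
            if (eq < 0 || eq > end) {
                throw new IllegalArgumentException("field without '=' at offset " + start);
            }
            int tag = Integer.parseInt(message.substring(start, eq));
            fields.put(tag, message.substring(eq + 1, end));
            start = end + 1;
        }
        return fields;
    }

    /**
     * Computes the value of tag 10 for everything that precedes the "10=" field,
     * including the SOH that ends the previous field.
     */
    static String checksum(String messageBeforeChecksumField) {
        int sum = 0;
        for (byte b : messageBeforeChecksumField.getBytes(StandardCharsets.US_ASCII)) {
            sum += b & 0xFF;
        }
        return String.format("%03d", sum % 256);
    }
}
```

For example, parsing an abbreviated order such as 8=FIX.4.2, 9=34, 35=D, 55=BRK, 65=B, 54=1, 38=100, 40=1 (with SOH between fields) yields 35 → "D" for a new order, and appending "10=" plus the computed checksum and a final SOH completes the frame.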
Our FIX gateways are based on a popular Java library called QuickFIX/J. It is not the fastest FIX library out there, and there are other paid and free options, but it is likely the most complete of them all. For us, the fact that it is free, open-source, easy to use, and used by thousands of shops in production, was enough to accept the less-than-stellar performance.
Having said that, we did customize QuickFIX/J in one aspect — we changed how it generates session-level messages. We’ll detail this in another post, but our changes enable us to integrate QuickFIX/J with the sequenced stream architecture in such a way that we no longer rely on disk logs for recovery (which is how most FIX sessions recover). What this means in practice is that we can start/restart a client or venue gateway on any server in our infrastructure, and as long as the networking requirements are met, we can be sure that the FIX session will recover whether we have access to the session FIX logs or not. We can even start the session on multiple gateways and they’ll all stay in sync as long as they can read the sequenced stream (a hot-warm setup).
There is much future work left to be done in customizing the library still — we want to remove all of the extra threads that QuickFIX/J spawns, we want to reduce all of the garbage generation, and we particularly want to reduce/eliminate the extensive use of Java strings when working with FIX messages.
We service institutional clients, which means we interface with EMSs used by them. Depending on the EMS, we connect to them either over the internet (with TLS encryption) or via a FIX network.
Our FIX spec is available in either the PDF format or the ATDL format (Algorithmic Trading Definition Language). For vendors that have been able to integrate the ATDL, the integration appears to be a much simpler process. We have a modified atdl4j repo that we have integrated into our testing tools (Banzai from QuickFIX/J), so that we know exactly what the order ticket will look like once set up.
The biggest challenge when integrating with a vendor is working out the symbology they can support in tag 55/65. For example, Berkshire B shares may be sent in tag 55 as “BRK B”, “BRK.B”, “BRK/B”, or even 55=BRK and 65=B. We’ll talk about security master in more detail in another post.
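As a small illustration of the symbology problem, the sketch below maps the four Berkshire B variants mentioned above to a single canonical form. It is a toy under stated assumptions: SymbolNormalizer is a hypothetical helper, the canonical "ROOT.SUFFIX" form is an arbitrary choice for the example, and real symbology (preferred shares, warrants, when-issued securities, vendor-specific conventions) needs a security master rather than string rules.

```java
import java.util.Locale;

final class SymbolNormalizer {
    /**
     * Combines Symbol (tag 55) and an optional SymbolSfx (tag 65) into "ROOT" or "ROOT.SUFFIX".
     * A space, dot, or slash inside tag 55 is treated as the root/suffix separator.
     */
    static String normalize(String tag55, String tag65) {
        String symbol = tag55.trim().toUpperCase(Locale.ROOT);
        String suffix = (tag65 == null) ? "" : tag65.trim().toUpperCase(Locale.ROOT);
        String root = symbol;
        for (int i = 0; i < symbol.length(); i++) {
            char c = symbol.charAt(i);
            if (c == ' ' || c == '.' || c == '/') {
                root = symbol.substring(0, i);
                if (suffix.isEmpty()) {
                    suffix = symbol.substring(i + 1).trim(); // suffix embedded in tag 55
                }
                break;
            }
        }
        return suffix.isEmpty() ? root : root + "." + suffix;
    }
}
```

With this, "BRK B", "BRK.B", "BRK/B", and the pair 55=BRK with 65=B all normalize to the same key, which can then be resolved against the security master.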
We trade US Equities, and while there are nearly 50 equity trading venues in the US, we currently only trade on the 16 lit exchanges, with plans to access dark venues shortly (Update 7/15/21 — we access dark pools now). Even within lit exchanges, we are only a member of one exchange — IEX, and access the other exchanges through Credit Suisse DMA (Direct Market Access) or the IEX Smart Order Router. The FIX integration with both of these destinations has largely been smooth.
We will detail our UX strategy in another post, so this will be a high-level description. We have built a scalable multi-blotter streaming UX system to support our trading system. We have not exposed this to the clients, though we certainly expect to have a client portal in some form in the future.
Here’s the general flow of data: a Java app called UXState takes information from the stream, resolves all of the foreign keys (e.g. securityId → symbol, clientId → clientName), and puts an interpreted version of the data in well-known hashes and lists in Redis. These records and any updates are also published via Redis pub-sub. A Node.js cluster subscribes to Redis and consumes all of these data records to build a local state. A React-based single-page app (SPA), which sports a configurable workspace with multiple streaming views, authenticates users using Auth0, and then requests data from the Node.js server. An initial snapshot, followed by updates, is streamed back to the browser using SSE (server-sent events).
We use a home-grown layout engine at the moment, and we do not have a desktop or mobile app, but we’ll be looking into OpenFin shortly.
Here are some of the salient features of our home-grown workspace engine:
- Ability to dock views using drag and drop, with the ability to maximize views
- Ability to open the same view multiple times (e.g. open Alerts window and filter by “ERROR”, then open another Alerts window and filter by “Order Rejected”; now you have two custom Alert views)
- Ability to save derived views (e.g. filter Orders view by client=XYZ, then save the filtered view; now you have a dedicated Orders view for a specific client)
- Support for multiple types of views side-by-side (e.g. order blotter and a volume curve chart)
- Ability to detect and notify the user when data in the views may be stale
- Ability to export/import the workspace layout, including properties of individual views such as column positions / widths and sort orders
- Chrome Push Notifications
It is fair to say that the bulk of our UX build effort went into building the perfect blotter. We considered building our own grid from scratch, but after extensive comparisons with available options (future blog post!), we decided AG Grid fits our needs. AG Grid is not free, but it is affordable even for a startup like us, and it is endlessly customizable.
We had a ton of requirements for our blotters:
- Fast updates, multiple times a second. There should be no lag when navigating around the blotter as a thousand rows are being updated each second.
- Virtual rows/columns — this is a performance optimization useful for blotters with a large number of records. If the grid has a million records, but the viewport will only allow 10 rows to be visible, it is beneficial for the grid to only render the rows/columns that are in the current view. This is tricky to maintain, of course, as the user scrolls or filters or navigates to rows outside the view. AG Grid does a great job of making this seamless.
- Ability to select, sort, filter, and group on columns
- Ability to quickly search through the entire grid (not at a column level, just everywhere). We’ve supercharged this feature to allow Regex searches, as well as expression searches (e.g. symbol=’IBM’ && client=’XYZ’)
- Ability to see summary and counts in a status bar
- Ability to use custom renderers or specify CSS class definitions per row/cell based on conditional logic (e.g. if an Order is fully filled, turn it green)
- Ability to select rows or cells and copy selection
- Ability to add custom actions to the right-click menu (e.g. Orders view has a right-click option for Cancel Order)
- Ability to save/load current configuration of the blotter. This ties in with the workspace layout export/import feature mentioned above.
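The two search modes in the list above can be sketched as follows. This is a simplified illustration rather than our implementation: the function names and row layout are hypothetical, the expression grammar is limited to equality terms joined by `&&`, and values are written with straight single quotes.

```python
import re


def parse_expression(text):
    """Parse "field='value' && field='value'" into a list of (field, value) pairs."""
    terms = []
    for part in text.split("&&"):
        match = re.fullmatch(r"\s*(\w+)\s*=\s*'([^']*)'\s*", part)
        if match is None:
            raise ValueError(f"cannot parse term: {part!r}")
        terms.append((match.group(1), match.group(2)))
    return terms


def expression_filter(rows, text):
    """Keep rows where every parsed term equals the row's field value."""
    terms = parse_expression(text)
    return [r for r in rows if all(str(r.get(f)) == v for f, v in terms)]


def regex_filter(rows, pattern):
    """Keep rows where any cell, rendered as text, matches the pattern."""
    compiled = re.compile(pattern)
    return [r for r in rows if any(compiled.search(str(v)) for v in r.values())]


rows = [
    {"symbol": "IBM", "client": "XYZ", "status": "Filled"},
    {"symbol": "IBM", "client": "ABC", "status": "Open"},
    {"symbol": "MSFT", "client": "XYZ", "status": "Order Rejected"},
]
matches = expression_filter(rows, "symbol='IBM' && client='XYZ'")
rejected = regex_filter(rows, r"Rej\w+")
```

Here `matches` holds only the first row and `rejected` holds only the third. A fuller version would add operators such as `||`, `!=`, and numeric comparisons.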
We did run into some performance issues with sorting in the grid. The issue was that the grid tried to sort the entire set of records on each individual record update. This is untenable in a situation where the grid has a million records but only 10 of them are receiving updates. But even here, the extensibility of AG Grid came in handy — it supports an externalized data model called the Viewport Row Model. This model pretty much hands over the steering wheel to us and turns the grid into a purely presentational component. This allowed us to implement more efficient incremental sort algorithms when handling updates.
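To illustrate the idea of an incremental sort, here is a minimal sketch that repositions only the updated row instead of re-sorting everything. It is one possible approach under stated assumptions, not the algorithm we shipped: the class and method names are hypothetical, rows are ordered by a single key, and the wiring to AG Grid's Viewport Row Model is omitted.

```python
import bisect


class SortedRows:
    """Keeps row ids ordered by a sort key, repositioning one row per update."""

    def __init__(self):
        self.keys = {}   # row id -> current sort key
        self.order = []  # sorted list of (sort key, row id)

    def upsert(self, row_id, key):
        """Insert a row, or move an existing row to its new position."""
        if row_id in self.keys:
            # Binary search finds the old entry; only this row is moved.
            i = bisect.bisect_left(self.order, (self.keys[row_id], row_id))
            del self.order[i]
        self.keys[row_id] = key
        bisect.insort(self.order, (key, row_id))

    def window(self, first, count):
        """Return the row ids visible in a viewport starting at index `first`."""
        return [row_id for _, row_id in self.order[first:first + count]]
```

Each update costs a binary search plus one list removal and insertion, compared with a full sort of every record. The list operations still shift elements, so a tree or skip list would scale better for very large grids.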
Infrastructure and DevOps
DevOps always means different things to different teams, and this could be its own post, so I’ll just describe the high-level parts:
- We use Bitbucket / Git to store all of our source code
- Bitbucket Pipelines is configured as our CI (continuous integration) tool. For each commit, it builds all projects, runs hundreds of unit and integration tests, and passes or fails the build.
- Official builds are created from check-ins to the main branch, and stored in a Maven repo in an S3 bucket. All official builds are tagged in the Git repo.
- Once an official build is available, and a Change Management ticket has been approved by relevant stakeholders, the release is deployed first to the jump host, and from there, to the entire cluster (using Ansible). So far, our policy is that all parts of the system must run the same build on any given trading day. The workflow for UX builds is a bit different, since we’re deploying the build to AWS ECS in that case.
- Ansible is used for release and configuration management, as well as administrative tasks (e.g. cleaning up old logs) both at the host level and the application level.
- Jenkins is used as a scheduler to orchestrate all production jobs such as deployments, system start/stop, post-trade, archive, and even patch management (yes, yes, Jenkins is not necessarily the best tool for this; we could be using other tools better suited for ops jobs, but we’re developers at heart, so Jenkins it is for us!)
- We use AWS CloudFormation for provisioning nearly all of our AWS resources. It works reasonably well, at least among the available options, though we certainly have our niggles with it (nested CFN stacks are impossible to manage; drift detection is broken; launching EC2 instances doesn’t work well over time; it sometimes wants to delete and recreate resources for the simplest of updates).
- Monitoring: We use DataDog for infrastructure as well as application monitoring. We push all of our application logs to DataDog using their agent, and are able to set up monitors for specific keywords (e.g. ERROR or “Order Received”). We also monitor CPU, memory, and disk space on the servers, and process monitoring alerts us if a critical process is dead. DataDog can be a bit expensive and their contract process was a bit weird, but otherwise, we can wholeheartedly recommend it over the hosted ELK providers we looked at.
- Automation: We have automated nearly all of the operations tasks. Here’s a typical day in the automated life of our trading system: In the morning, a reference data set is produced and deployed to the relevant servers. The system starts up, connects to clients and venues, and begins accepting orders. It trades throughout the day with no human intervention (unless a supervisor deems it necessary), and at the end of the day, all open orders are canceled back to the clients. The system shuts down, performs all post-trade and regulatory tasks, and archives all of the logs/data, including to WORM storage for 17a-4 purposes. All of this without a human so much as clicking a button. We have appropriate safeguards at each step of the way, and a licensed human is always watching, but otherwise, we have automated humans out of the manual processes they usually perform.
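The automated day described above is essentially an ordered sequence of jobs, each gated by a safeguard. A minimal sketch of that pattern follows. The function and step names are hypothetical, and in our setup the scheduling is done by Jenkins rather than by a script like this.

```python
def run_day(steps):
    """Run (name, action) steps in order and stop at the first one that fails.

    Each action returns True when its safeguard check passes.
    Returns the names of the completed steps and the name of the failed step, if any.
    """
    completed = []
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # an unexpected error counts as a failed safeguard
        if not ok:
            return completed, name
        completed.append(name)
    return completed, None


steps = [
    ("deploy_reference_data", lambda: True),
    ("start_system", lambda: True),
    ("cancel_open_orders", lambda: False),  # simulated failure
    ("post_trade", lambda: True),
]
done, failed = run_day(steps)
```

In this run, `done` lists the first two steps and `failed` names `cancel_open_orders`. Stopping at the first failure keeps later jobs, such as post-trade processing, from running against an inconsistent state, and gives the human watching a precise place to look.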
Backoffice is often the unloved child of many a tech organization. Here’s the truth — these organizations are wrong; your so-called front-office revenue center could not function without whatever goes on in the backoffice. And yes, the problems being solved in the backoffice are every bit as weird and complicated as you see in the front office, especially if your backoffice is fully automated.
For us, backoffice means things like reference data, clearing & settlement, regulatory reporting like CAT & OATS, and supervisory reporting. A quick description of each follows:
- Reference Data: We create our security master by combining FINRA CAT symbol master, NASDAQ security master, and IEX Cloud ref data / prices [Shout out to IEX Cloud — I mean, really, is there anything as good and cost-effective for financial data as IEX Cloud? I asked around, and as an example, Intrinio was 80x more expensive for what we needed].
In addition to the security master, we generate symbol statistics and volume curves on a daily basis, and our dynamic VWAP model reference data periodically (using OneTick/Python). These are combined with static data files such as client/venue connections, to produce a reference data set for the trading day.
- Clearing & Settlement: This involves: (1) dropping our trades to Apex throughout the day via a drop copy FIX connection, and (2) sending trades and allocations to Apex via a REST API at the end of the day after appropriate checks/reconciliations.
- CAT: We produce CAT records in-house at the end of the day and publish files to FINRA via SFTP (FINRA handily supports an AWS PrivateLink connection).
- OATS: We produce OATS records in-house at the end of the day and publish files to FINRA via IFT (Internet File Transfer). A lot of people don’t know this, but IFT is fully scriptable, meaning there is a REST API that can be used to automate uploads as well as collect feedback. If you have access to FINRA IFT, point your browser to this link and be amazed at the swagger API docs!
- Supervisory reporting: Currently, we produce these reports: (1) a Daily Activity Report, which summarizes trading activity and flags any issues such as trades outside NBBO or possible market manipulation attempts; (2) a Time Synchronization Report, which checks that our server clocks are synced to within 50ms of an acceptable reference clock (sidebar: the Amazon Time Sync Service is an incredible no-effort way of synchronizing clocks to within a millisecond of GPS time); and (3) 606 reports, which are generated upon client request and cover the preceding 6 months of trading data.
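Two of the supervisory checks above reduce to simple comparisons, sketched here under stated assumptions. The field names, host names, and function names are hypothetical. The sketch assumes each trade record already carries the NBBO bid and offer in effect at execution time, and that clock offsets have been measured elsewhere.

```python
from decimal import Decimal


def trades_outside_nbbo(trades):
    """Return trades priced below the national best bid or above the best offer."""
    return [t for t in trades if not (t["bid"] <= t["price"] <= t["offer"])]


def clock_violations(offsets_ms, limit_ms=50.0):
    """Return hosts whose measured clock offset exceeds the limit in either direction."""
    return sorted(h for h, off in offsets_ms.items() if abs(off) > limit_ms)


trades = [
    {"id": "T1", "price": Decimal("100.01"), "bid": Decimal("100.00"), "offer": Decimal("100.02")},
    {"id": "T2", "price": Decimal("100.03"), "bid": Decimal("100.00"), "offer": Decimal("100.02")},
]
flagged = [t["id"] for t in trades_outside_nbbo(trades)]
late_hosts = clock_violations({"oms-1": 0.4, "algo-1": -62.0, "gw-1": 12.5})
```

Here `flagged` contains only `T2` and `late_hosts` contains only `algo-1`. Prices use `Decimal` so that comparisons at the penny boundary are exact.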
If you got to this part after reading all of the parts in between, send me a note, and I will send you a medal (or at least, let’s please chat!). This is but a summary of two years’ worth of the technology build, and I’ve still not described huge swaths of the system. And more importantly, there is a lot of work still left to be done to achieve our vision of building an industry-leading platform.
My closing thoughts: if you think we’re doing some cool work and you can contribute, please reach out to us at email@example.com. Don’t worry about whether we have an “open role” on our website. If you are a technologist, you’re good at what you do, and you want to help build a modern platform and have an impact, there is likely a role for you at Proof. To show our employees that we care about and appreciate them, we make them true partners, with handsome equity grants, possibly larger than anything you’ve seen in your career.