Does anyone want an algo design whitepaper?

Allison Bishop
Oct 4

There is a mantra at Y-Combinator, the startup incubator that Proof participated in during the summer of 2019. It is: “Make Something People Want.” It is compelling in its simplicity, but it’s not easy to do. It can serve as an anchor of accountability to your customers, counterbalancing a tendency to shape things too much to your own desires and quirks. It’s also a clean embodiment of its own point. One thing people want is catchy mantras.

As a statement of purpose, however, I find it very incomplete. People want a lot of things to be true that aren’t true. People want a lot of things to be easy that aren’t easy. People want a lot of things that are bad for them. “What can you do?” we might say with a shrug. “You’ve gotta meet people where they are.” Well yes. But you don’t have to stay there.

As a computer scientist, I spend probably too little time thinking about what I’m making and whether people want it, and much more time thinking about how I’m making things. What’s my process for coming up with new model feature ideas to try? Are my robustness checks sufficient to keep me from falling down a deep rabbit hole of over-fitting? How many whiteboard marker colors can I use before it induces decision paralysis? And so I’ve spent much of the last year documenting my process of algo decision research, and writing this whitepaper on our new design.

There are risks, of course, to focusing on the how instead of the what. A big one is that people won’t care. “I don’t care how it works, just get it done,” is a common ethos of American business. There is a reason that phrases like “I get results” aren’t typically followed by elaborations like: “in an environmentally sustainable, ethically unassailable, certified organic, and locally sourced way.” But in the domain of algorithmic trading, how is everything. How is a trading algorithm going to make its choices about what to do in a given situation? How are we going to decide between different possible designs? And, perhaps most mysteriously, how are we going to formulate candidate designs in the first place?

The sheer volume of decisions that go into the design of a trading algorithm can be overwhelming. What is the overarching goal: to match VWAP in expectation? To minimize impact? To minimize variance around a benchmark? Next there are higher level scheduling decisions, like how much of this order should be traded over the next few hours or minutes? Then there are lower level tactical decisions, like how much should we post at a time and when should we take? There are many layers of peripheral decisions as well, like what market data features should we track in real time and take into account?

The evolution and electronification of trading has spawned interesting blends of human and machine-driven decision making to answer these questions. A typical starting point is a human trader instructing an algo developer to automate the trader’s intuition. Once there is a fully implemented electronic algo, it can then be A-B tested against incremental variants and evolve iteratively.

From a business perspective, this process of algo development has several benefits. It ensures that the algos will (at least initially) behave in a way that is intuitive for human traders, and will generally match their expectations. It mitigates some of the risk of change, as it’s mostly doing an automated version of what humans were doing before, and it also leverages current knowledge bases and experience within the company. Also worth noting — it doesn’t really upend existing corporate hierarchies or render any internal stakeholders obsolete.

From a scientific perspective, however, this process is far from ideal. Human and machine decision-making each have their own strengths and weaknesses, and this arrangement does not really play to either’s strengths. Humans are good at recognizing general patterns even when the details differ, following heuristics to make snap decisions in unfamiliar situations, and spotting anomalies. But human decision-making is also prone to pitfalls of selective memory, cognitive distortions, and the strong tendency to fit data to a desired narrative, especially ambiguous data. Machines, by contrast, are good at processing large amounts of data, searching through vast numbers of possibilities, and navigating tradeoffs that are already quantitatively defined. Machines are not so good at ignoring unimportant details or balancing qualitative tradeoffs, and they can’t compensate for insufficient or low quality data (in fact, they often exacerbate problems with data quantity or quality by proceeding blindly on).

With this in mind, it seems like a poor division of labor to have human intuition solely shaping the initial algo design, and machines only being called in to evaluate results on the small sample sizes of A-B tests. In some domains of machine learning, state of the art results have been achieved by taking a nearly opposite approach: letting machines sift through the vast possibilities for models, and then presenting the candidate results for human inspection. In the case of trading algorithm design, however, the wholly machine-forward approach is likely to be plagued by problems with data quality or quantity and a low signal-to-noise ratio.

For this reason, we seek to navigate a middle ground. We don’t ask machines to run blindly through noisy, unstructured market data. Instead, we seed our design research with a blunt kernel of human intuition by condensing our historical market data into basic features that we suspect are important for modeling price impact. We then use machine learning (in our case, pretty basic statistics) to build models of price impact in terms of these features. Next, we test the robustness of these basic models on fresh historical market data to make sure we’re not chasing coincidences or being thrown off by changing market conditions. Once we have models of market behavior that we think are meaningful, we can use dynamic programming to project forward what we expect the impact of our trading actions to be, and use this to make scheduling decisions in ways that we think minimize our impact, subject to our other goals (like completing an order quantity over the lifetime of the order).
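To make the dynamic programming step a little more concrete, here is a minimal Python sketch of the general idea. It is not our production model. The function name `schedule_order`, the `toy_impact` cost function, and the `liquidity` numbers are all made up for illustration, and the sketch assumes the impact of trading in one period depends only on that period and the quantity traded. It splits an order across time periods so that the total modeled impact is as small as possible while still completing the full quantity.

```python
from typing import Callable, List, Tuple


def schedule_order(
    total_lots: int,
    num_periods: int,
    impact_cost: Callable[[int, int], float],
) -> Tuple[float, List[int]]:
    """Return (minimum total modeled cost, lots to trade in each period).

    impact_cost(period, lots) is a hypothetical stand-in for a fitted
    price impact model; it is an assumption of this sketch.
    """
    inf = float("inf")
    # value[t][r]: least modeled cost of trading r remaining lots in periods t..end.
    value = [[inf] * (total_lots + 1) for _ in range(num_periods + 1)]
    # choice[t][r]: lots to trade in period t when r lots remain.
    choice = [[0] * (total_lots + 1) for _ in range(num_periods)]
    # Constraint: nothing may be left over once the last period has passed.
    value[num_periods][0] = 0.0

    # Work backward from the final period to the first.
    for t in range(num_periods - 1, -1, -1):
        for remaining in range(total_lots + 1):
            for lots in range(remaining + 1):
                cost = impact_cost(t, lots) + value[t + 1][remaining - lots]
                if cost < value[t][remaining]:
                    value[t][remaining] = cost
                    choice[t][remaining] = lots

    # Walk forward through the stored choices to recover the schedule.
    schedule = []
    remaining = total_lots
    for t in range(num_periods):
        lots = choice[t][remaining]
        schedule.append(lots)
        remaining -= lots
    return value[0][total_lots], schedule


# Toy assumption: impact grows with the square of the quantity traded and
# shrinks when a period is more liquid. These numbers are invented.
liquidity = [1.0, 2.0, 1.0]


def toy_impact(period: int, lots: int) -> float:
    return lots ** 2 / liquidity[period]


best_cost, schedule = schedule_order(8, 3, toy_impact)
print(schedule, best_cost)  # [2, 4, 2] 16.0
```

In this toy setup the schedule trades more in the middle period, which the invented numbers treat as twice as liquid. A real model would also have to account for uncertainty and for impact that carries over between periods, which this sketch ignores.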

It’s hard to know how this compares to competing algo designs because almost nothing about them is publicly disclosed. From a scientific standpoint, there’s nothing particularly new or surprising here. We haven’t pushed the boundaries of knowledge in machine learning in this algo design, as we don’t expect that would be wise or effective in such a noisy domain. But we are doing something unique — combining a rigorous scientific design process with public disclosure to set a new bar of accountability in the agency algo design space.

Is this something people want? There are likely a lot of people who don’t want it. Or who do want it, but not really enough to get deeply engaged. But I have to be honest, “make something people want” was never really my mantra. There is a different question I tend to orient my life around: “if everyone did what you’re doing, would the world be better or worse?” If everyone in the agency broker space took a more scientifically grounded and more transparent approach, would the industry be better or worse? Better, I think. And that’s enough for me.


Proof is a new institutional equities broker.