What it means to be a cryptographer

Allison Bishop
Published in Proof Reading
Sep 28, 2020

When Michael Lewis came to visit the IEX office in 2016, he had his young son in tow. While he was signing my copy of Flash Boys, he asked me what I did for IEX and what I had done before. “I’m a quantitative researcher here,” I said. “But I’m also a cryptographer.”

“Could you explain to my son what a cryptographer is?” he asked. I stammered something about “protecting information,” and his son looked bored.

“A cryptographer is a person who designs puzzles,” Michael Lewis interjected.

This answer was obviously much more concrete and audience-appropriate than mine. And yes, designing puzzles is what cryptographers do. We use fancier words like “encryption” that sound very adult, but “puzzles” is a fair characterization. Still, there’s something I don’t like about this answer. It leaves out the why.

In my more cynical moments, I’m tempted to reduce the why to “publishing papers and getting paid by universities, funding agencies, or corporations.” But in my more idealistic moments, I feel it is much deeper than that.

A cryptographer is someone who facilitates interactions. We design encryption methods and other communication tools so that people can communicate in the ways that they want, while still protecting information from exposure to unintended recipients. As a community, cryptographers believe that people deserve to communicate and interact on their own terms, in ways that are authentic and protected from outside manipulation and surveillance. We don’t always (or often?) live up to this mission statement, but we do try. And we grapple very earnestly with the core tension of our role: as we work to build protocols that remove the need for people to trust intermediaries, corporations, and governments with their private data, how do we get them to trust us? Why should people trust their most sensitive interactions to the tools that we design?

This is perhaps the most beautiful thing about cryptography. To protect your secrets, we release ours. We publish our algorithms in full detail. We expose them to protracted public vetting. We try to break each other’s work, and when we succeed, we help rebuild it. We reject the false premise that obscuring our methods would make our algorithms more secure or protect our users. Do you want to know how your credit card number is protected when it travels across the internet to purchase something? You can find out! You might fall asleep in the middle of the long technical explanation, but it’s out there.

This is not a decision that the cryptography community makes lightly. There is a reason we roll our eyes whenever some “innovator” says he won’t publish his encryption algorithm because the secrecy “adds” a layer of security. What it does is subtract a layer of scrutiny. Given enough time and examples, it is typically possible to reverse-engineer the algorithm anyway. And once this happens, an algorithm that hasn’t been widely vetted is much more likely to reveal exploitable weaknesses.

It may seem magical that it’s possible to design encryption algorithms that can be fully disclosed to the public and still work to protect information. But it is made possible by a conspiracy between mathematics and randomness: some mathematical processes that are fully known and specified can be rigged to produce outputs that look completely random, as long as just one small initial secret input is unknown. This allows us to orchestrate grand choreographies of secrecy, all dancing on the head of a pin in broad daylight. It’s also validation for every disgruntled math student who has declared in frustration: “This might as well be gibberish!” It’s not. But we can make it look like it if we want to 😉
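To make that idea concrete, here is a minimal Python sketch (standard library only) of the trick described above: HMAC-SHA256 is a fully published, fully specified function, yet without the one small secret key, its outputs look like random noise to anyone watching. This is an illustration of the general principle, not any particular deployed protocol.

```python
import hashlib
import hmac
import os

# The algorithm (HMAC-SHA256) is completely public: anyone can read its
# specification and this code. The only secret is the short key below.
key = os.urandom(32)  # 32 random bytes, known only to the communicating parties


def looks_random(message: bytes) -> str:
    """Mix a known message with the secret key using a published algorithm.

    Without the key, the output is computationally indistinguishable from
    random bytes, even though every step of the computation is public.
    """
    return hmac.new(key, message, hashlib.sha256).hexdigest()


print(looks_random(b"attack at dawn"))  # e.g. '3f9c...' -- looks like noise
print(looks_random(b"attack at dusk"))  # a completely unrelated-looking output
```

Flip a single bit of the message or the key and the output changes beyond recognition; that is the "grand choreography of secrecy" performed entirely in the open.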

And since we know it is possible to design public encryption algorithms that hold up to decades of scrutiny, we have come to demand it. The expectation that strong cryptography will be publicly vetted protects us from snake oil salesmen and their secret formulas. “Nah, I’ll just use AES,” is something you can say to seem cool when confronted by such people at dinner parties.
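For the curious, "just use AES" in practice looks something like the sketch below, which assumes the widely used third-party `cryptography` package (installable with `pip install cryptography`) rather than any home-rolled secret formula. Every detail of AES and GCM is public; only the key (and the uniqueness of each nonce) needs protecting.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # the one small secret
aesgcm = AESGCM(key)

nonce = os.urandom(12)  # must be unique per message, but need not be secret
ciphertext = aesgcm.encrypt(nonce, b"4111 1111 1111 1111", None)

# Anyone can read the algorithm; only the key holder can read the message.
plaintext = aesgcm.decrypt(nonce, ciphertext, None)
assert plaintext == b"4111 1111 1111 1111"
```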

The protection of transparency was something I took for granted until I ventured beyond the cryptography community. In algorithmic trading, I found a much different world. Like cryptographers, brokers play the role of facilitating interactions. They design “algos” that institutional investors use to communicate their interests to the market and execute trades. But in this world, phrases like “secret sauce” are venerated rather than ridiculed. Vetting of brokers is accomplished through noisy and manipulable metrics rather than evaluation of designs and processes. Snake oil salesmen abound.

Must this world truly be so different from cryptography? I don’t believe so. After all, the goal of avoiding information leakage while trading is essentially the goal of blending into the random-looking activity of the market. If my trading behavior truly blends in, then you shouldn’t be able to see it, even if I explain to you how I accomplish the blending. The when and what I am trading become like the small secret key in cryptography upon which everything rests, but the how becomes like the known mathematical process that mixes that secret with the inherent randomness of the market until it is unrecognizable.

What makes us think this is impossible? What makes us convinced that the principles of trading algorithm design deserve to be shielded from scrutiny? And who benefits from that shield of obscurity?

One typical argument goes like this: institutional clients benefit directly from the secrecy of broker algorithms, because more transparency would lead to more gameability.

Let’s deconstruct this argument a bit. First, from our own experiences in the industry, we suspect that there aren’t that many core ideas floating around in the agency algo design space, and a lot of people end up doing pretty similar things. Fundamental changes to algos may not happen very frequently, and with good reason: evaluation, implementation, and testing of any changes should be done carefully and not rushed. Overall, this may be an environment where most algos are variations on a few core themes and relatively stable over time. These factors do not bode well for secrecy of design as a mechanism for preventing gaming. Distinguishing between a relatively small number of possibilities, given a relatively large amount of time and examples, is a best-case scenario for reverse-engineering.

Also, the number of people in and around the industry who have a basic knowledge of how algos tend to work is pretty sizeable, and this is good for other reasons. Within a company, it’s good to have many people involved in algo design, testing, and operation to achieve better quality and service, and across companies, it’s good to have this knowledge dispersed so there is healthy competition. What this suggests, though, is that secrecy of algo design is a pretty thin shield indeed. It may buy someone a little time, or help protect a tiny broker whose flow is a mere drop in the ocean and hence not worth noticing, but it’s not a sustainable primary strategy for protecting client orders from predatory behavior in the marketplace.

Another typical argument goes like this: institutional clients benefit indirectly from the secrecy of broker algorithms because secrecy protects brokers’ ability to monetize their innovations, hence incentivizing advances in overall performance.

Let’s examine one big assumption that underlies this argument: that better-performing algos will draw more institutional flow. Under this assumption, algo design insights are a monetizable asset, and hence protecting them from competitors becomes a rational thing for companies to do. It’s then possible that the marketplace as a whole benefits from this secrecy, if the opportunities for protected profits incentivize more innovation than would occur in a more transparent environment. But we are getting ahead of ourselves. This base assumption is pretty questionable. “Better-performing algos will draw more institutional flow” itself assumes that institutional clients can reliably discern differences in algo performance and respond accordingly. But teasing out subtle performance differences from the noisy outcomes of a single client’s trading is an extremely difficult task! Large clients who devote considerable resources to this are in a better position than smaller outfits, but even for large, well-resourced clients, it can be hard to make high-confidence assessments.
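A back-of-the-envelope sketch shows why. The numbers below are illustrative assumptions of mine, not figures from any study: suppose the true performance gap between two brokers’ algos is about 1 basis point per order, while per-order slippage swings around with a standard deviation of roughly 30 basis points.

```python
# Back-of-the-envelope sketch with assumed, illustrative numbers:
# how many orders does a client need to tell apart a 1 bps average
# performance gap when per-order slippage has a ~30 bps std dev?
#
# Standard two-sample size approximation for a difference in means:
#   n per group ~= 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
z_alpha = 1.96  # two-sided 95% confidence
z_beta = 0.84   # 80% power
sigma = 30.0    # per-order slippage std dev, in bps (assumed)
delta = 1.0     # true performance gap to detect, in bps (assumed)

n_per_broker = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
print(f"~{n_per_broker:,.0f} orders routed to each broker")  # roughly 14,000
```

Tens of thousands of comparable orders per broker, before accounting for market regime changes or the fact that the algos themselves drift over time, is a tall order for all but the largest clients.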

This results in an environment where monetary reward is a very noisy function of quality. This dilutes the incentive to innovate and creates a fair amount of inertia. If reactions to performance variations are delayed and muted by low confidence, large and established brokers will naturally maintain market dominance without too much effort. Innovation is stifled.

Some might say that little can be done about this. The market is noisy, they say with a shrug. What can you do? But actually, there’s a pretty simple answer. If it’s hard to tell two algorithms apart due to small sample size and noisy outcomes, we can stop looking at just outcomes and start looking at the algos themselves! If the algo designs were disclosed, we could actually compare the ways that they operate. And we could go even further and demand to compare the different scientific processes that led to the different algo designs! This is not a new idea in other domains. Every math teacher on the planet will recognize it as the “show your work” approach.

Interestingly, “show your work” is also a defense against cheating and copying, as it makes the copying much easier to identify. Hmm. So maybe transparency isn’t actually an enemy of monetization of innovation? Let’s take a moment to think through this last point. “If you publish your algos,” the typical wisdom goes, “people can copy them.” Well, yeah. But if you go further and publish your entire research processes, it becomes pretty easy to tell the legitimate thing from the copy. If you were an institutional client deciding which broker to use, would you want to go with the team that has a high quality scientific research process up and running and churning out innovations, or would you want to go with the team that waits for those innovations to be published and then copies them? Which do you think is more likely to perform better over the long run?

Even in the cryptography community, where there is a widespread commitment to transparency of finished algorithm designs, I encountered a variety of opinions concerning the transparency of ideas that were not yet finished. Some people freely shared half-baked thoughts in the hopes that the wider community would take them up and help make them whole. Others hoarded them, afraid that exposure would lead to other researchers stealing and copying them before they could be finished. Stealing certainly happened from time to time, so this was a rational response, but there was also a deeper pattern. The people afraid of sharing ideas in progress tended to be the people who were most afraid that each good idea would be their last. The best scientists were often the most open. It was as if they were saying, “Steal this idea, if you dare. But everyone will know, and besides, I’ll just have a better one.”

In time, I hope institutional clients will begin to demand transparency of algorithms and research processes from their brokers, just like savvy consumers of cryptography do. But for that to become possible, they first need to be offered a choice of a transparent product that does not sacrifice performance. Providing this is Proof’s goal. Over the next few months, we will be releasing a thorough accounting of our initial trading algorithm, as well as the research process that led to its design.

Who would be crazy enough to do that while everyone else still hides in the shadows, you ask? Sounds like a job for a cryptographer.
