The Process of Research, and the Research on Process

Allison Bishop
Published in Proof Reading · 5 min read · Dec 18, 2019

Whitepaper detailing Proof’s approach to research through stock market simulation from historical data: https://prooftrading.com/docs/simulation.pdf

I got my first taste of live scientific research when I was 18. It was the summer of 2002, and I was an intern at the Jet Propulsion Lab, “helping” with the final field test of the NASA Mars rover before it launched into space. I put “helping” in quotes because despite my best intentions, I did not help overly much. I was assigned the job of matching up the data coming back from the rover with the initiating commands that generated it, so that the scientists could more easily navigate the various images and measurements collected. I was told to send my work to a particular email address, which I dutifully did each day. The days were long and grueling for the scientists, and I was proud to be a small part of making them easier. I watched the team meetings with fascination, and tried to stay firmly out of the way. At the end of the summer, I attended a wrap-up meeting where the scientists brainstormed about improvements they could make for future field tests.

“You know,” one scientist said. “It would really help if we had an intern matching up the data coming back from the rover with the initiating commands.”

My jaw dropped. Turns out they’d given me the wrong email address.

Soon after, the Mars rovers, Spirit and Opportunity, were launched and achieved momentous success. They explored Mars for 6 and 14 years respectively, and jointly traveled over 32 miles. We as a species learned a lot from them. And I as an individual learned about failures of process.

This sat in the back of my mind throughout my academic career as a designer of encryption algorithms. It bugged me that academic theory wasn’t grappling with the reality that most cybersecurity failures occurred around failures of human processes, rather than failures of math. The theory excused itself by explicitly stating its assumptions up front, and washing its hands of reality’s stubborn refusal to satisfy them.

As I transitioned to doing research in data science, I became even more concerned with issues of process. I missed the days when unrealistic assumptions were at least made explicit. In data science, assumptions are rampantly left implicit: in particular, the assumptions that the underlying data is normally distributed, or that “statistically significant” phenomena are unlikely to be due to chance.

The notion of statistical significance is a particularly insidious problem. It was developed in an era when data was hard to come by, and the number of tests that scientists would run on it was much more harshly constrained by time and computational resources. Naively carrying this same standard into an era of big data and cheap computation has contributed to a replication crisis in multiple scientific fields (see articles here and here for example).

To understand how this may have happened, we have to rid ourselves of the fantasy that scientific truth can be interpreted divorced from the research process. We have to acknowledge that process shapes meaning. In the case of statistical significance, the process of publication can invisibly skew results.

A “statistically significant” result with a p-value of 0.05 is supposed to mean that “there is only a 1/20 chance that this phenomenon is due to chance.” That sounds pretty meaningful, right? But suppose we have a lottery where each person has a 1/20 chance of winning. If 20 people play, there’s a very high chance that someone will win. Now if we print the name of the winner in the paper the next day, we don’t suddenly conclude that his win was meaningful and unlikely to be due to chance. But in some sense, this is exactly what we do if we publish scientific studies based on whether they pass the p-value threshold of 0.05. Since there’s so much data now and so many studies, it’s not at all surprising that some (or even many!) published “statistically significant” results are attributable to chance. These meaningless results then fail to replicate in further studies.
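
To put a number on it (a minimal sketch, assuming the 20 plays are independent): the chance that at least one of 20 players wins a 1-in-20 lottery is 1 - (0.95)^20, or roughly 64%. The same arithmetic applies to 20 independent tests of true null hypotheses, each run at the 0.05 threshold.

```python
# A minimal sketch of the lottery arithmetic (illustrative numbers, assuming
# independent trials; nothing here is drawn from Proof's whitepaper).

p_single = 0.05   # chance that one player wins, or one true-null test "passes"
num_trials = 20   # number of players, or number of tests run

p_at_least_one = 1 - (1 - p_single) ** num_trials
print(f"Chance of at least one win purely by luck: {p_at_least_one:.0%}")  # about 64%
```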

This kind of problem is exacerbated in quantitative finance, where there are many reasonable versions of many reasonable metrics that one can run on data to plausibly draw conclusions. In this case, the cost of running a single test can be much, much less than the cost of writing an entire scientific paper. One doesn’t need to imagine a vast conspiracy to explain why many quantitative finance claims are spurious: well-meaning quants may simply run too many tests in aggregate for a notion like statistical significance to save them from traps of coincidence. And naturally, they show you only the successes, and not the failures or the underlying processes.
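
As a rough illustration of how quickly this adds up, here is a hedged sketch (a toy example of my own, not anything from Proof's research process): run a few hundred candidate signals against pure noise, and about 5% of them will clear the conventional significance bar anyway.

```python
# A hypothetical illustration (not Proof's methodology): test many candidate
# "signals" against pure noise and count how many clear the 0.05 bar anyway.
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(seed=0)
num_signals = 500   # candidate metrics/strategies a well-meaning quant might try
num_days = 252      # one year of daily "returns"

false_positives = 0
for _ in range(num_signals):
    returns = rng.normal(loc=0.0, scale=0.01, size=num_days)  # zero-mean noise
    _, p_value = ttest_1samp(returns, popmean=0.0)            # test for a nonzero mean
    if p_value < 0.05:
        false_positives += 1

print(f"{false_positives} of {num_signals} noise-only signals look 'significant'")
# Roughly 5% of them will pass purely by chance, and those are the ones
# that tend to get shown to you.
```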

So what are we to do? How are we to conduct scientific research in a more meaningful and accountable way? The scientific community at large is exploring many approaches. One of them is making publication decisions for the designs of studies up front, before the data is analyzed, and committing to publication regardless of the result. This can reduce the selection bias that currently leads to meaningless “positive” results being published and meaningful negative ones going unpublished. Another direction is designing statistical methods themselves to be more robust and to counteract biases of process (a couple of recent papers I’ve read in this direction can be found here and here).
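
The specific methods in those papers are beyond the scope of this post, but the oldest and simplest example of the general idea (a hedged sketch, not necessarily what the linked papers propose) is the Bonferroni correction: instead of asking each individual test to clear p < 0.05, you ask it to clear 0.05 divided by the number of tests you ran.

```python
# A classic baseline for building the number of tests into the statistics
# themselves: the Bonferroni correction. Each test must clear alpha divided
# by the total number of tests, not alpha itself.

def bonferroni_significant(p_values, alpha=0.05):
    """Mark which p-values survive a threshold corrected for the number of tests."""
    corrected_threshold = alpha / len(p_values)
    return [p < corrected_threshold for p in p_values]

# With 100 tests, a result now needs p < 0.0005 to count as significant.
p_values = [0.04, 0.0003, 0.2] + [0.5] * 97
survivors = bonferroni_significant(p_values)
print(sum(survivors))  # 1: only the p = 0.0003 result survives
```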

All of these approaches are promising, but there is much left to be worked out. One thing is clear though — process matters. And the more visible we make our process, the more equipped we will be to recognize and address its limitations.

In this spirit, Proof will be detailing each stage of our research in public. Today we take the first step, releasing a whitepaper describing our initial approach.

This is obviously a work in progress, and our approach will certainly evolve over time. By committing to our thinking at each stage, and explaining how and why it evolves, we think we can give you a much better basis for understanding and judging our product than if we waited to show you the end result.

Probably there will be a lot of steps though, and a lot of changes to track. Maybe you can find an intern to do it for you. Just make sure you give her the right email address.
