Cryptographic preregistration: from Newton to fMRI
Say you’re a PhD student and you’re interested in the neural processes that are involved in reading. You’re about to start running a functional MRI experiment to test the idea that reading a text between quotation marks activates specific brain regions in the parietal lobe that are associated with the ability to understand others. Being a graduate student in the post-replication-crisis era, you took every precautionary step to prevent yourself from confusing hypothesis-generating results with hypothesis-driven ones. Together with your supervisor, you decided to run exactly 30 subjects, documented your specific predictions and decided on all the details of your preprocessing pipeline and exclusion criteria. You have your ethics approval, your behavioral experiment is working, and everything is ready to start running the experiment.
The most important part is behind you: you are now protected against fooling the person who is easiest to fool, namely yourself. Even if you find compelling results in a different brain region (that makes a lot of sense when you think about it) or by excluding some data points from the analysis (subjects were clearly tired by the end of the experiment), you will have your original plans to remind you that these results are exploratory, and are not really hypothesis-driven. But how can you communicate this fact to your readers? You want to convince skeptical readers that you had decided on these specific hypotheses and analyses before seeing the data, rather than choosing the ones that worked best with the data you had. One way would be to explicitly write in your paper which details were decided in advance and which were conditioned on the data. But then again, given what we know about common practices in scientific reporting, readers would be naïve to take your word at face value. You want to be able to prove that those decisions were genuinely made in advance (i.e., are time-locked), without having to rely on your readers’ trust in your honesty as a person. But how?
One approach would be to preregister your study on an online platform such as the Open Science Framework (OSF) or AsPredicted. These platforms invite authors to submit a document specifying the details of their experimental design, predictions, and analysis. Upon submission, authors are asked to declare whether or not they have started to collect data for their study. The document is archived in an online repository together with the date of submission, so that authors can later use it to prove that they had everything specified by the documented preregistration date.
But does this kind of preregistration really solve the original problem? The answer is no. Authors can still ‘preregister’ their post-hoc study plans after collecting and exploring the data, and falsely tick the ‘No’ box to the question ‘Have any data been collected for this study already?’. In other words, this type of preregistration is just as trust-based as no preregistration at all.
A second approach would be to submit this study to be reviewed as a Registered Report, before collecting any data. Registered Reports are scientific papers that, in addition to the later review phase, are also peer-reviewed before data collection. This way, editorial decisions are made based on research quality and importance, rather than outcome. The additional peer review step draws a clear line between hypothesis-driven and exploratory results: analyses that were not described in the original submission are regarded as exploratory.
The Registered Reports scheme is very compelling for many reasons: it is immune to publication bias and it motivates accurate rather than sensation-seeking scientific reporting. Nevertheless, it requires you to disclose your research plans to peer reviewers at a very early stage of work, potentially exposing you to the risk of getting scooped. When the only purpose is to time-lock study plans with respect to data acquisition, a Registered Report might be overkill.
This is a good point to pause and reflect on our problem and the suggested solutions. You wanted to be able to prove that some aspects of your study protocol were determined before data exploration. OSF-like preregistration was not helpful, because it could not guarantee that data had not already been collected and explored by the time of preregistration. Registered Reports were more helpful in that respect, but required compromises on your scientific autonomy and confidentiality. What would be great to have is a time-locking mechanism that is both valid and can be performed in-lab, without the involvement of a third party. Maybe take a few minutes to think about it before you read on about our proposal.
In a 1677 letter to Gottfried Leibniz, Isaac Newton included the following cryptic sentence:
“The foundations of these operations is evident enough, in fact; but because I cannot proceed with the explanation of it now, I have preferred to conceal it thus: 6accdae13eff7i3l9n4o4qrr4s8t12ux.”
Fearing that Leibniz might scoop his idea, Newton chose to include only an encrypted version of it, namely the count of the different letters in his original sentence (6 a’s, 2 c’s, 1 d, etc.). In this way he communicated to Leibniz that by the time of writing he already had his theorem worked out, while revealing almost nothing about the contents of his ideas (more here).
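To make the letter-count idea concrete, here is a minimal Python sketch. It is a simplified variant of Newton's notation (he wrote doubled letters out, as in "cc", where this version writes "2c"), and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def letter_count_commitment(sentence):
    """Encode a sentence as its sorted letter counts, e.g. 'Data' -> '2adt'."""
    counts = Counter(ch for ch in sentence.lower() if ch.isalpha())
    # Omit the count for letters that appear only once, roughly as Newton did.
    return "".join(
        f"{n if n > 1 else ''}{letter}" for letter, n in sorted(counts.items())
    )


def matches_commitment(revealed_sentence, commitment):
    """Check a later-revealed sentence against the earlier commitment."""
    return letter_count_commitment(revealed_sentence) == commitment


commitment = letter_count_commitment("Data")       # '2adt'
print(matches_commitment("Data", commitment))       # True
# Anagrams share a letter count, so this kind of commitment is weak:
print(matches_commitment("silent", letter_count_commitment("listen")))  # True
```

The last line shows the limitation of a plain letter count: different sentences can produce the same string, so the count alone does not pin down a unique original.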
By revealing the original sentence years later, Newton provided Leibniz with proof of the order of two events: the completion of the sentence (event 1) must have occurred before the writing of the letter (event 2). More generally, by introducing a causal link between the original sentence and the letter (changes to the original sentence might change the letter count, which would in turn change the final letter, and all in a predictable manner), Newton time-locked the two events with respect to each other. Can we do something similar in order to time-lock our study plans with respect to data collection?
A naïve translation of this scheme to our case would go something like this: encode your study plans in a string, maybe something similar to Newton’s letter count, and then post it online, for example on your Twitter account. You can now provide a link to this tweet in your manuscript, and readers will know that you had your study plans sorted at the time of tweeting. This translation is of course wrong, because just like in the case of OSF preregistration, as far as the skeptical reader is concerned nothing prevents you from tweeting your predictions and analysis plans after exploring the data. In other words, this proposal time-locks the decision on study plans with respect to the event of tweeting, but not with respect to the event of data acquisition. In order to achieve the latter, you need to make your experimental data causally dependent on your study plans.
This causal link can be most easily introduced by making the order and timing of events in the experiment a function of the study plans. By creating a mapping from study plans to experimental designs, the acquired data become causally dependent upon the predetermined study plans (via the experimental design), which time-locks the study plans specification and makes it very difficult to change them in retrospect without breaking this causal chain.
To come back to the original problem: in order to convince your skeptical readers that you committed to your predictions and decisions before data acquisition, you can make the order and timing of events in your experiment a function of those very predictions and decisions. What follows is a description of some of the technical details of our implementation, but the central idea is captured by the sentence above.
Our implementation rests on the properties of two entities: cryptographic hash functions and pseudo-random number generators.
Cryptographic hash functions map input of any size to a sequence of bits (zeros and ones) of a predetermined length. What makes these functions special is that it is very easy to use them to transform any input to a sequence of bits, but it is practically impossible to find two inputs that are mapped to the same sequence of bits. In our implementation, we use this function to transform the folder containing the predictions and study plans (the protocol folder) to a sequence of bits (the protocol sum), in a way similar to how Newton translated his original sentence to a cryptic sequence of letters and numbers.
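As an illustration of this step, here is a minimal Python sketch that turns a folder into a single hash value. It is not the code from the preprint: the choice of SHA-256, the function name protocol_sum and the way files are combined are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def protocol_sum(folder):
    """Return a SHA-256 hex digest summarizing every file in `folder`."""
    root = Path(folder)
    digest = hashlib.sha256()
    # Sort the paths so the result does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        name = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each piece so file names and contents cannot blur together.
        for chunk in (name, data):
            digest.update(len(chunk).to_bytes(8, "big"))
            digest.update(chunk)
    return digest.hexdigest()
```

Calling protocol_sum on the same folder twice gives the same 64-character string, while changing a single character in any file, or renaming a file, gives a different one.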
Pseudorandom number generators (PRNGs) are the computer’s way of simulating random behaviour. When we ask a programming language to draw a random order of events for our experiment, the PRNG implemented in that language follows a deterministic algorithm that produces a perfectly predictable output. To control its behaviour, the PRNG can be initialized with a number (the initialization seed). Following initialization, the output of every call to the PRNG is determined by the choice of seed. Experiment code often begins by initializing the PRNG, so that the order and timing of events vary across subjects in a reproducible manner (seed initialization commands are rng(10) in Matlab, set_random_seed(10) in Presentation, and random.seed(10) in Python). The seed numbers are usually chosen arbitrarily.
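The determinism is easy to see in a few lines of Python. In this small sketch, the function name, the condition labels and the 20-trial run are hypothetical.

```python
import random


def draw_trial_order(seed, conditions):
    """Shuffle `conditions` with a PRNG initialized from `seed`."""
    rng = random.Random(seed)  # private generator; the global PRNG is untouched
    order = list(conditions)
    rng.shuffle(order)
    return order


conditions = ["quoted", "unquoted"] * 10   # hypothetical 20-trial run
# The same seed always reproduces the same "random" order.
print(draw_trial_order(10, conditions) == draw_trial_order(10, conditions))  # True
```

Using a private random.Random instance rather than the module-level random.seed keeps the experiment's randomization separate from any other code that draws random numbers.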
By now you might already understand where this is going. If we use the protocol sum as the seed for the PRNG initialization before randomizing the timing and order of events in our experiment, the experimental randomization will be causally dependent on the contents of the protocol folder. We assume that our data is affected by the specific choice of experimental design, and together this makes the experimental data a function of our study plans.
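Putting the two pieces together might look like the following Python sketch. It is an illustration under stated assumptions, not the functions from the preprint: the protocol is a short byte string standing in for a whole folder, and the names seed_from_protocol and generate_design, as well as the jitter range, are hypothetical.

```python
import hashlib
import random


def seed_from_protocol(protocol_bytes):
    """Turn the protocol contents into an integer seed via SHA-256."""
    return int.from_bytes(hashlib.sha256(protocol_bytes).digest(), "big")


def generate_design(protocol_bytes, conditions, min_iti=2.0, max_iti=6.0):
    """Return (condition, inter-trial interval) pairs determined by the protocol."""
    rng = random.Random(seed_from_protocol(protocol_bytes))
    order = list(conditions)
    rng.shuffle(order)
    # Jitter the interval after each trial, in seconds (hypothetical range).
    return [(cond, round(rng.uniform(min_iti, max_iti), 2)) for cond in order]


plan = b"H1: quoted text activates parietal regions; N = 30"
design = generate_design(plan, ["quoted", "unquoted"] * 10)
```

Anyone holding the same protocol can rerun generate_design and obtain the same order and timing, while an edited protocol would almost certainly produce a different design.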
Just like Newton’s idea was encoded in his letter to Leibniz, your predetermined study plans are now encoded in the records of the brain activity of your participants. By making your data available upon publication and linking to your protocol folder, you provide skeptical readers with a way to validate the integrity of your preregistration: they can generate the experimental design by themselves and see if it aligns with the data. With a sufficient number of possible experimental designs that can be generated by your randomization code, as is often the case in neuroimaging, even the most skeptical of your readers should be convinced.
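To get a feeling for what "a sufficient number" means, one can count the possible trial orders. The Python sketch below assumes a hypothetical design with two conditions and 10 trials each, and assumes the randomization code picks among all orders with equal probability.

```python
from math import factorial


def count_trial_orders(trials_per_condition):
    """Number of distinct trial orderings, given the repeats per condition."""
    total = factorial(sum(trials_per_condition))
    for n in trials_per_condition:
        total //= factorial(n)  # trials of the same condition are interchangeable
    return total


n_orders = count_trial_orders([10, 10])   # two conditions, 10 trials each
print(n_orders)                           # 184756
# Chance that an unrelated seed happens to produce one particular order:
print(1 / n_orders)
```

Even this small design leaves less than a one-in-100,000 chance of a coincidental match on order alone, and jittered timings or more trials shrink that chance much further.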
But even more importantly, by using this scheme you communicate to your general audience that it would be irrational of you to be dishonest about your original plans and predictions. To game this scheme you would have to manipulate your data, lie about what actually happened in the experiment, or lie about what your experiment randomization code is doing. All of these lies can potentially be detected by the community and would cause much more inconvenience than simply admitting that certain decisions were made post hoc. This is not the case for OSF-like preregistration, where all it takes to game the system is to tick the wrong box.
All the details that I skipped here are included in our preprint on bioRxiv, followed by an interesting discussion with Chris Gorgolewski. We provide Matlab, Python and R functions that initialize the PRNG with the hash value of the input. We also ran an actual fMRI experiment and preregistered our analysis plans and predictions using our scheme; the paper can be found here. We then pretended to be a skeptical reader and used various techniques to verify this time-locking; full documentation of the process is available here.
My brother noammaz came up with this idea while we were having a walk with our dog, B7. We developed the idea further with the guidance and support of my mentor, Professor Roy Mukamel.