ArXiv and blind review do not mix.
Blind review is the cornerstone of much of the NLP community; it is how conferences and journals recognize the value of people’s work. Reviewed publications are used for academic advancement (getting a PhD, getting tenure, getting grants, etc.), for highlighting work that the community thinks is important, and for helping young or less-visible researchers establish themselves in the community. Getting internships, PhD admissions, and faculty jobs all hinge on successfully getting one’s work through the review process. If this process is unfair, we amplify existing biases in the community.
At the same time, preprint servers like arXiv enable rapid dissemination of ideas and have led to an acceleration of progress in NLP, especially when papers on arXiv are linked to code repositories on GitHub, which is increasingly common. All else being equal, speeding up science is a great benefit to society.
The trouble is that all else is not equal when papers are published to arXiv before they have been reviewed. These papers have an advantage, especially if they come from well-known groups that can get a lot of publicity for their paper on Twitter or other media. Researchers on the Semantic Scholar team at AI2 showed that posting papers on arXiv before review is associated with a measurable increase in citation rate. There are numerous other reasons that people of all experience levels might want to put their paper on arXiv before it has been accepted for publication at a conference; I won’t repeat them here, as they can be found easily in other places.
People in well-established groups recognize the incentives here, and I have recently talked to several people in some of these groups who are strongly considering always going straight to arXiv, then submitting to the next allowable NLP venue. This breaks blind review — any paper that has ever been posted publicly, especially to a feed-based system like arXiv, cannot be reviewed blindly. A reviewer’s job is to stay on top of current work in the field; an ideal reviewer would be aware of work in their area that was posted publicly, and thus would have already seen the (non-anonymous) paper when it came time to give it an ostensibly blind review.
I’m not faulting any of the researchers who go to arXiv first — theirs is a rational response to the incentives put in place by our processes, which unintentionally encourage going public before blind review. If we want to fix this, we need to change the incentives, not hope that researchers choose to put the good of the community above their personal interest.
I propose to fix the incentive problem with a series of incremental steps.
1. Ban arXiv completely for all papers wanting blind review
If a paper has ever been posted to arXiv, it is not blind, and we should not pretend that it is. Papers should only be posted non-anonymously after they no longer need to be reviewed (either because the paper was accepted somewhere, or the authors have given up on review). This is the only way to maintain the integrity of the blind review process.
Making this change without any mitigation runs the risk of bifurcating the community — many in industry, or researchers who are already well-established, might decide that they don’t need official conference publications to get recognition for their work, and embrace arXiv without review. Again, some are already considering this, even in our current structure. Banning arXiv entirely would surely push more people towards this route.
The next steps try to get back most of what arXiv gives us, while preserving blind review.
2. Move to OpenReview for all reviewing
On OpenReview, anonymous versions of papers are available to everyone immediately upon submission, and after review they become de-anonymized. If a paper gets rejected for any reason, the ideas are still out and can be cited if some in the community find them useful. This solves many of the problems with conference submissions ending up in reviewing purgatory, never seeing the light of day because of noise in the reviewing process.
3. Make all deadlines continuous
While using OpenReview for all conference reviewing would be an improvement, it still leaves a substantial wait between when an author thinks a paper is ready for others to look at and when it finally gets de-anonymized. This is especially true if a paper is finished in a long gap between conference deadlines — you have to wait until the next deadline, then another couple of months for the review process to finish. This is the time during which an author seriously considers bypassing review, because they can get a head start on others and accrue citations and mindshare by just posting their paper on arXiv.
If we make all NLP venues have continuous submission deadlines, where review happens as soon as a paper is submitted within some window (say, three months before the current deadline) and papers are de-anonymized immediately upon an accept/reject decision, we remove much of the incentive to bypass review, because you don’t have to wait as long for the paper to become public. This is surely a bit of a logistical headache, but journals are already set up like this — we know how to make this happen. There is reason to believe this might also improve the quality of reviewing, as reviewers would not have a huge list of papers to review all within a two-week period, but would instead have much more spread-out responsibilities.
4. Add an RSS feed to OpenReview
A key benefit of arXiv is that people can follow it and get updates when papers are submitted that are related to their work. If there is a single deadline when all papers are submitted to a venue, a feed doesn’t really make sense — you just check OpenReview the day after the submission deadline and see all of the papers. But if you have continuous deadlines, people can choose to be notified of any (still blind!) paper that comes into OpenReview that’s relevant to their interests. This gives much of the same motivation for submitting your paper to OpenReview as there is with arXiv — your paper gets pushed immediately to anyone who wants to look at it — it’s just still anonymous.
5. Figure out how to speed up the review cycle
The last benefit of arXiv is its ability to dramatically increase the rate of progress in science. With an anonymous feed on OpenReview, ideas would be disseminated very rapidly, but (1) authors still have to wait possibly months to publicly claim their work, which provides an incentive to use arXiv instead, and (2) it is very hard to anonymize good code, so links to code are mostly removed in anonymous submissions.
If we can speed up the turnaround of blind review from a couple of months to a few days or a week, I think the last legitimate incentives to bypass review are removed.
This is where my proposal gets a bit speculative and could use some more development by others in the community, but I think the steps above already get us close to something that could be used to significantly speed up turnaround.
I’ll preface this idea with a simple anecdote. In the last few days two papers were posted on arXiv that described new reading comprehension datasets. This is exactly the area I am actively working in right now, so within hours of seeing them I had read them and formed initial opinions about them. I suspect I am not an outlier in this regard — if you don’t follow arXiv and read papers that look relevant to you, ask your students; I’d bet a majority of them do. After reading the papers, I wrote short threads about them on Twitter. I didn’t do this as thoroughly as I would have for an actual review, and because everyone’s names were public and some of the authors are my friends, I couldn’t comfortably say everything I thought (my positive comments did reflect my honest opinions). But if there had been a system in place where I could have seen the papers anonymously and given an anonymous review, I would have done it, and within a matter of days.
I think we can leverage people’s reading of paper feeds to get a reviewing system with a very fast turnaround.
If we have already followed the above steps, then we have a feed based on continuous OpenReview submissions, so the system is already in place to make this happen. We just need reviewers to take on papers themselves instead of getting assigned by an area chair, with oversight for conflicts of interest and someone making sure that all papers get sufficient review. Strong incentives for reviewers to pick up papers, like best reviewer awards in several different categories, would help the system move along quickly. If we have broad participation from the community, there should be a large pool of available area chairs and reviewers, so any one person being on vacation or swamped with other work shouldn’t be that big of an issue.
The last couple of years have shown pretty clearly the problems with our current (well-intentioned) approach to reconciling pre-print servers with blind review. I think there are better alternatives available, and simple steps that we can take to get us to a system that is more equitable for everyone, while still providing all of the benefits that pre-print servers give us.
My thanks go to the many people who influenced this post: those who engaged with me on Twitter, particularly Ryan Cotterell, who forced me to articulate my views more clearly; those at AKBC 2019 who humored me and convinced me that writing this up in long form was a good idea; and many colleagues at AI2 who read and gave feedback on an earlier version of this post, including almost the entire AllenNLP team. Noah Smith’s feedback in particular had the biggest influence on how this post ended up. The views expressed here should be taken solely as those of the author.