Conference Peer Review with the Semantic Scholar API
Semantic Scholar can help conference organizers find qualified and impartial peer reviewers for scientific papers.
Rodney Kinney is a Principal Software Engineer at AI2 and a founding team member of Semantic Scholar.
Global scientific output has been doubling roughly every nine years for the past several decades [1, 2]. This growth obviously presents a problem for researchers trying to keep pace with the volume of research being done in their field and adjacent fields. Less discussed is the burden on reviewers, who are charged with vetting an ever-increasing number of papers, often outside their area of specialization.
We can at least take comfort in the expectation that the number of qualified reviewers has grown together with the volume of submissions. But how to identify them? The volunteers responsible for assigning reviewers to submitted papers are unlikely to know the detailed expertise of each potential reviewer. As the pools of submissions and reviewers grow, the task of finding qualified and impartial reviewers for each submission becomes more difficult every year.
Fortunately, recent advances in natural language understanding offer hope. By using a topic model to compare a reviewer’s publication history with the text of a submitted paper, researchers have been able to identify qualified reviewers for that paper with good accuracy. One obstacle to putting this into practice, however, is obtaining that publication history in the first place. A second is the conflict-of-interest problem: reviewers should not be judging papers by individuals with whom they have a personal or professional relationship.
A variety of software tools help conference, area, and track chairs manage the peer review process. These include EasyChair, CMT, OpenConf, and OpenReview. Some employ sophisticated natural-language analysis and matching algorithms, but bootstrapping these systems with the necessary data on each potential reviewer is a considerable challenge. Some systems integrate with the Toronto Paper Matching System (TPMS). Participants register with TPMS and manually upload a sample of their publications, creating a profile that can be reused in the future. However, the manual process is cumbersome for authors; many choose not to participate, and the profiles are likely to drift out of date over time.
Semantic Scholar was launched by the Allen Institute for AI in 2015 with a mission to help scholars overcome information overload and locate the most relevant research. Through our agreements with all major scientific publishers, we have a comprehensive corpus of scientific publications, which we expose to users through our website, public APIs, and public datasets, while also publishing cutting-edge research. Our corpus includes automatically generated author pages that are kept continuously up to date and are backed by state-of-the-art models.
For its 2020 conference, the Association for Computational Linguistics (ACL) partnered with Semantic Scholar to partially automate its peer review process. The organizers used Semantic Scholar data for conflict-of-interest detection and reviewer-match score calculation. Registrants were asked to find themselves through a search on Semantic Scholar and to note the Author ID that appears in the URL of their author page. This is a very lightweight process and can be done without creating an account on the website. However, creating an account allows an author to manually correct any errors that the automated models may have made, and enables a range of useful features such as automated research feeds and new research alerts.
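As a small illustration of the registration step, the sketch below pulls the numeric Author ID out of an author-page URL. The URL shape assumed here (semanticscholar.org/author/<name>/<id>) and the helper name author_id_from_url are assumptions made for this example, not part of any official client.

```python
import re
from typing import Optional

# Assumed URL shape: https://www.semanticscholar.org/author/<name-slug>/<numeric-id>
# The name slug is treated as optional so a bare /author/<id> URL also works.
_AUTHOR_URL = re.compile(
    r"semanticscholar\.org/author/(?:[^/?#]+/)?(\d+)(?:[/?#]|$)"
)


def author_id_from_url(url: str) -> Optional[str]:
    """Return the numeric Author ID found in an author-page URL, or None."""
    match = _AUTHOR_URL.search(url.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical URL used only for illustration.
    print(author_id_from_url("https://www.semanticscholar.org/author/Jane-Doe/1234567"))
```

Returning None for anything that does not match lets a registration form ask the user to paste the URL again rather than store a wrong ID.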
Using the Author IDs, the ACL organizers fetched each reviewer’s publication history from the Semantic Scholar Open Research Corpus. Each reviewer’s past co-authors were used to identify potential conflicts of interest, and a custom-trained language model was used to compute reviewer-match scores. The success of the 2020 partnership led to a repeat in 2021.
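The co-authorship check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code ACL used: each paper is represented simply as a list of Author IDs, the function names are hypothetical, and a reviewer is flagged if they are an author of the submission or have ever co-authored with one of its authors.

```python
from typing import Dict, Iterable, List, Set


def coauthors_by_author(papers: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Map each Author ID to the set of IDs they have shared a paper with."""
    coauthors: Dict[str, Set[str]] = {}
    for author_ids in papers:
        for author in author_ids:
            others = (a for a in author_ids if a != author)
            coauthors.setdefault(author, set()).update(others)
    return coauthors


def has_conflict(
    reviewer_id: str,
    submission_author_ids: Iterable[str],
    coauthors: Dict[str, Set[str]],
) -> bool:
    """True if the reviewer wrote the submission or co-authored with its authors."""
    submission_authors = set(submission_author_ids)
    if reviewer_id in submission_authors:
        return True
    return bool(coauthors.get(reviewer_id, set()) & submission_authors)


if __name__ == "__main__":
    # Toy publication history: each inner list is one paper's Author IDs.
    history = [["101", "102"], ["102", "103"], ["104"]]
    graph = coauthors_by_author(history)
    print(has_conflict("101", ["102"], graph))  # reviewer 101 co-authored with 102
    print(has_conflict("101", ["103"], graph))  # no shared paper with 103
```

A production system would likely add further signals, such as a recency window on co-authorship or shared affiliations; those are outside what this sketch covers.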
We are excited to release the first version of our new peer review service as part of the Semantic Scholar API. Offering it through the API gives us the most flexibility to integrate with the wide variety of tools and processes used by the community.
This new service greatly simplifies the process used by ACL, making it unnecessary to train a language model or even to download any data from Semantic Scholar’s public corpus. Instead, organizers upload information about potential reviewers and submissions: the title and abstract of each submitted paper, and the Semantic Scholar Author IDs of reviewers and, where available, of submission authors. The API then computes conflict-of-interest and matching scores for all reviewer-submission pairs. Conflict of interest is based on co-authorship, and reviewer match scores are computed using SPECTER [8, 9], our state-of-the-art language model for scientific publications. All uploaded data is kept private, cannot be read by any other users of the API, and can be permanently deleted via the API when no longer needed.
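To give a feel for embedding-based matching, here is a minimal sketch that scores reviewers against a submission by cosine similarity between document vectors. It does not call the Semantic Scholar API or SPECTER: the short toy vectors stand in for real document embeddings, the function names are hypothetical, and taking the maximum similarity over a reviewer’s papers is an assumption made for this example rather than a description of how the service aggregates scores.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_score(
    submission: Sequence[float], reviewer_papers: List[Sequence[float]]
) -> float:
    """Assumed aggregation: best similarity between the submission and any paper."""
    return max((cosine(submission, p) for p in reviewer_papers), default=0.0)


def rank_reviewers(
    submission: Sequence[float], reviewers: Dict[str, List[Sequence[float]]]
) -> List[Tuple[str, float]]:
    """Return (reviewer_id, score) pairs sorted from best to worst match."""
    scores = {rid: match_score(submission, papers) for rid, papers in reviewers.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-d vectors standing in for document embeddings.
    submission_vec = [1.0, 0.0]
    reviewer_vecs = {
        "r1": [[1.0, 0.0], [0.0, 1.0]],
        "r2": [[0.0, 1.0]],
        "r3": [],  # no known publications
    }
    for reviewer_id, score in rank_reviewers(submission_vec, reviewer_vecs):
        print(reviewer_id, round(score, 3))
```

Other aggregations, such as averaging over a reviewer’s most similar papers, are equally plausible; the right choice depends on whether breadth or a single strong match matters more to the organizers.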
Looking ahead, we are collaborating on tighter integrations with conference-management software providers and on partnerships with other conferences. We have achieved baseline accuracy on public reviewer-matching datasets [10, 11], and expect accuracy to improve further as we roll out updates to our author disambiguation models. We are also investigating using a full-corpus search to make reviewer recommendations from scratch.
Are you interested in running a trial of our new service? Browse the API documentation or test out the command-line client, then contact our Partner team for a private account. We hope that this service can help conference organizers across all fields keep up with the accelerating pace of scientific discovery, and we’re looking for partners to help shape the future of peer review.