Is the work of Nobel Prize laureates cited differently after they receive the prize?

Published in scite · 9 min read · Jul 14, 2021

Using publication records collected by Li et al. (2018) for almost all Nobel laureates in physiology or medicine from 1900 to 2016, together with scite data on more than 1.8 million citation statements to these nearly 30,000 articles, I analyze how the body of work of winners of the Nobel Prize in Medicine is cited before versus after they receive the prize. Each citation statement is classified as supporting, contrasting, or mentioning a particular claim or result of the cited publication.

Introduction

The Nobel Prize, one of the most renowned awards for scientific accomplishments, has itself been the focus of study by many researchers. Does receiving a Nobel Prize change how a laureate’s work is cited by others, and how does it alter the future path of their career? Researchers have attempted to answer these questions mostly by focusing on various citation metrics for the papers for which laureates were awarded the prize.

Last year, in April 2020, Li and colleagues published a great paper titled “Scientific elite revisited: patterns of productivity, collaboration, authorship and impact”. Building on the 1977 qualitative work of Harriet Zuckerman, whose “Scientific elite: Nobel Laureates in the United States” they allude to in their title, Li and colleagues constructed near-exhaustive records not only for the prize-winning papers, but for all the work that Nobel laureates have published over their full careers.

With this dataset of Nobel Prize-winners’ publication records, Li et al. (2020) show that Nobel laureates demonstrate high productivity and high impact early in their careers, before winning the prize, aligning with previous research suggesting that “great minds do their critical work early in their careers”. Early in their careers, future laureates, across all disciplines, publish more papers more frequently than their colleagues who do not go on to win the Nobel Prize. Additionally, their papers are 6 times more likely to be hit papers, defined as papers in the top 1% of citations.

Furthermore, Li et al. (2020) demonstrate that there exists a so-called “Nobel dip” in which papers published in the 2 years after having received the Nobel Prize actually have lower impact than papers published in the 2 years before the prize, but that impact later rebounds to typical levels.

Beyond citation counts

In nearly all bibliometric research, impact is calculated based on the number of citations. But a raw count misses the purpose of a citation: within each paper that cites another paper, there can be several distinct citation statements to that paper. For clarity, from here on I will refer to the paper that is cited as the cited paper, and to the paper that cites it as the citing paper.

Within each citing paper, a cited paper can be referenced in different ways in different sections of the citing paper. For example, this paper by Evans and Kaufman (1981), for which Evans was awarded the 2007 Nobel Prize in Medicine, was cited in two different ways by this citing paper by Li, Zhang, Hou, et al. (2003):

scite analyzes the full text of scientific articles to extract all citation statements, and then classifies each citation statement as *supporting*, *contrasting*, or *mentioning*. This combination of citation statements and their classifications is called *Smart Citations*.

How are citation statements classified?

By analyzing the number and types of citation statements, we can discover much more useful information about the quality and the impact of scientific articles, and how these may change after an author receives a Nobel Prize.

Data

Using Smart Citation data from scite, combined with the Nobel Prize-winner publication records dataset from Li et al. 2020, I have shared a dataset of all citation statements to all publications contained in these publication records. By passing all DOIs from the Li et al. 2020 dataset as target papers, scite returns all citation statements that reference these target papers, along with the DOI of the source paper and the classification of the citation statement (for legal reasons, we cannot publish the text of the citation statements). For the 29,118 publications (with DOIs) by laureates of the Nobel Prize in Medicine, there are 1,859,852 citation statements.
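As a rough illustration of how such a dataset might be summarized, the sketch below tallies citation statements by classification for each cited paper. The record layout (source DOI, target DOI, classification) follows the description above, but the tuple format, the function name, and the DOIs are assumptions made up for this example and are not taken from the shared dataset.

```python
from collections import Counter, defaultdict

# Hypothetical records: (source_doi, target_doi, classification).
# The DOIs are invented for illustration only.
statements = [
    ("10.1000/citing.1", "10.1000/cited.a", "mentioning"),
    ("10.1000/citing.1", "10.1000/cited.a", "supporting"),
    ("10.1000/citing.2", "10.1000/cited.a", "contrasting"),
    ("10.1000/citing.2", "10.1000/cited.b", "mentioning"),
]


def tally_by_target(records):
    """Count citation statements of each classification per cited (target) DOI."""
    tallies = defaultdict(Counter)
    for _source_doi, target_doi, classification in records:
        tallies[target_doi][classification] += 1
    return tallies


tallies = tally_by_target(statements)
for target_doi, counts in sorted(tallies.items()):
    print(target_doi, dict(counts))
```

Note that one citing paper can contribute several statements to the same cited paper, which is why the tally is kept per statement rather than per citing paper.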

This data will allow future research beyond traditional impact measurements, and allow us to see how Nobel laureates are cited. Is their work published after receiving the Nobel Prize more likely to be supported or contrasted?

Analysis: Nobel Prize in Medicine

I define a running variable, Years Since Prize, which is equal to the publication year of the cited paper minus the year that the author of the cited paper received the Nobel Prize. This makes possible aggregation of the data, and an easy way to look at various data over the 10 years before and the 10 years after receiving the Nobel Prize.

There is some measurement error here, as the Nobel Prize is awarded in December of each year (and announced in October), but there is only year-level data. This means that for Years Since Prize = 0, we cannot systematically identify whether a publication appeared before or after the announcement of the prize. There are, however, only 2–3 months (October through December) in which a citing paper could reference a cited paper written by a Nobel laureate of that same year.
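The running variable defined above is simple arithmetic on year-level data. A minimal sketch, with function names chosen for this example only:

```python
def years_since_prize(publication_year, prize_year):
    """Publication year of the cited paper minus the author's prize year."""
    return publication_year - prize_year


def in_window(publication_year, prize_year, half_width=10):
    """True if the cited paper falls within the +/- half_width year window."""
    return abs(years_since_prize(publication_year, prize_year)) <= half_width


# Evans and Kaufman (1981); Evans received the prize in 2007.
evans_offset = years_since_prize(1981, 2007)
print(evans_offset, in_window(1981, 2007))

# A paper published in the prize year has offset 0; with year-level data
# we cannot tell whether it appeared before or after the announcement.
print(years_since_prize(2007, 2007), in_window(2007, 2007))
```

The Evans and Kaufman paper mentioned earlier sits 26 years before the prize, so it falls outside the 20-year window analyzed here.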

Grouping by Years Since Prize and by Prize Year, I first look at how the scite index, which is equal to the number of supporting citation statements divided by the sum of supporting and contrasting citation statements, changes in the 20-year interval around the prize announcement. Prize years for which the data is missing are dropped from the graphs.

Below are graphs of the two-year scite index, or SI. The two-year SI is calculated, using data from the current and the previous year, as (number of supporting citation statements) / (number of supporting + contrasting citation statements). It represents, for papers published in a given two-year period, the level of support that those papers have received from later research.

The first thing to notice here is that there is no clear trend common to all of these prize years. For some, after the announcement of the prize, the two-year scite index seems to drop quickly, meaning that papers published in that two-year period received a larger share of contrasting citation statements, relative to supporting ones, than papers published earlier. For example, for the work of the authors who received the 2001 Nobel Prize, the two-year SI falls noticeably for work published in 2001 and 2002, but then seems to return to the previous trend.

For other prize years, the opposite seems true; but for nearly all prize years, there is no obvious or drastic visual difference that is clearly outside of the general trend in SI.
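The two-year SI used in these graphs can be sketched as follows. This is one reading of the definition above: it pools the supporting and contrasting counts of the current and previous year before dividing, rather than averaging two yearly ratios. The pooling choice, the function name, and the counts are assumptions for illustration, not the exact computation behind the graphs.

```python
def two_year_si(counts, year):
    """Two-year scite index for `year`, pooling `year` and `year - 1`.

    `counts` maps Years Since Prize -> (supporting, contrasting).
    Returns None when there are no supporting or contrasting statements.
    """
    supporting = contrasting = 0
    for y in (year - 1, year):
        s, c = counts.get(y, (0, 0))
        supporting += s
        contrasting += c
    total = supporting + contrasting
    return supporting / total if total else None


# Made-up counts keyed by Years Since Prize.
counts = {-1: (90, 10), 0: (80, 20), 1: (45, 5)}

si_at_prize = two_year_si(counts, 0)    # (90 + 80) / (90 + 10 + 80 + 20)
si_year_after = two_year_si(counts, 1)  # (80 + 45) / (80 + 20 + 45 + 5)
print(si_at_prize, si_year_after)
```

Returning None for an empty window mirrors the choice in the text to drop prize years with missing data instead of plotting a zero.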

Evolution of the number of citation statements by type

By aggregating all the prize years together, we can get a better idea of how citation statements to the work of Nobel laureates evolve over time. The scite index is limited in that it shows only a ratio, not the number of citation statements; so now, let’s first take a look at the total numbers of supporting, contrasting, or mentioning citation statements. Notice the very different y axes for the following graphs: mentioning citation statements are far, far more common than either supporting or contrasting.

Since the x axis represents the relative publication date of the cited paper, the number of citation statements of every type decreases as we move to the right: these are the newest papers, which have had less time to accumulate citations.

Notice the upwards spike in all types of citation statements for those papers published after the Nobel Prize is awarded. When Nobel Prize-winners publish in the year following the prize, other researchers cite their work more — both supporting and contrasting evidence are higher in the year after the prize than in the 3 preceding years.

Evolution of the number of citation statements per paper by type

Another interesting metric is the prevalence of citation statements in each paper. We know from past research (and intuition) that the total number of citing papers increases over time for a given cited paper. From the graphs above, we have seen that all types of citation statements increase in the year after the Nobel Prize is awarded. However, any citing paper can contain multiple citation statements of any type. For example, the average citing paper may cite a paper published before its author wins the Nobel by mentioning it twice; for cited papers published after the Nobel Prize is given, perhaps the average citing paper provides 2 mentioning citation statements and 1 supporting citation statement. Let’s have a look at citation statements per paper. Again, notice the very different y axes:

Supporting citation statements per citing paper are relatively rare: on average, there is only about 1 supporting citation statement per 10–12 citing papers (in other words, about 0.08 to 0.10 supporting citation statements per paper). This prevalence of supporting citation statements shows an upward trend; after the Nobel is awarded, there is a sharp drop in the number of supporting statements per citing paper, but the trend then continues upwards.

The prevalence of contrasting citation statements is much lower than that of supporting ones, never exceeding about 1 contrasting citation statement per 80 citing papers. It follows a trend similar to that of supporting statements: a sharp decrease in the year after the authors of the cited paper receive the Nobel Prize, and then a continuation of the upward trend.

The number of mentioning citation statements per citing paper seems to remain relatively stable over time, but note how in the year that the prize is awarded, it reaches a local maximum compared to the preceding and following 3–4 years. For some reason, in the fifth year there is a large increase in mentioning citation statements per citing paper.
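The per-citing-paper rates discussed in this section can be sketched as below. The sketch assumes the denominator is the number of distinct citing papers that reference cited papers in a given Years Since Prize bucket; that choice, the record layout, and the identifiers are assumptions for illustration. The toy data mirrors the example above: two mentions before the prize, two mentions plus one supporting statement after.

```python
from collections import Counter, defaultdict

# Hypothetical records: (citing_doi, years_since_prize, classification).
records = [
    ("10.1000/citing.1", -1, "mentioning"),
    ("10.1000/citing.1", -1, "mentioning"),
    ("10.1000/citing.2", 1, "mentioning"),
    ("10.1000/citing.2", 1, "mentioning"),
    ("10.1000/citing.2", 1, "supporting"),
]


def statements_per_citing_paper(rows):
    """Per bucket: statements of each type divided by distinct citing papers."""
    counts = defaultdict(Counter)
    citing_papers = defaultdict(set)
    for citing_doi, offset, classification in rows:
        counts[offset][classification] += 1
        citing_papers[offset].add(citing_doi)
    return {
        offset: {
            label: n / len(citing_papers[offset])
            for label, n in counts[offset].items()
        }
        for offset in counts
    }


rates = statements_per_citing_paper(records)
print(rates)
```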

Two Year SI overall

The Two Year SI, when aggregated for all prize years, remains quite stable, never surpassing 0.925 and never falling below 0.880. The average Two Year SI in the 10 years before the prize is 0.907, whereas in the 10 years after the prize, the average is 0.904. The difference is small, but this 20-year span covers a total of 55,000 supporting and contrasting citation statements, so even small differences in SI may be meaningful.

Conclusion

Li et al. (2018) have introduced a nearly exhaustive dataset of the full publication records of Nobel laureates. By using this data alongside scite’s Smart Citation data, we can gain a deeper, more comprehensive understanding not only of the number of times that Nobel laureates’ work is cited, but of precisely how it is cited by other researchers. Are they supporting, contrasting, or only mentioning a Nobel laureate’s work, and how does this evolve over time?

Here, I have explored only data for the Nobel Prize in Medicine, but this type of analysis can be expanded to other fields as well, such as chemistry and physics. As citation practices vary greatly across fields, the number of citation statements and their evolution may also vary considerably.

Below is one graph that represents, by field, the share of all citing papers that have at least one supporting citation statement to Nobel Prize winners’ work. From this graph, we can see clearly how some fields such as physics may always have fewer supporting papers than a field like medicine: whether this is from the nature of the field, citation practices, or any other reasons, it is important to keep in mind these differences when trying to compare different disciplines.
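The share plotted in this graph can be sketched as follows: for each field, the number of distinct citing papers with at least one supporting citation statement, divided by the number of distinct citing papers. The record layout, the function name, and the values are made up for illustration and do not reproduce the actual figures.

```python
from collections import defaultdict

# Hypothetical records: (field, citing_doi, classification).
records = [
    ("medicine", "10.1000/citing.1", "supporting"),
    ("medicine", "10.1000/citing.1", "mentioning"),
    ("medicine", "10.1000/citing.2", "mentioning"),
    ("physics", "10.1000/citing.3", "mentioning"),
    ("physics", "10.1000/citing.4", "mentioning"),
    ("physics", "10.1000/citing.5", "supporting"),
    ("physics", "10.1000/citing.6", "contrasting"),
]


def supporting_share_by_field(rows):
    """Share of citing papers, per field, with at least one supporting statement."""
    all_citing = defaultdict(set)
    supporting_citing = defaultdict(set)
    for field, citing_doi, classification in rows:
        all_citing[field].add(citing_doi)
        if classification == "supporting":
            supporting_citing[field].add(citing_doi)
    return {
        field: len(supporting_citing[field]) / len(papers)
        for field, papers in all_citing.items()
    }


shares = supporting_share_by_field(records)
print(shares)
```

Counting distinct citing papers rather than statements keeps a single paper with many supporting statements from inflating a field's share.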