On Citation Fundamentals

Definition, roles, and traits of citation

Sanghyun Baek
Pluto Labs
Jan 30, 2019

[Pluto Series] #0 — Academia, Structurally Fxxked Up
[Pluto Series] #1 — Research, the Knowledge Creating Industry
[Pluto Series] #2 — Academia, Publishing, and Scholarly Communication
[Pluto Series] #3 — Publish, but really Perish?
[Pluto Series] #4 — Publish or Perish, and Lost in Vain
[Pluto Series] #5 — On Where they Publish
[Pluto Series] #6 — On Number of Publications
[Pluto Series] #7 — On Citation Fundamentals
[Pluto Series] #8 — On Citing Practices
[Pluto Series] #9 — On Tracking Citations
[Pluto Series] #10 — On Peer Reviews
[Pluto Series] #11 — Ending the Series

In the previous posts, we discussed the caveats and concerns about evaluating researchers by their number of publications and by where they publish. The points raised were that these measures are too simplistic to represent the complex and diverse values of scientific knowledge, that the way they are counted (i.e. their dependency on the index used) is questionable, and that they may lead to undesired, unethical consequences.

Quite similar points can be made about the topic of this post, citations: they also depend on the index used, cannot capture the complex values of knowledge, and may lead to undesired effects. However, a series of posts including this one will treat the topic in more depth, for several reasons. First of all, citation has been used for decades as a sort of “ultimate answer” to measuring the impact of scientific publications in relevant fields such as library science and similar disciplines. The most significant reason for that, I suspect, is that society does not yet know of a better alternative, not to mention that citations are easy to measure.

Secondly, citation sets the foundation for the topics discussed previously. Metrics such as impact factors, h-indexes, or their variants are all derived from the analysis of citation networks. While the discourse around the number of publications by an individual researcher can be straightforward, that around citation may lead to more insights into other metrics. Last but not least, citation plays essential roles in the social system of science, which will be discussed in this post.

The series of posts on citation will be split into three parts. This first one focuses on the basics: the definitions, roles, and some traits of citations. The second will discuss how citations should be practiced, specifically from the perspective of “citing publications”. The third will deal with considerations around citation as an evaluation criterion (i.e. the cited-publications side). The latter two posts are separated in this manner because citation is a bilateral relation: a publication cites another publication.

“Wikipedian protestor” asking for citation, source: XKCD

What is a Citation

According to Plagiarism.org, a citation is,

the way you tell your readers that certain material in your work came from another source. It also gives your readers the information necessary to find that source again

Citationmachine.net describes it in a very similar way as,

how you let your readers know that you used information from outside sources in your work. It also describes those sources, and provides information that allows the reader to track them down

Or from Wikipedia, a (scientific) citation is

providing detailed reference in a scientific publication (…) to previous published (or occasionally private) communications which have a bearing on the subject of the new publication

while it describes a (bibliographic) reference as

a piece of information provided in a footnote or bibliography of a written work (…) specifying the written work of another person used in the creation of that text

Summing these up, I would describe a citation as,

a piece of information given within a written work A, specifying the details of another written work B, denoting that some information from B is used in A

In such a case, we would say that “A cites B” or that “B is cited by A”. At least in these three posts, I will call A the “citing publication” and B the “cited publication” in this relation. For clearer communication, given a publication, let us call the citations from it to its references “outlinks” (i.e. the citations it generates as a citing publication), and the citations it receives from others “inlinks” (i.e. the citations it receives as a cited publication).
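To make the vocabulary concrete, here is a minimal sketch in Python. The publication names and the list of pairs are made up for illustration, and the function names `outlinks` and `inlinks` simply mirror the terms introduced above; this is not how any particular citation index stores its data.

```python
# Hypothetical citation pairs: (citing publication, cited publication).
# ("A", "B") reads as "A cites B".
citations = [
    ("A", "B"),
    ("A", "C"),
    ("D", "B"),
]

def outlinks(pub, pairs):
    """Publications that `pub` cites, i.e. `pub` is the citing publication."""
    return sorted(cited for citing, cited in pairs if citing == pub)

def inlinks(pub, pairs):
    """Publications that cite `pub`, i.e. `pub` is the cited publication."""
    return sorted(citing for citing, cited in pairs if cited == pub)

print(outlinks("A", citations))  # ['B', 'C']
print(inlinks("B", citations))   # ['A', 'D']
```

The same pair ("A", "B") shows up as an outlink of A and as an inlink of B, which is the two-sided view used throughout these posts.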

Wikipedia also explains that “[m]ore precisely, a citation is an abbreviated alphanumeric expression embedded in the body of an intellectual work that denotes an entry in the bibliographic references section of the work for the purpose of acknowledging the relevance of the works of others to the topic of discussion at the spot where the citation appears. Generally, the combination of both the in-body citation and the bibliographic entry constitutes what is commonly thought of as a citation (whereas bibliographic entries by themselves are not).” However, for the purpose of this series, let’s just think of citation as the relation between the citing and cited publications. When referring to those ideas from Wikipedia’s “precise” explanation, I will explicitly write “in-text citations” and “bibliographic references” respectively.

There is much more information available about citation, such as its history and origin, the different styles used by various journals, and so forth. As this series is about how citation is currently practiced and what we can do better about it, let’s get directly to what it’s supposed to do.

Why do we Cite?

The definitions pretty much explain the roles of citations. Among them, I notice two specific words: “source” and “find” (or “track”). The Wikipedia page for (scientific) citation gives more clues, saying “[t]he purpose of citations in original work is to allow readers of the paper to refer to cited work to assist them in judging the new work, source background information vital for future development, and acknowledge the contributions of earlier workers.”

Putting these together, three essential roles can be listed.

  • Giving credit to the original sources of information
  • Providing evidence upon which the work can be scrutinized
  • Building “search tools” by setting paths to more information

The first point, giving credit to original works, to some extent shares the core concept of this blog series. The citation itself gives credit by attributing information to its source. The current academic system further uses these citations as a proxy for the impact of individual publications. Discussions around this point will be addressed in the third part of this topic. Reflecting these functions of citations as academic credit and impact proxy, literature in diverse disciplines has described citation as “the currency of science.”

The second point, providing evidence for scrutiny, is in line with one of the Mertonian Norms, “Organized Skepticism”. In a scientific work, no information should be accepted simply as it is provided, be it logical arguments or hypotheses, setups and protocols of experiments, design of study, or any other aspect of the research underlying the publication. The community of experts will, and thus any reader should, critically investigate the evidence and validate it accordingly before taking the claims as they are. As such, giving evidence with citations is one of the most essential features of the scholarly communication system. This point has much in common with “peer reviews”, which will have its own post later.

The last, the search tool perspective of citations*, is often described with “citation indexes” or “citation databases”. The direct path from a publication of interest to an ancestral “classic” of the field, built by following the citations between them (i.e. citations always lead to the past), touches many invaluable publications that might be relevant to the reader. This is obviously a practical method when academics search for relevant works. On a macro scale, citations and the publications at their endpoints, when aggregated, form a great network (i.e. citation network, citation graph, citation index, etc.). This network sets the foundation for many search engines used by academics nowadays, and the recommendation systems in those services are more often than not powered by analyzing this network.
(* Lipetz described in his 1965 work that embracing more context in citations would lead to a “more powerful searching tool”.)
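The idea of following citations back toward a “classic” can be sketched as a plain graph traversal. The small network below is entirely hypothetical, and `path_to` and `reachable` are illustrative helper names, not functions of any real search engine. Breadth-first search is only one simple way to walk such a network.

```python
from collections import deque

# Hypothetical citation network: each publication maps to its outlinks,
# i.e. the publications it cites. All names are made up for illustration.
cites = {
    "new_paper": ["survey_2015", "method_2010"],
    "survey_2015": ["method_2010", "classic_1965"],
    "method_2010": ["classic_1965"],
    "classic_1965": [],
}

def path_to(start, target, network):
    """Breadth-first search along citations; returns a shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for cited in network.get(path[-1], []):
            if cited not in seen:
                seen.add(cited)
                queue.append(path + [cited])
    return None

def reachable(start, network):
    """Every publication that can be reached from `start` by following citations."""
    seen = set()
    stack = [start]
    while stack:
        for cited in network.get(stack.pop(), []):
            if cited not in seen:
                seen.add(cited)
                stack.append(cited)
    return seen

print(path_to("new_paper", "classic_1965", cites))
print(sorted(reachable("new_paper", cites)))
```

Because citations only point to the past, searching in the opposite direction (from the classic to the new paper) finds nothing unless the network is inverted first, which is roughly what a citation index does when it reports inlinks.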

Occasionally, more roles beyond these are suggested for citations: that they i) prove that the author has investigated and comprehended the references, ii) prevent potential plagiarism by authors, iii) build readers’ trust in the publication, and so forth. I wouldn’t include these as essential roles, not only because they’re out of the scope of this series, but also because they’re more or less implicit in the three essential roles or are results of them.

Some of the literature studies the categorization of citation roles, or their “manifest roles.” Peritz (1983), for instance, categorized citations from empirical studies in social science into 8 different roles, in terms of what they do “contextually”. These categorizations are indeed very important in citation studies. They were not discussed above among the essential roles since they focus on the specific roles of individual citations within their context (i.e. in-text citations). This contextual aspect of citations will be discussed in the second post of this topic.

The latter two posts of this topic, on how citations should be practiced and on what should be considered when they are used for evaluation, will be built around these essential roles of citations. When authors cite their references, or editorial boards set their policies, they should conform to these three essential roles. When citation counts are used as evaluative metrics, these roles should be considered accordingly. Besides the essential roles of citations, the following traits should be taken into account as well.

Traits of Citations

Some of these may be inherent to what citation is, but others may be due to how citations are practiced under current scholarly communication.

Citation is bilateral

A citation always has two endpoints. On one end there’s the “citing publication.” On the other is the “cited publication.” As such, many aspects of citations can be understood from two different perspectives.

Citation is dynamic

By nature, citation always looks to the past. From the citing publication’s perspective, citations are deterministic. Once the manuscript goes through the publishing process, gets accepted, and is thus published, the citations from this publication to its bibliographic references (i.e. outlinks) are fixed at the time of its publishing.

On the other hand, the citations received by a specific publication (i.e. inlinks) are dynamic, both in terms of time and in terms of the index used to track them. Any publication would have ZERO inlinks at the time of publishing, with some exceptions where publications cite a manuscript still in review. After publishing, it will receive citations from future publications, because citations always look back to the past.

Another thing that makes received citations, or inlinks, dynamic is how we define the corpus of publications that count as “citing publications”. This corpus, which also defines the set of “cited publications”, is often referred to as a “citation index” or “citation database”. Depending on which citation index is used, the number of received citations can vary for the very same publication.
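Both kinds of dynamics, time and index coverage, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the records, the years, the two indexes `index_a` and `index_b`, and the rule that a citation is counted only when both of its endpoints are covered by the index.

```python
# Hypothetical citation records:
# (citing publication, cited publication, year the citing one was published).
citations = [
    ("P1", "X", 2016),
    ("P2", "X", 2017),
    ("P3", "X", 2018),
]

# Two hypothetical citation indexes that cover different sets of publications.
index_a = {"X", "P1", "P2", "P3"}
index_b = {"X", "P1"}

def inlink_count(pub, records, index, as_of_year):
    """Count citations received by `pub`, as seen by `index` up to `as_of_year`."""
    return sum(
        1
        for citing, cited, year in records
        # Assumption: an index only counts citations between publications it covers.
        if cited == pub and citing in index and cited in index and year <= as_of_year
    )

print(inlink_count("X", citations, index_a, 2018))  # 3
print(inlink_count("X", citations, index_b, 2018))  # 1
print(inlink_count("X", citations, index_a, 2016))  # 1
```

The publication X is the same in all three calls. Only the index and the point in time change, and the count changes with them.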

Citation is rarely updated

As described for outlinks right above, citations, once published with their citing publication, are rarely changed. There are exceptional cases where the whole list of citations is deleted because the citing publication is retracted from the journal.

Citation is a simple link

This is obviously not inherent to citation itself, but is due to how it is currently practiced. Currently a citation is merely a simple, ordered pair of the citing publication and the cited publication, no more, no less. That is, citations do not carry any information beyond what is required to uniquely identify the two endpoint publications. Specifically, a citation has no information that describes the relation itself. The most we can learn about a citation often lies in the in-text citations, for which we still need i) access* to the full text of the publication, ii) contextual understanding of the study in most cases, and possibly iii) a lot of human intervention to codify** them.
(* Removing the requirement for access to the original publication was described as “separable” citations, and codified citations as “structured”, by the Initiative for Open Citations, or I4OC.)
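The difference between a citation as a bare ordered pair and a citation that also describes the relation can be sketched as two small data structures. The class names and the `role` and `context` fields are hypothetical, chosen only to illustrate the point. They are not the I4OC data format or any existing standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Citation:
    """A citation as currently practiced: just an ordered pair of identifiers."""
    citing: str  # identifier of the citing publication
    cited: str   # identifier of the cited publication

@dataclass(frozen=True)
class AnnotatedCitation(Citation):
    """Hypothetical richer record that also describes the relation itself."""
    role: Optional[str] = None     # e.g. what the citation does in its context
    context: Optional[str] = None  # e.g. the sentence where the in-text citation appears

simple = Citation("A", "B")
annotated = AnnotatedCitation(
    "A", "B",
    role="uses method",                       # made-up label for illustration
    context="We follow the protocol of B.",   # made-up sentence for illustration
)

print(simple)
print(annotated)
```

Filling in fields like `role` and `context` is exactly where full-text access, contextual understanding, and human codification come in, which is why the simple pair is what most citation data looks like today.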

Citation spans multiple players

When we speak of citation and its practices, the discourse extends to various players in the academic ecosystem.

  • Authors when they write and submit their manuscript
  • Publishers, journals, and their editors in setting their policy about citations
  • Indexes when they aggregate these citation data
  • As with any evaluative metric, funding agencies and institutions when they evaluate academics with citation analysis

Blaise Cronin (1984) categorized four main stakeholders of citations, noting that they need to better understand what citation is under “the commercialization of the citation”:

  • those who generate (authors),
  • those who use (other researchers),
  • those who process and package (info. industry), and
  • those who mediate and deliver (librarians and info. sci.)

Citation can’t be automated

Not all aspects of citation practices can be explicitly expressed and thus practiced as such. They encompass a lot of contextual and implicit judgements, and in many cases conformity to those contextual and implicit norms is sustained by the scrutiny of the community, i.e. peer review. There may even be debatable points, and different disciplines may have different norms. If all these dynamics and traits of citations could be explicitly expressed, and conformity to them checked without social scrutiny, then science could possibly be conducted by machines, without human intervention.

How to Cite BETTER?

With these definitions, roles, and traits of citations in mind, the upcoming post will discuss how citations could be practiced better. The discussion will focus specifically on the “outlinks” perspective. In other words, some potential improvements will be addressed regarding the way citations are generated by citing publications. As always, please CLAP, SHARE, & COMMENT on the story for more discussions and ideas.

Pluto Network
Homepage / Github / Facebook / Twitter / Telegram / Medium
Scinapse: Academic search engine
Email: team@pluto.network
