Guide to Litmaps Visualisations

Hamish
Published in Litmaps
Jun 22, 2021

We’ve added some new tools to Litmaps for visualising scientific literature.

Brief video covering all our visualisation options.

Here are the highlights: You can now sort nodes by citation count or log citation count on the y axis, and use either or both axes to group papers according to how similar their subject matter is.

The rest of this post discusses a few map settings that are useful for different use cases, followed by a detailed description of each visualisation setting.

Use Cases

To whet your appetite, here are a few ways you can use the map settings to suit different scenarios.

Compact Mode

If you want to understand the chronology of papers building on one another, you will want to set both the x-axis and y-axis to “compact” mode.

This will arrange the papers to make the most of the available space to clearly show the order in which papers were published and the references between papers in your map.

This is the default mode of Litmaps.

Title Similarity

If you want to understand the different camps of research which exist within your map, you will want to set the x-axis and y-axis to “title similarity”.

Papers whose titles are close in semantic content will appear close together in the map. This lets you quickly sort the content of your map into meaningful categories.

Citations vs Time

To understand the growth in publications over time, the distribution of citation counts, and trends in citation growth over time, you will want to set the x-axis to “publication date” and the y-axis to “citations” or “citations (log)”.

This will also help you to notice superstar papers with high citation counts.

Deep Dive into all our options

The remainder of this blog post gives details for all the map settings.

X-Axis Settings

The first setting controls how papers are arranged along the x-axis.

Compact

Papers are evenly spaced from left to right, ordered by publication date. When one paper cites another, the citing paper is the more recent of the two, so it will appear on the right and the cited paper on the left.

Both the x and y axes have a compact mode and in both cases it is the default mode.

You will want to use this mode if:

  • You want a clear visualisation which uses the space available.
  • You are interested in the order in which papers were published, but not the timing.
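
As a rough illustration of compact x positions, the sketch below sorts papers by publication date and spaces them evenly across the available width. The `Paper` class, the `compact_x` function and the width of 1000 are hypothetical names and values chosen for this example, not Litmaps internals.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Paper:
    title: str
    published: date


def compact_x(papers, width=1000.0):
    """Return {title: x} with papers evenly spaced in publication order."""
    ordered = sorted(papers, key=lambda p: p.published)
    if len(ordered) == 1:
        return {ordered[0].title: width / 2}
    step = width / (len(ordered) - 1)
    return {p.title: i * step for i, p in enumerate(ordered)}


papers = [
    Paper("C", date(2020, 6, 1)),
    Paper("A", date(1995, 1, 1)),
    Paper("B", date(2019, 12, 1)),
]
positions = compact_x(papers)
print(positions)  # {'A': 0.0, 'B': 500.0, 'C': 1000.0}
```

Note that the 24 year gap between A and B takes up the same space as the 6 month gap between B and C: the order is preserved, but the timing is not.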

Title Similarity

In title similarity mode, papers with titles on similar topics will be spatially grouped together. The distance between two papers will roughly correspond to how different the papers’ subject matters are. You should thus be able to quickly pick out the different disciplines in your map based on which papers are clustered together.

For those interested in the technical details: semantic embeddings are computed for each paper’s title using Allen AI’s SPECTER model (Cohan et al., 2020). These 768-dimensional vectors are then reduced to 1 or 2 dimensions using UMAP. If both the x and y axes are in “title similarity” mode, then the 2-dimensional vectors are used. If only one axis is in this mode, then the 1-dimensional vectors are used.

There may be a delay in computing the SPECTER embeddings. While these are being computed, the visualisation will default to compact mode.

You will want to use this mode if:

  • You want to divide the contents of your map into meaningful categories.
  • You want to understand which papers are similar to each other.
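
The choice between the 1 and 2 dimensional projections, and the fallback to compact mode, could be expressed roughly as follows. This is only a sketch of the selection logic described above: the function name, the mode strings and the `embeddings_ready` flag are assumptions, and the SPECTER and UMAP steps themselves are not shown.

```python
SIMILARITY = "title similarity"


def layout_plan(x_mode, y_mode, embeddings_ready):
    """Decide what each axis shows and how many UMAP dimensions are needed."""
    if not embeddings_ready:
        # While embeddings are being computed, similarity axes fall back to
        # compact mode. Leaving other axes untouched is an assumption.
        x_mode = "compact" if x_mode == SIMILARITY else x_mode
        y_mode = "compact" if y_mode == SIMILARITY else y_mode
    # Both axes in similarity mode -> 2 dimensions, one axis -> 1, none -> 0.
    umap_dims = [x_mode, y_mode].count(SIMILARITY)
    return {"x": x_mode, "y": y_mode, "umap_dims": umap_dims}


print(layout_plan(SIMILARITY, SIMILARITY, embeddings_ready=True))
print(layout_plan(SIMILARITY, "citations", embeddings_ready=True))
print(layout_plan(SIMILARITY, "citations", embeddings_ready=False))
```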

Publication Date

A linear arrangement of papers according to their publication date, with older papers on the left.

You will want to use this mode if:

  • You want to understand the chronological development of a discipline.
  • You want to see how the rate of publications has changed over time.
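
In contrast to compact mode, a linear date axis keeps the gaps between papers proportional to the time between them. A minimal sketch, assuming a hypothetical `date_x` helper and an arbitrary width of 1000:

```python
from datetime import date


def date_x(published, earliest, latest, width=1000.0):
    """Map a publication date linearly onto [0, width], oldest on the left."""
    span = (latest - earliest).days
    if span == 0:
        return width / 2  # all papers share one date, so centre them
    return (published - earliest).days / span * width


earliest, latest = date(2000, 1, 1), date(2020, 1, 1)
for d in (earliest, date(2010, 1, 1), latest):
    print(d.isoformat(), round(date_x(d, earliest, latest), 1))
```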

Y-Axis Settings

The second setting controls (wait for it…) how papers are arranged on the y axis.

Compact

Similarly to the x axis, in compact mode the y axis doesn’t represent any particular quantity. It’s designed to simply show the contents of your map as clearly as possible using the available space.

We wrote an algorithm which tries to find the best balance between three objectives:

  • Spread papers out over the y-axis to avoid label collisions.
  • When multiple maps are visible, give papers from the same map similar y values.
  • If one paper cites another, give them both similar y values. In particular, try to minimise the total length of all citation lines on the visualisation.

And of course, the arrangement should be balanced and aesthetically pleasing!
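
One way to picture this balance is as a single cost that a layout algorithm tries to minimise. The sketch below is not the Litmaps algorithm: the `layout_cost` function, the label box size and the three weights are all illustrative assumptions. It only shows how the three objectives above could be combined into one score.

```python
import math
from itertools import combinations


def layout_cost(nodes, citations, label_w=80.0, label_h=12.0,
                w_collide=10.0, w_map=0.1, w_cite=1.0):
    """Score a layout; lower is better. All weights are illustrative guesses.

    nodes: {paper_id: (x, y, map_id)}
    citations: list of (citing_id, cited_id) pairs
    """
    collide = 0.0
    same_map = 0.0
    for a, b in combinations(nodes, 2):
        (xa, ya, ma), (xb, yb, mb) = nodes[a], nodes[b]
        # Objective 1: penalise labels whose boxes would overlap.
        if abs(xa - xb) < label_w and abs(ya - yb) < label_h:
            collide += 1.0
        # Objective 2: papers from the same map should have similar y values.
        if ma == mb:
            same_map += abs(ya - yb)
    # Objective 3: total length of all citation lines.
    cite_len = sum(math.dist(nodes[s][:2], nodes[t][:2]) for s, t in citations)
    return w_collide * collide + w_map * same_map + w_cite * cite_len


citations = [("b", "a")]
tidy = {"a": (0.0, 0.0, 1), "b": (100.0, 0.0, 1)}
messy = {"a": (0.0, 0.0, 1), "b": (100.0, 200.0, 1)}
print(layout_cost(tidy, citations) < layout_cost(messy, citations))  # True
```

Under these assumptions, passing an empty `citations` list removes the third term entirely, so a layout search would be free to settle on a different arrangement.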

You will want to use this mode if:

  • You want to clearly see the citation connections between papers.
  • You want to see as many labels on your map as possible.

Title Similarity

Same as for the x axis.

Citations

A linear arrangement of papers by citation count, with more highly cited papers higher up.

You will want to use this mode if:

  • You want to prioritise examining papers with high citation counts.
  • You want to see the distribution of citation counts for your map.

Citations (Log)

Citation counts are power-law distributed, so you’ll often find one paper with one hundred times as many citations as anything else. When this happens, the other papers are squashed together at the bottom of a linear axis, and you will want to display the log of citation counts instead.
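
To see why the log helps, here is a small sketch comparing the two scales. The `citation_y` helper and the height of 600 are hypothetical, and using `log1p` (the log of one plus the count) is an assumption made here so that papers with zero citations still get a position.

```python
import math


def citation_y(count, max_count, height=600.0, log_scale=False):
    """Map a citation count onto [0, height]; larger counts sit higher."""
    if max_count == 0:
        return 0.0
    if log_scale:
        return math.log1p(count) / math.log1p(max_count) * height
    return count / max_count * height


counts = [5, 50, 5000]
for c in counts:
    linear = citation_y(c, max(counts))
    logged = citation_y(c, max(counts), log_scale=True)
    print(c, round(linear, 1), round(logged, 1))
```

On the linear scale the papers with 5 and 50 citations land within a few units of each other, while on the log scale they are clearly separated.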

Labels

This setting controls the content of the labels given to each paper in the visualisation, and to some degree the labels’ behaviour.

Keyword

A keyword from the title. Specifically, the first word from the title which is not a stop word. Stop words are common, uninformative words such as “the”, “about”, “for”, etc.

You will want to use this mode if:

  • You want a quick reminder of what each paper in your visualisation is about.
  • You want to understand the topics covered in a region of your visualisation. For example, if your axes are in title similarity mode and you want to know what the papers in one cluster have in common.
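
The keyword rule can be sketched in a few lines. The stop word list below is a tiny illustrative sample and `keyword_label` is a hypothetical name; the actual list used by Litmaps is not given in this post.

```python
import re

# A small sample of stop words for illustration only.
STOP_WORDS = {"a", "an", "the", "about", "for", "of", "on", "in", "to", "and"}


def keyword_label(title):
    """Return the first word of the title that is not a stop word."""
    for word in re.findall(r"[A-Za-z0-9'-]+", title):
        if word.lower() not in STOP_WORDS:
            return word
    return ""  # the title contained only stop words


print(keyword_label("The Structure of Scientific Revolutions"))  # Structure
print(keyword_label("On the Origin of Species"))  # Origin
```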

Author Year

The surname of the first author and the year of publication, such as “Smith 2012”. Similar to APA in-text citation format “(Smith, 2012)”.

You will want to use this mode if:

  • You are familiar with the literature on your visualisation and are used to seeing papers represented in APA format.
  • Your goal is to learn who the key researchers in an area are, so you want to emphasise authors in your visualisation.
  • You want to understand the timeline of an area of research, so you want to emphasise publication years in your visualisation.

Author Year (All)

Same as author year, except all labels are shown. In all other label modes, if two labels overlap then only the label for the paper with more citations will be displayed.

You will want to use this mode if:

  • You want to use a visualisation in a publication, so need a complete set of labels.
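
The overlap rule used by the other label modes can be approximated with a greedy pass over the papers, from most to least cited. This is a sketch under assumptions: the `visible_labels` function and the `(x0, y0, x1, y1)` box format are hypothetical, and the real rule may resolve chains of overlapping labels differently.

```python
def visible_labels(labels):
    """Return the label texts that survive overlap resolution.

    labels: list of (text, citations, (x0, y0, x1, y1)) tuples.
    A label is kept only if its box does not overlap a label already kept
    for a paper with more citations.
    """
    def overlaps(a, b):
        return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

    kept = []
    for text, _, box in sorted(labels, key=lambda item: item[1], reverse=True):
        if not any(overlaps(box, other) for _, other in kept):
            kept.append((text, box))
    return [text for text, _ in kept]


labels = [
    ("Smith 2012", 10, (0, 0, 60, 12)),
    ("Jones 2015", 250, (30, 5, 90, 17)),   # overlaps Smith, more citations
    ("Lee 2018", 3, (200, 0, 250, 12)),     # overlaps nothing
]
print(visible_labels(labels))  # ['Jones 2015', 'Lee 2018']
```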

Title

A snippet of the title of each paper is shown. Because titles are long, typically only a small fraction of these labels will fit within the visualisation.

You will want to use this mode if:

  • You have few papers in your visualisation and want to convey as much information in your labels as you can.

Compact

This is a condensed form of the “author year” labels. Only the first three letters of the first author’s surname are displayed, along with the last two digits of the publication year. So “Smith 2012” becomes “Smi12”.

You will want to use this mode if:

  • You have many papers in your visualisation and are familiar enough with the literature for these terse labels to be meaningful.
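
For comparison, the two author-based label formats can be written as follows. The function names are hypothetical; the formats themselves are the ones described above.

```python
def author_year_label(surname, year):
    """Full form, for example 'Smith 2012'."""
    return f"{surname} {year}"


def compact_label(surname, year):
    """First three letters of the surname plus the last two digits of the year."""
    return f"{surname[:3]}{year % 100:02d}"


print(author_year_label("Smith", 2012))  # Smith 2012
print(compact_label("Smith", 2012))      # Smi12
```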

Off

No labels are shown.

You will want to use this mode if:

  • You are interested in the aggregate properties of the papers in your map, so aren’t concerned about what each individual circle represents. For example, if you are examining the distribution of citation counts, or the distribution over time, or the graph-theoretic properties of the citation network.

Node size

If set to “citations” then papers with more citations will have larger circles. Specifically, the radius of the circle is proportional to the log of the citation count. If set to “constant” then all papers’ circles are the same size.

You will want to set node size to “citations” if:

  • You want to emphasise papers with higher citation counts in your visualisation. Because it’s hard to estimate the area of a circle, it will be difficult to read the exact citation count values or even the exact ordering of papers via node size. If precision is required, we recommend displaying citation counts on the y axis.
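
A log-scaled radius might look something like the sketch below. The `node_radius` function, the scale factor, the minimum radius and the use of `log1p` are all assumptions made for illustration, chosen so that papers with no citations still get a visible circle.

```python
import math


def node_radius(citations, mode="citations", k=3.0,
                min_radius=2.0, constant_radius=6.0):
    """Circle radius for a paper; every numeric default here is a guess."""
    if mode == "constant":
        return constant_radius
    # Radius grows with the log of the citation count.
    return max(min_radius, k * math.log1p(citations))


for c in (0, 100, 10000):
    print(c, round(node_radius(c), 1))
```

With a log scale, a paper with 100 times as many citations ends up with a radius only about twice as large, which is part of why exact counts are hard to read from node size alone.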

Citations

When the citations setting is “show”, citations are displayed as lines connecting the citing paper and the cited paper.

If your visualisation has too many lines and looks messy, you may want to set this to “hide”.

If the y axis is in compact mode then the arrangement of papers will change depending on whether citations are showing. This is because the compact mode algorithm tries to minimise the total length of all the citation lines visible on the visualisation.

You will want to show citations if:

  • You want to understand how different papers relate to one another by looking at who cites whom.

Conclusion

With all these settings, you can now visualise your literature maps in 6 × 2 × 2 × 4 × 3 = 288 different ways. We hope you find these features fun and valuable.
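
The figure of 288 can be checked by enumerating every combination of the options described in this post. The dictionary below simply lists those options; the variable names are chosen for this example.

```python
from itertools import product

SETTINGS = {
    "x-axis": ["compact", "title similarity", "publication date"],
    "y-axis": ["compact", "title similarity", "citations", "citations (log)"],
    "labels": ["keyword", "author year", "author year (all)",
               "title", "compact", "off"],
    "node size": ["citations", "constant"],
    "citations": ["show", "hide"],
}

all_maps = list(product(*SETTINGS.values()))
print(len(all_maps))  # 3 * 4 * 6 * 2 * 2 = 288
```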

If you have any more visualisation modes you’d like to see, please get in touch.

References

Cohan, et al. “SPECTER: Document-Level Representation Learning Using Citation-Informed Transformers.” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2020, pp. 2270–82. doi:10.18653/v1/2020.acl-main.207.