Library of Congress Holdings by LCC

View the visualization


A couple of years ago I converted the LCC classification outlines from PDF to JSON. The goal was to have a way to coarsely place a resource into an LCC category. With the Library of Congress data release, I wanted to try it on their holdings to see the shape of the collection.
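As a rough illustration of what coarsely placing a resource could look like, here is a minimal sketch that looks up a call number in a tiny hand-written outline. The outline shape, the field names (`prefix`, `start`, `stop`, `subject`) and the `place` helper are all assumptions made for this example, not the format of the actual JSON conversion.

```python
import re

# Hypothetical fragment of an outline, written by hand for illustration.
# The real JSON files may be shaped differently.
OUTLINE = [
    {"prefix": "Q", "start": 1, "stop": 390, "subject": "Science (General)"},
    {"prefix": "QA", "start": 1, "stop": 939, "subject": "Mathematics"},
    {"prefix": "QA", "start": 75.5, "stop": 76.95,
     "subject": "Electronic computers. Computer science"},
]

# One to three class letters followed by a (possibly decimal) number.
CALL_NUMBER = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def place(call_number, outline=OUTLINE):
    """Return matching subjects, broadest range first, or [] if none match."""
    match = CALL_NUMBER.match(call_number.strip().upper())
    if match is None:
        return []
    prefix, number = match.group(1), float(match.group(2))
    hits = [
        entry for entry in outline
        if entry["prefix"] == prefix and entry["start"] <= number <= entry["stop"]
    ]
    # Wider ranges sit higher in the hierarchy, so sort widest first.
    hits.sort(key=lambda entry: entry["stop"] - entry["start"], reverse=True)
    return [entry["subject"] for entry in hits]


print(place("QA76.73 .P98"))
# ['Mathematics', 'Electronic computers. Computer science']
print(place("Microfilm"))
# []
```

Sorting by range width is just one simple way to recover the broad-to-specific path; an outline that stores explicit parent and child links would not need it.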

The LCC consists of 21 classes, each of which narrows its topics into increasingly specific categories. The system is therefore hierarchical, and I wanted to show that hierarchy and which parts of it hold the most resources. I used the Book, Serial, Music, Map, and Visual Materials MARC records, but not all of them had an LCC:

Total MARC records: 12,438,797
Total records w/ LCC: 11,870,343
Total records w/ no field 050: 568,454
----------
Total records w/ LCC that fit LCC hierarchy: 10,528,234
(represented in this viz)

The records with field 050 populated but without a valid LCC (about 1.34 million, the difference between the two counts above) were things like minimal-level cataloging records or shelf mark identifiers like “Microfilm”.
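The three-way split in those counts can be sketched as a small bucketing step. This is only an approximation under stated assumptions: the `bucket` helper and its labels are hypothetical, the input is taken to be the raw 050 value already pulled out of each record, and a simple shape check (a valid class letter, up to three letters, then a digit) stands in for the real lookup against the hierarchy.

```python
import re
from collections import Counter

# The 21 LCC class letters (I, O, W, X and Y are not used).
LCC_CLASSES = set("ABCDEFGHJKLMNPQRSTUVZ")

# Simplified shape check: one to three letters, optional space, then a digit.
LCC_SHAPE = re.compile(r"^[A-Z]{1,3}\s*\d")


def bucket(value):
    """Sort one raw 050 value into one of three hypothetical buckets."""
    if value is None or not value.strip():
        return "no 050"
    text = value.strip().upper()
    if text[0] in LCC_CLASSES and LCC_SHAPE.match(text):
        return "fits hierarchy"
    return "050 but not LCC"


# Made-up sample values, not taken from the actual data set.
sample = ["QA76.73", "Microfilm", None, "PS3545.I345", "   "]
tally = Counter(bucket(value) for value in sample)
print(tally["fits hierarchy"], tally["050 but not LCC"], tally["no 050"])
# 2 1 2
```

“Microfilm” starts with a valid class letter (M), which is why the check also requires a digit after at most three letters.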

The visualization uses a force network to lay out the hierarchy, but that is not meant to imply any network analysis; this is more like a cluster map or flowchart. It’s good at highlighting heavily collected areas and is kind of a fun way to explore LCC.
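To give a sense of how a hierarchy can feed a force layout, here is a minimal sketch that flattens a nested tree into the nodes-and-links shape that force-layout tools commonly accept. The tree structure (`name`, `count`, `children`), the `flatten` helper and the counts are placeholders invented for the example; they are not the data or code behind the visualization.

```python
# Placeholder tree with made-up counts, purely for illustration.
TREE = {
    "name": "Q Science",
    "count": 3,
    "children": [
        {
            "name": "QA Mathematics",
            "count": 2,
            "children": [
                {"name": "QA75.5-76.95 Electronic computers", "count": 1},
            ],
        },
    ],
}


def flatten(tree):
    """Turn a nested tree into flat lists of nodes and parent-child links."""
    nodes, links = [], []

    def visit(node, parent_id):
        node_id = len(nodes)  # ids are assigned in visiting order
        nodes.append({"id": node_id, "name": node["name"], "count": node["count"]})
        if parent_id is not None:
            links.append({"source": parent_id, "target": node_id})
        for child in node.get("children", []):
            visit(child, node_id)

    visit(tree, None)
    return {"nodes": nodes, "links": links}


graph = flatten(TREE)
print(len(graph["nodes"]), graph["links"])
# 3 [{'source': 0, 'target': 1}, {'source': 1, 'target': 2}]
```

Carrying `count` on each node is what would let a layout size or color the heavily collected areas.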

View the visualization
