Transcript — Artist in the Archive Episode 1: ‘In the House of Avram’

This is a transcript of the first episode of the Artist in the Archive Podcast. The audio for the episode can be found here, and a finding aid with images, etc. is available here.

Jer Thorp: Hello, I’m standing in the basement of the Jefferson building at the Library of Congress, in front of one of the library’s old card catalogs. It stretches literally down an entire hallway. In it are hundreds of thousands of individual cards containing meticulous records about the library’s holdings.

Jer Thorp: If I just pick one here at random, we can see that this particular card is about a book by an author named Kishi Tazuko, and it’s a piece of Japanese fiction. One of the interesting things about all these cards is that they carry a record of their use. In this particular card, I can see a handwritten note, I can see a stamp, and all of these thousands and thousands and thousands of records act as a kind of snapshot of a period of time in the library’s history.

Jer Thorp: In 1968 the library’s ever-growing holdings and the burgeoning computer age met head on, and these catalogs were replaced by digital records. Just last year, 25 million of these records were released into the public domain, meaning that you or I, or anyone with spare time and a computer, can explore the library’s vast holdings and at the same time, learn something about its history and its culture.

Jer Thorp: This podcast isn’t about those 25 million records; it’s about the 164 million objects that the library holds, its 217-year history, and the more than 4,000 people who work in its reading rooms, and archives, and preservation labs, and offices. It’s about libraries and librarians, and archives, and information, and art.

Jer Thorp: My name is Jer Thorp, and I’m an artist, and writer, and a teacher living in New York City. Over the last 10 years I’ve been exploring the complicated and constantly changing boundary between data and humans. Some of this exploration has come by making online tools and data visualizations. It’s also come through some slightly, well, weirder endeavors. Like staging a series of data performances in the Museum of Modern Art, or installing an immigration-themed data sculpture in the middle of Times Square. Or, traveling to the Gulf of Mexico in a submarine.

Jer Thorp: For the next five months, I’m the innovator-in-residence here at the Library of Congress, which is a pretty great job. But here’s the big secret: I have no idea what I’m doing. But I hope that you’ll all join me as I try to figure it out.

Meg McAleer: My name is Meg McAleer, and my official title is Senior Archive Specialist in the Manuscript Division of the Library of Congress. I brought today, one of my favorite collections. It was a collection I processed many, many years ago, in the 90s. It’s the papers of a woman named Rhoda Metraux, who’s not a tremendously well-known name at all.

Meg McAleer: She was an anthropologist who did some very, very important work on her own, but perhaps she’s better known for her professional partnerships with Margaret Mead. They worked on a number of projects together, including one in 1956–57 where they tried to document how children and adults viewed scientists. This was in the middle of the Cold War; we were concerned about falling behind the Soviets in the studies, so they wanted to get at the perceptions that Americans had, at all age levels, of scientists and scientific work.

Meg McAleer: They were in the middle of this project, and doing lots of interviews around the country, and then all of a sudden on October 4th, 1957, Sputnik was launched.

General Audio: Today, a new moon is in the sky. A 23-inch metal sphere placed in orbit by a Russian rocket. You are hearing the actual signals transmitted by the Earth-circling satellite, one of the great scientific feats of the age.

Meg McAleer: Word got out, the New York Times has this big headline the next day on October 5th, announcing that the Soviets have launched a satellite. Margaret Mead and Rhoda Metraux sent out letters on the 6th, saying to all of their interviewers, “Stop what you’re doing and begin asking people what their reactions are to Sputnik.”

Meg McAleer: We’ve got a lot of material that they collected, a lot of data. It comprises eight archival Hollinger boxes and may number about 25 hundred pieces of information, sheets of paper, which is pretty extraordinary. Most of it is in raw form, so we have drawings by children of Sputnik, how they imagined it, we have little class assignments when they wrote about it, and then we also have narrative interviews by adults.

Meg McAleer: I’m almost kind of charmed by the fact that these are narratives instead of the SurveyMonkey that we’re all kind of used to now, where you select the most appropriate answer. But this is narrative, so you get the voices of people in 1957 of all age groups. Young children, college students, as well as man-on-the street type interviews.

Meg McAleer: The children’s responses are probably the most charming. They drew pictures, and there’s one that I absolutely love. It’s from Jimmy Brooks of Huntsville, Alabama. He was in the 6th grade. He drew this wonderful picture of Sputnik, which shows actually he’d been paying attention to the news reports because it’s pretty accurate. But on the outside, he has a lot of dials and instrument panels and it almost looks like Sputnik has a lot of bumper stickers.

Meg McAleer: What I really love is Jimmy’s reaction to Sputnik. He was asked, “What does this mean to you?” And he writes, “It means that I might build rockets to go to the moon. I might build space stations.” Jimmy just absolutely captured what I think is unique about the responses that Margaret Mead and Rhoda Metraux got to this. Yes, there is a lot of Cold War concern on the part of people, but a lot of people said, this is a tremendous human accomplishment. And I think part of the reason they got that reaction was that they were doing these interviews so soon after the launching. Again, before there was that kind of Cold War type of massaging of the issue. So a little bit of the data had actually been quantified. This is great.

Meg McAleer: One of the questions asked was, what was the most important thing that happened in the last three weeks? And there was apparently a lot going on in October 1957. For instance, Queen Elizabeth came, and then also federal troops were called out at Little Rock. There was a flu epidemic. There were numerous developments as there usually are, there were baseball scores, and the stock market fell. And yet of all these different big-news items that happened, the vast, vast majority of people identified the launching of Sputnik as the most important thing that had happened.

Meg McAleer: Then they were asked a very kind of wonderful, open-ended question, “What do you think about the satellite?” And again, only a few of the responses have been tabulated. The largest number of responses had been, “Too bad the Russians got it first,” “Warning for the US to wake up.” There were 92 of 270 responses; that was the most common. But then the next most common is, “Great scientific advance.” So you see how reactions to Sputnik were maybe much more varied than we think they had been. Which is really wonderful.

Meg McAleer: Which is not to say that there wasn’t this kind of immediate, Cold War response to it because the next most common was expression of fear or apprehension. So I’m not negating the fact that this was kind of put in a broader context of the Cold War. But, 65 thought it was a great scientific advance.

Meg McAleer: The narrative responses that came are really fascinating. We can actually hear the voices of people in 1957, and I found myself really fascinated to see how they express things. Is it similar to us? Do they use the same vocabulary? Is it different vocabulary? Are they more thoughtful than we are today? I don’t know, it’s just kind of a fascination, kind of hearing these voices coming from the past.

Meg McAleer: So, the reactions were varied. This woman, who was from Los Angeles, she was a female and 27 years old, said, “I think it is one of the greatest things that ever happened. This is a historically significant occurrence. Future generations will remember 1957 like we remember 1492.” That wasn’t the only response. One woman also from Los Angeles, who I think was very honest, very thoughtful, said, “I wish I could have answered this before the publicity came out on it.” Then she goes on to admit, “Then I got frightened by it because of its connections with Russia. I felt like a sitting duck.” So you see the whole gamut of reactions.

Jer Thorp: When you were processing this, what do you think it was about this that has stuck with you for so long? I mean, you probably process a fair amount of things that have the same type of human hand, that you have the same type of … is it a personal connection for you? I’m interested in why this one.

Meg McAleer: One of the things that I think really causes it to have an impact on me, where I’ve remembered this collection all of these decades later, is that I love the spontaneity of it. I love how quick Margaret Mead and Rhoda Metraux were to get all these interviewers in the field to immediately stop what they were doing and to begin interviewing. In fact, that recognition that this was so incredibly significant and to want to get people’s reactions in the moment was just absolutely incredible.

Meg McAleer: The other thing that really impresses me is that not much has been done with this data. It is still in its analog form, largely un-analyzed, and just the potential for us to understand how people were reacting to something so big is still there, and its full potential has not been unlocked. And this wonderful thing, in working with my collections … by the way my field is 18th century, which means I do not work on 18th century collections, I work on 20th century and 21st century collections.

Meg McAleer: And so granted, this is not my field, but one of the things that has really impressed me working on mid to late 20th century collections is seeing what happens as computers begin to enter our lives. There’s this kind of liminal period, I think, in the mid to late 20th century, when the potential is there but it’s not fully unlocked, and it’s almost like this dance we have with computers and the technology to do something with data. For me, I wonder if anything is ever going to be done with this data that’s not in electronic form. It’s still in analog form, it’s almost like it’s encased in amber. Can we release it? Or, are we going to have the expectation of being able to manipulate data so fast that we’re not going to take the time to go back and try to do something with this? I hope somebody does, at some point. Because the potential I think, is just absolutely amazing.

Jer Thorp: I am sitting in the office of Kate Zwaard, who is the chief of the National Digital Initiatives at the Library of Congress. And in that office is Kate Zwaard. Hi, Kate!

Kate: Hi!

Jer Thorp: How are you?

Kate: I’m great, how are you?

Jer Thorp: I’m fantastic. I’m really excited to start off the podcast by interviewing you, because you in many ways, are the reason why I’m here, and thus, in this office, and doing this podcast in the first place.

Jer Thorp: So. I did my research, as one does, and I was looking at what the National Digital Initiatives do, there are three things that are outlined on the site. One of them is, to increase the use of the library’s digital resources, right? The other one is to promote the library as an innovator. It’s the third one that got me because I think it’s really interesting and there’s a lot to talk about there, which is to grow the national capacity for cultural memory.

Jer Thorp: What do you think about when you think about that phrase?

Kate: Man, I wish that phrase … I’ve been trying to rewrite that for a long time because I feel like it doesn’t have any meaning.

Jer Thorp: Okay, let’s rewrite it right now.

Kate: But what I think it means is, I think we have a moral obligation as the National Library to be a servant leader to other cultural heritage organizations who are looking to do innovative work. By that I mean, to convene interesting conversations, to participate in standards committees, and I think that that work does not always automatically fit in with the normal production work of a library. It’s a thing that you have to do extra.

Jer Thorp: Yeah, I mean, when people close their eyes and imagine what happens inside of the library, your job might be one of the last ones that they imagine. Here’s this person trying to understand how to innovate digitally. But, there’s a deep history to that at the library, and it makes sense. This library is the largest library in the world. They have the largest collection of anything, really, in the world. You can’t just go about collecting that much stuff without trying to come up with a technical solution to it.

Jer Thorp: That’s really evidenced in the building. When we walked around the building, we’ve been on a couple of great tours together … There are literally card catalogs everywhere! You can’t turn around the corner without seeing a bit of card catalog.

Jer Thorp: I just read an amazing article by Tim Carmody about the historical importance of card catalogs, and he talks about how card catalogs literally changed the way we thought. But here there was a transition, really, from the card catalog, probably just because too much became too much. There’s just no possible way to make a card catalog big enough.

Jer Thorp: So, in 1968 a woman who I know you’re a big fan of, Henriette Avram, developed a filing system, a classification system which I am not yet a big fan of, I’ll say it right now, called MARC. Do you want to talk a little bit about Henriette?

Kate: Yeah, I do, and I think it’s interesting how, because we’re such a big place, scale is such an interesting problem here. A lot of the innovations that come out of the Library of Congress come out of it due to scale. I think a good example of this is the BagIt format, and Bagger, where the Library of Congress collaborated with a bunch of other cultural heritage institutions to see if we could make a data structure to transfer large groups of files. This was when digital hardware weren’t anything. I think that the work with the MARC standard is very similar. It was an effort to get our hands around scale. To get our hands around all these cataloging cards and maybe think of a more computational way to approach that problem.

Kate: What I really like about Henriette Avram, whose work was not that long ago … and there are people who worked with her directly in the building. Beacher Wiggins, who’s our head of cataloging here at the library, worked directly with her. She was not a librarian by training; in fact I have a quote on my wall of hers. “I’m not a librarian by training, but a brainwashed pure systems analyst.” It’s so neat because she brought a whole different field’s knowledge to her work here at the library and thought, “I can make this work a little bit easier.” She did all of this, creating a whole new encoding standard for bibliographic information, before relational databases, before modern character encodings, so it was really a foundational moment in computer science that I think a lot of people don’t know about.

Jer Thorp: I was impressed, and I guess it’s not surprising that she wasn’t a librarian, but she sort of became de facto adopted as a librarian, right? I think there’s actually like … for the Americans, [inaudible 00:17:46] they’re like, normally like in Europe, you can be a librarian, is that … am I imagining that, or that happens?

Kate: That’s true, they gave her the honorary award I think. They honored her as, Lifetime Achievement and some librarianship … it’s interesting that conversation in libraries about who or who isn’t a librarian. So, I don’t have a library degree, and I wouldn’t self-identify as a librarian, but I do think I’ve gone native. Libraries are core to my heart, but, yeah.

Jer Thorp: Let’s talk just briefly about some specific projects that you’ve been working on, because I think it’s really exciting what’s happening with your group. I’m gonna mention the URL like, 17 different times because I want people to go there, but at we can get access to all these projects, but I think one of your newest projects is the Beyond Words project, which I think grew from an earlier project that you’ve been really involved in, Chronicling America. If you wanna talk a little bit about that, I worked at the New York Times for a long time and I’ve been involved in the newspaper world and I didn’t even know about Chronicling America till we met, so maybe you can tell everyone about what that project was and then maybe what you’re doing with Beyond Words.

Kate: I really think Chronicling America is a shining jewel of what a digital library program can be. So it’s a really neat collaboration between the Library of Congress and the National Endowment for the Humanities. NEH gives grants to organizations to digitize their historic newspaper collection. As part of that grant, they have to give us a copy of the digital files. So we’ve put all those digital files up online in, right?

Jer Thorp: We can edit in the right URL. Just with a different voice, ‘Chronicling America’!

Kate: So, it’s this searchable database of historic newspapers. One of the neat things I think that happens when you do that, is it enables a certain type of scholarship that wasn’t possible before. For example, I can search on my address in the search field, and find out, find the building permits. A project which would have taken a very long time in the past; you would have to go through microfilm, of newspapers that were happening around that time and it would be very difficult to do, but I can just do it from the comfort of my own home.

Kate: But I think the other neat thing is that it has enabled all this digital scholarship and digital humanities research. NEH last year did a data challenge competition where they invited people to do interesting and new things with the data itself. There are downloads of the text available on the site. We saw new types of scholarship emerge from that, and I think it’s a really good example of this historic partnership between the library and the National Endowment for the Humanities, but also, what’s possible if we make our APIs well structured, if we make our data usable to people.

Kate: The neat thing about the downloads from Chronicling America is that you don’t have to interact with the API to use them. So you can download them, and play with them in Excel, so the ramp-up level I think is really low.

Jer Thorp: And then, talk about …. so Beyond Words is a crowdsourcing project which I think we all know, anybody who’s been involved with large sources of data like a computer analyzing things can only get you so far. What is the main purpose for that project and where do you see it going and how does it look?

Kate: If I could talk a little bit about how the program got started, it actually was an earlier version of the … so you’re Innovator in Residence here, we’re so thrilled that you’re here, it’s really exciting. And we wanted to pilot that concept: could we have someone come, and do something for a short period, and have it have a lot of impact? Would that even work?

Kate: We decided to pilot that out as a staff assignment first. We posted on the internal staff website for someone to come to do a short-term project with Labs; we would pay for it, they would do a three-month experience. One of the staff developers applied, was selected, and he was really interested in exploring the idea of crowdsourcing. We poked through a lot of the digital collections, and he was really struck by the newspapers, especially the images and the cartoons, and how valuable a dataset those things were on their own, but they were really hard to find.

Kate: So if you wanted for example, a list of cartoons from all the papers that were published during the World Series, it would take you a long time to do that. But if we could crowdsource some of that, we could end up with a really rich dataset.

Kate: He used this Scribe framework which, I think, was a collaboration between NYPL Labs and Zooniverse to create this crowdsourcing project where people could look through historic newspapers and identify pictures. So, cartoons, photographs, other images, and help capture them. Because the OCR around photographs and cartoons is especially bad because it’s not structured in the same way as the rest of the data is.

Kate: He completed that project and people were so … we were gonna launch it as sort of a proof of concept, but people here were so excited about it that we ended up launching it as a pilot application. We’re learning a lot from it, but what I think we’re really interested in is moving to a more robust crowdsourcing project for transcription and tagging.

Kate: What I specifically love about crowdsourcing is it invites people in to the collection in a new way. I think of one of the core constituencies of the library as the informed and curious. I think we do a great job serving researchers and people who have very specific research needs, so documentary filmmakers, or scholars, but I want us to be more welcoming to people who are just wanting to poke around and learn things. I see crowdsourcing projects at other big national libraries or other cultural heritage institutions, and it’s a fun way to just take ten minutes out of your day and learn something new that you would never have picked up before.

Kate: To me, that’s the thing that gets me excited about it. But I also think that developing our ability to search terms in our historic collection is really neat. The other thing that crowdsourcing gets us is readability. You were at the manuscript division, looking through some of their stuff, which I think is really, really cool, lots of cool things. Some of it is kind of hard to read, especially if you’re not looking to decode it, you just kind of want to glance at it and get a sense of what’s on that page, and I’ve recently learned that teenagers can’t read cursive … did you know that? They don’t teach that in schools.

Jer Thorp: I didn’t know that, but it makes sense when … you know, cursive, even if you haven’t read it for a while, it’s like, “Oh, what is this strange glyph?”

Kate: Especially historic handwriting is even further from that, so modern cursive is easier for people to read than historic cursive, so if you want to glance at a page of Alexander Graham Bell’s lab notebooks, or Clara Barton’s diaries, it can be hard for an average person to just take a look at that and see what it says.

Kate: But, if we invite people to do crowdsourcing, one person can do that intellectual labor, and then share it with other people. So it makes the readability of those collections much better too.

Jer Thorp: I was just … my partner and I were just in the Great Hall and we stood behind the strange glass fishbowl to look down in the reading room, and to see real-life researchers researching. I think there is that division you’re describing about, like wrought in glass. Here’s the public over here and you can look at signs and you can see how beautiful the building is, but in there is where the sort of rarefied research happens.

Jer Thorp: I know that anybody can come and get a Readers’ card and come to the library, but this has been something that I’ve been thinking about a lot in my own research, is that digitizing libraries and museums tends to be mostly in the service of the digital humanities. It’s such a boon to researchers who want to dig into this stuff, but to the public, what does it mean? That’s a huge question to me. How do we make that information useful in the same way that books were useful to the public, like say, when you and I were growing up?

Jer Thorp: I would go to the library and what an amazing sort of experience that was, to sort of browse through serendipity through all the books. It feels to me anyways that what we have in the digital world is extremely powerful, but not playful in that way. I’m not sure that anybody’s like, “Oh I’m gonna … you know what I’m gonna do this Saturday? I’m gonna go spend some time in one of the search interfaces.” Whereas I think at some point in our lives we would say, “I have a free Saturday, I’m going to hang out at the library.”

Kate: I think that divide between regular people and researchers is something that the librarian is really interested in breaking down. So that like, “I’m coming here to visit, but researchers are other people and this collection is for them”, is I think a very real feeling that people have, that we would like to figure out ways to break down that divide.

Kate: I know that … it’s a real problem. We want people to think of this collection as for them. It’s America’s library, your tax dollars pay for it, it really is for you. I was talking to a relative the other day about the Alexander Hamilton papers, and he said, “I’m really excited to see that they’re online, I never thought that I’d be able to take hold of them.” I mentioned that to one of the librarians here and they said, “Well, he can come in to the Reading Room whenever he wants.”

Jer Thorp: And he did.

Kate: But I said, “I don’t know …” I said, “I don’t think my cousin is gonna wander into the Manuscript Division and ask to see Alexander Hamilton’s papers. That’s a heavy lift for someone.”

Jer Thorp: It makes me nervous, going into a reading room, I feel like I’m gonna be cast out at any moment because I’ve broken some rule. It feels like a very … and also, hey, some people can’t come all the way to the library.

Kate: Right.

Jer Thorp: So, yeah.

Kate: So making people think that the library is for them, I think one of the things that we have done is, Dr. Hayden and our Library Services staff moved the reader registration closer to the Reading Room. So, it used to be that you had to go to the Madison Building, register, and then go to the Main Reading Room. And so I think that is a physical manifestation of breaking down those barriers.

Kate: When we go to libraries, archives, museums, as a family, I think what you see over and over again, is the popularity of places where you can do stuff. People want to do stuff. I think figuring out ways that we can help people do stuff? I think crowdsourcing is an easy win.

Jer Thorp: I think all of those answers are great. Here’s the thing that’s been sticking with me, that bubbles out of this conversation … I think when we think about the library, we do think about things like the Hamilton papers. Like these objects that are somehow kind of historically rarefied. But the things that I’ve been loving so much are these little pieces of real … not that Alexander Hamilton wasn’t a real human being, but real humanity.

Jer Thorp: We’ve seen a couple of archives of children’s letters, like one of children’s letters from the 1980s talking about what they imagined the world to be like, and what they were scared of, and what their concerns are. The library has a huge collection of that, and certainly the American Folklife Center is in some ways the opposite to the Rare Books Section in that it’s … it is recordings of people talking about their everyday lives, and I wonder if that’s an avenue to get more people feeling like it’s something that they can approach by making the material itself something that feels less intimidating than the Hamilton papers.

Kate: Right, it’s not all just fancy thoughts from fancy people. I think one way is to encourage contributions. The Veterans History Project is a great example of this. They solicit first-person accounts from veterans, and often in an interview setting with a loved one. I think knowing that I have something at the Library of Congress makes it feel more like a place that’s for you.

Kate: I also think about the kinds of primary research that people are actually doing, and engaging with them in that way. So, Chronicling America is a great example of that. The only primary research that I know that most of my family has done is genealogy. So they’re really interested in, “When did Grandpa Ryder come over from Ireland? What was the name of his boat? What was he wearing?” Some of that material is available in historic newspapers. When they’re using it on Chronicling America, or they’re using it on Ancestry or other databases, that material comes from the library.

Kate: Meeting people where they are, where their interests already are, I think is really important.

Jer Thorp: Well, it’s been fantastic to talk to you. I was reading a lot about Henriette before I came in, and Beacher Wiggins comes up quite a lot, I think one of her co-workers who is still at the library, and when he was trying to sum up Henriette he said, “She could best be described as a dynamo.” I think that’s the best way also to describe Kate Zwaard; as we’ve been walking around this building, it’s just been a non-stop love fest for you and the work that you’ve been doing here. I don’t know if we’re gonna hear from you personally again in the podcast but I know we’re gonna be hearing from you a lot in the other people we speak to and the things that we ask our listeners and so on and so on. So, thank you so much for being the first.

Kate: Thank you, Jer, that’s so kind of you. I feel so lucky to be here and to have this jewel in my hand. I feel like everything I do is in service to doing it justice. The staff, the collections, and the American public here. So, I’m so thrilled you could be here.

Jer Thorp: Thank you.

Jer Thorp: In 1824, the fledgling Library of Congress shelved just 16,000 volumes, housed in a modest attic room in the north wing of the Capitol. This was, by any real measure, a small library, serving a country that didn’t have a whole lot of books.

Jer Thorp: In 1836, France’s National Library contained more than two times the number of books that were held in the entirety of the USA. It did take a while for the library to grow. In 1851 a fire destroyed nearly 35,000 volumes, but the library did grow. Slowly at first, and then, well, really, really fast.

Jer Thorp: In 1861 the Library held 72,000 volumes. By 1913 it housed more than two million books and one million other items. In 1936 there were six million books and pamphlets, 1.5 million maps, half a million prints, and a manuscript collection that was only described as “uncounted.”

Jer Thorp: By 1945 there were 25 million items. By 1974, 74 million. Today’s Library of Congress is by any measure, immense. Seven million maps occupying three football fields of flat files. 16 million prints and photographs. The holdings of the manuscript department have mostly been counted; there are more than 70 million items.

Jer Thorp: The American Folklife Center has 34,000 field recordings on 10,000 discs, and the Veterans History Project archives more than 114,000 interviews.

Jer Thorp: The Library’s web archive, not even two decades old, stores over 14 billion pieces of content. Okay, okay, you get it, right? It’s big. As I’ve been starting my research I’ve been trying to find ways to grasp the scale of the archive, while at the same time not losing the connection to its texture.

Jer Thorp: I’m trying to find ways to know the broad topologies of the library, as well as its extraordinary nooks and crannies. One way I’ve been doing this is to try and talk with as many people in the library as I can. You’ve heard two of these interviews so far and you’ll hear many more as the podcast progresses.

Jer Thorp: At the same time I’ve been writing some code to test out some ideas and approaches for getting to know the archive a little bit better computationally. Each episode I’ll give you an example of something I’ve been working on, and I’ll point you to some source code so that you can dig around and do some experimentation yourself.

Jer Thorp: This week’s tool is a really simple interface for exploring how authors’ names are represented in the library’s collections and how they change over time. The concept is pretty simple: you pick a year, and the tool gives you a list of first names that an author from that year might typically have had.

Jer Thorp: It’s worth noting here that these lists of names change every time you load them. They’re not the most popular names from a given year, but instead a statistical look at what names you might end up with if you were to sample books from that year at random.
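The sampling idea described here can be sketched in a few lines of Python. This is only an illustration, not the code behind the names explorer: the records below are made up, and the helper names (`first_name`, `names_by_year`, `guest_list`) are hypothetical. It assumes each record pairs a publication year with an author heading in "Surname, Given names" order, counts first names per year, and then draws a list in which each name's chance of appearing is proportional to how often it occurs.

```python
import random
from collections import Counter, defaultdict

# Made-up sample records (year, author heading); not real catalog data.
RECORDS = [
    (1750, "Archer, William"),
    (1750, "Baker, William Henry"),
    (1750, "Conti, Giuseppe"),
    (1750, "Dale, Charles"),
    (2005, "Evans, Fiona"),
    (2005, "Farkas, Zoltan"),
    (2005, "Grant, Peter"),
]


def first_name(heading):
    """Return the first given name from a 'Surname, Given names' heading."""
    _, _, given = heading.partition(",")
    parts = given.split()
    return parts[0] if parts else None


def names_by_year(records):
    """Count how often each first name occurs in each year."""
    counts = defaultdict(Counter)
    for year, heading in records:
        name = first_name(heading)
        if name:
            counts[year][name] += 1
    return counts


def guest_list(counts, year, size=20, rng=random):
    """Draw `size` names for a year, weighted by how often each occurs."""
    year_counts = counts.get(year)
    if not year_counts:
        return []
    names = list(year_counts.keys())
    weights = list(year_counts.values())
    # Sampling with replacement: common names can appear more than once.
    return rng.choices(names, weights=weights, k=size)


if __name__ == "__main__":
    counts = names_by_year(RECORDS)
    print(guest_list(counts, 1750, size=10))
```

Because the draw is random and done with replacement, running it twice gives different lists, and a frequent name can show up several times, which matches the repeated Williams and Charleses in the lists read out in this episode.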

Jer Thorp: I used first names here because they’re so personal. It’s maybe a bit strange, but the way that I read these lists is as guest lists for an imaginary series of authors’ parties held every year. Looking around the room at these parties, you might get an idea of what kinds of people are writing books in that year, or at least the kinds of books that the library might be putting on its shelves.

Jer Thorp: Let’s use this tool to take a guess at who might be at an imaginary party held in the year 1750. Here’s the list: Samuel, James, William, Giuseppe, Frederick, Johan, Friedrich, another William, Charles, and another Giuseppe. Another Charles, okay a third Charles, Jonathan, Joseph, Edward, and Thomas. Seems like a fun crowd, I guess?

Jer Thorp: Oh, let’s use the tool again to generate a list of names attending an authors’ party in 2005. Here we go: Roland, Jason, Bernard, Fiona, Louise, Joanne, Margarite, Guyam, Caroline, Peter, Dan, Thomas, Rojer, Robert, Boy, John, Zoltan, Peter, Annette, and way off in the back, Gabriel.

Jer Thorp: Just by generating these lists we can get a sense of how authors’ names change over time. It’s not a really exact way to visualize this data, but I think it does a good job of giving us a sense of the change and a sense of the cultural context in which these authors were living and in which these books were published. You can make your own author party list with the names explorer, and remix it online. If the abstractness of this tool is making you a little itchy, you can also see more detailed distributions of name frequency. You can also get all of the source code I used for processing 25 million MARC files at the project’s GitHub repo.
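For a less abstract view, a name's frequency can be expressed as its share of all author first names in each year. The snippet below is a minimal sketch under assumptions: the input is a hypothetical dictionary mapping a year to the first names found in that year's records, and `name_shares` is an illustrative helper, not something taken from the project's repository.

```python
from collections import Counter

# Made-up input: year -> first names found in that year's records.
FIRST_NAMES = {
    1750: ["William", "Charles", "William", "Giuseppe"],
    2005: ["Fiona", "Peter", "William", "Peter", "Zoltan"],
}


def name_shares(first_names_by_year, name):
    """Return {year: fraction of that year's first names equal to `name`}."""
    shares = {}
    for year, names in sorted(first_names_by_year.items()):
        total = len(names)
        # Counter returns 0 for names that never occur in a year.
        shares[year] = Counter(names)[name] / total if total else 0.0
    return shares


if __name__ == "__main__":
    for year, share in name_shares(FIRST_NAMES, "William").items():
        print(year, f"{share:.0%}")
```

Using shares rather than raw counts keeps years comparable even though far more books were published in 2005 than in 1750.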

Jer Thorp: Finally, you can also get access to all of my research materials, including writings, photos, and a wiki, on the project’s Open Science Framework page.

Julie: I’m Julie Miller, I’m the curator of Early American Manuscripts at the Library of Congress.

Jer Thorp: Alright, so tell us what you brought.

Julie: This is something that was cataloged incorrectly in the 1920s as the Massachusetts Sheep Census, and when I looked at it, I noticed that it’s essentially an 18th-century spreadsheet. It’s a table, and down the left side there are 29 names, and across the top there are lists of things like possessions, and how many people there were in the household, and at the very end, it says “number of grown sheep that were shorn”.

Julie: I wanted to know what it was, and I did some research in various sources including the census, local history, genealogies, and newspapers. I discovered that what it actually is, is a tax list from Canterbury, Connecticut in 1787.

Julie: There’s a couple of really interesting things about it. This is an example of what historians call a quantitative document. In other words, it’s not like a letter or a diary where a person tells a story. It’s simply a list. So you imagine that something like this might be dry; you don’t think that something to do with taxes would be of very great interest. But in reality, there are many, many stories here about who people were, what they owned, what they valued, and what their aspirations were for the future.

Julie: One of the things that I noticed here that’s really interesting is that among the possessions that are listed here, there are some really obvious things. This is a farming town, this is Canterbury, Connecticut, and most of these people were farmers. So when you look at their possessions, they own land, and they own livestock, and it’s apparent that they’re farmers.

Julie: But they’re also asked if they own clocks, and only three of them own clocks, which is interesting because you would think that everyone would own a clock. But in fact, in the 18th century, a clock was a kind of cutting-edge technological device that not many people had and that in reality they didn’t have much use for, so it was kind of a status item.

Julie: The other thing has to do with the sheep. What’s going on here is, people are listing what they have for purposes of being taxed. There’s no income tax at this point, so they’re being taxed on what were called their polls, their heads. In other words, people in the household aged from 16 to 70 are being taxed and at different rates.

Julie: But then at the very end, there’s this column about the sheep, and they’re not being taxed on their sheep, and it turned out I discovered in a book that describes the taxes of Connecticut, it turns out that they’re getting a tax break for their sheep. So, why were they getting a tax break for their sheep? Because this is a period when New England was beginning to think about how it might participate in the Industrial Revolution. In other words, how they might look ahead into the future and become a manufacturing region, which they did. They were not there yet, but they were just starting to do that, and the period that we think of as the beginning of manufacturing is really later. It’s the 1820s, this is still 1787.

Julie: There’s a bunch of other things here, too. Only the heads of households are listed, and the people who are listed as polls, in other words, simply numbers, are just numbers, checks in a column. But these are all the other people who live in the household, and they’re not listed. And we know that there are slaves in this county, Windham County in Connecticut. They’re among these checks, as are wives, children … and then there’s three women who are listed. So you ask yourself, what makes them visible, how come they’re visible and all the other women are not? The reason is because they were widows. When women were widowed, they once again came into possession of their property. In other words, when women married, they were subsumed under the identities of their husbands, and their husbands owned everything that they owned. When their husbands died, not a very happy event, but nonetheless they regained their property. So, we have three women who are householders here.

Julie: There’s a lot of stories about this. One of the really interesting things about this is this tax list, I learned from the … There’s a book that described how taxes were collected in Connecticut, which I found to help me understand this, and it explained that they’re collected in the summer. This is the summer of 1787, this is dated 1787. The summer of 1787 was the same time that the Constitutional Convention was being held. One of the things that I find so fascinating about this is that this is a snapshot of ordinary people at the very moment that the Constitution was being written. This is what their lives were like.

Julie: Further, as you may know, the Constitution was ratified by the States in conventions. One of the people who’s on this list was one of the people who was assigned to investigate, in other words it says here, Solomon Payne, his name was, he was one of the largest sheep holders in Canterbury, Connecticut and getting a nice big tax write-off for that, and he was one of the people who was delegated to examine the new form of government made by the Convention in Philadelphia and show to this meeting their arguments and opinions therein. In other words, to inform the delegates who would attend the ratification convention of his views of the Constitution that had just been drafted.

Julie: So here we have Solomon Payne with his sheep. He had no clocks, for whatever it’s worth. He was one of the richest men in town, and he’s handed in his list of valuables to the lister, the tax lister who created this list. Now it’s in the summer, it was August actually, and here it is November, and he’s gathering information about the Constitutional Convention. It’s a modest little document, but it’s very rich in information about life in a particular time and place, at a particularly important moment in American history.

Jer Thorp: This podcast was recorded by Jer Thorp and produced by Greta Weber. We’d love to hear from you. If you have questions or ideas or any suggestions at all, please feel free to send me an email:

Jer Thorp: The music you’ve been listening to is a live recording of a concert called Calypso at Midnight. It was part of a concert series run by Alan Lomax, who had discovered that the venue, the Town Hall in New York City, could be rented cheaply at night. So he ran a series of concerts called the Midnight Special, which was thematically organized around different music types: Blues at Midnight, Ballads at Midnight, etc. This particular concert, the Calypso concert, may be the only existing recording we know of; it was found by Lomax’s daughter in a closet. This recording comes courtesy of the Association for Cultural Equity.

Written by

Jer Thorp is an artist, writer & teacher. He is Innovator-in-Residence at the Library of Congress. His book Living in Data will be published in 2020 by MCDxFSG.
