Crowdsourcing in the 21st Century Library, Museum and Archive

Nov 18, 2015

Early this month, the New York Public Library unveiled a new project called Emigrant City, built around the premise that important currents of New York City history are buried in a trove of bond and mortgage records from The Emigrant Savings Bank during the years 1841–1933.

There’s a unique twist to this project, however. In order to make sense of these newly digitized collections, the Library needs help from the public. It is using a microsite to recruit citizen volunteers to identify, transcribe, tag and otherwise process this vast trove of data.

Inviting regular people to participate in a significant project like this certainly represents an innovative move for this storied institution — and it is a prime example of how institutions like this one can use “crowdsourcing” to improve the quality of their collections.

“Crowdsourcing” is defined by Merriam-Webster as “obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers.”

As the social web knits diverse populations together, public institutions are increasingly turning to the public to solicit their ideas and contributions on any number of different projects.

The act of “crowdsourcing” in museums, libraries and archives therefore represents a unique opportunity for some of these venerable institutions to evolve their missions into dynamic digital form and, in doing so, build new relationships with the public.

Picture by Senior Airman Joshua Strang, via Flickr

How Do Libraries, Museums and Archives use Crowdsourcing?

Contemporary social media has advanced the use of “crowdsourcing” as a means of soliciting public opinion and comment on subjects that range from conflicts abroad to new types of Oreo cookies.

In libraries, museums and archives, crowdsourcing can take a number of diverse forms including:

Mapping

Aurorasaurus.org uses Twitter to map sightings of the aurora borealis. Led by Liz MacDonald of NASA, the project was formed when scientists studying the phenomenon noticed clusters of sightings on Twitter as far south as Alabama just after a strong geomagnetic storm in 2011. Using the work of citizen scientists and even some non-participants, Aurorasaurus makes it easier to track the appearance of the Northern Lights.

Transcription

Emigrant City is not the New York Public Library’s first foray into collaborative transcription, only its latest. An earlier project, “What’s On the Menu?”, asked users to transcribe historical restaurant menus. The product of this crowdsourced transcription has become data files that anyone can use for a diverse array of projects.

Identification

The “Zooniverse” is a network of collaborative volunteer research projects, including “Wildebeest Watch,” which asks users to identify the movement patterns of wildebeest.

Contribution

The 9/11 Memorial Museum’s collection includes more than 1,990 oral histories recorded in collaboration with StoryCorps. These testimonies from regular folk who were affected by 9/11 add a dynamic dimension to the memorial site.

The Value of Open Data

For institutions like the New York Public Library, the value of crowdsourcing is simple: citizen volunteers can help to make the processing of large amounts of data more manageable.

Many institutions are also finding that there is tremendous value in opening up the data they compile back to the public.

Trevor Muñoz, Associate Director of MITH and Assistant Dean for Digital Humanities Research at the University of Maryland Libraries, has used data compiled from the “What’s on the Menu?” project to teach students about digital humanities data curation, and has written about his process extensively here, here and here.

He says: “I think that the What’s on the Menu project showed was that that’s all great if you can do something fancy down the line but maybe you can just dump out the database into a series of spreadsheets and let people download them from your website and then people will go off and do things with the data.”

Over at the Cooper Hewitt, Micah Walter explains the value of opening data to the public and to interested developers, below:

Fostering a Full Range of Voices

As institutions look to use crowdsourcing to improve their collections, the questions of “who’s participating?” and “what do these participants find notable?” are bound to arise.

Alice Backer started Afrocrowd to address the lack of diverse voices on Wikipedia. Why is diverse representation so important in crowdsourcing initiatives? She explains why in the clip below:

A recent edit-a-thon was held in partnership with the Museum of Modern Art, where participants were tasked with boosting the amount and quality of content dedicated to black artists such as Jean-Michel Basquiat.

Afrocrowd further underscores the need to document languages spoken by small and geographically isolated communities, as well as to expand the definition of what is “notable” within the boundaries of projects like Wikipedia.

Making Crowdsourcing a Simple Proposition

In resource-strapped institutions, the mere thought of corralling internal resources for a new technology project can seem daunting. Add thousands of potential volunteer contributors to the mix and you can see why institutions might balk at the prospect.

In 2008, Mary Flanagan was given a grant to develop a technology that can support crowdsourcing in museums. The result was Metadata Games, an open source platform that turns the task of tagging photographs and other collections data into a game for users. Currently, Metadata Games is being used by the British Library, Boston Public Library, The Open Parks Network, Digital Public Library of America, and the American Antiquarian Society, among others.

For many institutions, the desire to crowdsource is entirely consistent with a dedication to serving the public. Technology is just making the possibilities increasingly accessible.

The Crowd Consortium for Libraries and Archives is dedicated to uniting leading experts in a conversation about crowdsourcing best practices. For more on the project, including case studies, visit our website at crowdconsortium.org.