At scale, at large: Wikipedia & Libraries

Phoebe Ayers
Aug 20, 2016

One of the great organizing principles of libraries is that they collaborate: they work together on sharing collections, expertise, best practices. As a result, library organizations have formed — umbrellas that provide libraries and librarians with shared resources, ideas, and ways to meet each other.

Wikipedia, or rather the universe of Wikimedia projects, also serves as a loosely federated system, both online and in real life. In real life, there are 41 in-person chapters, each set up to serve a geographic area; there are thematic organizations meant to serve a particular type of content (Wiki Education Foundation, Wiki Project Med); and there are dozens and dozens of user groups that do everything from organizing meetups to running photography contests. Online, Wikipedia is also federated: under the umbrella of a single website run by the Wikimedia Foundation, there are hundreds of pockets of people who work on various aspects of keeping the great lumbering 40-million-page encyclopedia rolling. There are people who review articles, who write bots, who fix citations, who translate, who welcome new editors, who develop policies, and everything in between.

However, very little of this in-person and online work is centrally coordinated, which is why “collaborating with Wikipedia” is a challenge — you have to identify what exactly you might want to do and what the best structure for doing it is, then identify the other people working in that particular space to collaborate with, then boldly do what you want to do. Coordination arises organically — a page is set up, people sign on to it, ideas are hashed out on mailing lists, and initiatives are begun, documented, taken up by others, refactored, forgotten and taken up again. One problem is, given its great scale, all of this is hard to find. Wikipedia is open to everyone, but it is far from transparent in its inner workings.

But it is (as I’ve argued) crucial for libraries to work with Wikipedia: it represents information access for great swaths of people, and its collaborative model has a reach the likes of which we’ve never seen before. That is why this week’s workshop with members and directors of ARL libraries and librarian-Wikipedians was so exciting.

How do some of the biggest research libraries in the world collaborate with Wikipedia? How do librarians and libraries explain Wikipedia and contribute to it productively? How do they improve the biggest reference source ever to exist? Talking about these questions in a room full of dedicated people who have worked in both libraries and Wikipedia felt like a dream come true for me, as I’ve been pushing for so long in so many forums for librarian-Wikipedian cooperation. By the end it felt like we’d identified a few challenges, themes and paths forward:

  • Mutual understanding of workflows and culture is a challenge. People who have never worked on Wikipedia or a similar large, open source project look for a point of central coordination that doesn’t exist. They might be turned off or hurt by people’s blunt, critical, text-only discussion style. They rightly question how Wikipedia works and the often slightly odd norms that have evolved, then feel like those who do know the system and try to explain it are being defensive and unreceptive to criticism (because there is so, so much to explain). They might be confused or turned off by the explicit valuation of only certain types of expertise: a lot of experience inside the system (e.g., edit count) gets you recognized in Wikipedia, but real-world experience is explicitly, deliberately ignored for both philosophical and practical reasons. Conversely, people who work on Wikipedia a great deal but aren’t librarians often do have many of the skills of librarians (a hard-core Wikipedian spends their day researching, and is as critical and knowledgeable of sources and copyright as anyone I know) but probably not the formal training. Wikipedians who want to work with libraries may not have a full grasp of the kind of careful scaffolding that librarians tend towards in their projects, the endless committee work that leads to bureaucracy but also to a stable product, because librarians know that we are in this for the long haul, and systems need to be set up that are stable for the public and for the library’s long-term mission. They might miss the orientation towards public service, the understanding and desire to influence the entire world of information generation (not just Wikipedia), and they don’t have the shared formal and informal education and common professional organizations that librarians take for granted.
  • And this culture clash is exacerbated because explaining how Wikipedia works, or how libraries work, can take a lifetime and still be inconclusive — we are loosely federated, there are many different kinds of institutions and work, and people don’t look at the same data and agree. Both Wikipedia and libraries are exceptionally complicated and hard to define or pin down. (As my colleagues no doubt realized with dismay, I could talk for the rest of my life about how Wikipedia works and not be finished). The only real way to go forward is simply to accept that there’s not a single point of contact, accept that the system is open for new initiatives, and to try to identify the things that you want to do and where to do them.
  • Working in an open project — online, with many strangers from around the internet who you do not know and who may not understand where you are coming from — is not the same as working for a single organization. Beyond simple culture clash in discussion, Wikipedia, like the rest of the open internet, is not immune from people harassing other participants, and also not immune from heated controversy around both topic matter and internal procedures. Though the project as a whole espouses inclusivity and being kind to newcomers, there are plenty of participants who are neither inclusive nor kind — just as there are plenty who are. And so each contributor’s experience on Wikipedia may be different, depending on who else they encounter and what they work on. Women in particular may feel more vulnerable to harassment. Part of what we discussed at the workshop is what it means to ask library staff to work in such an environment (and whether librarians could be a positive force for changing this environment); for Wikipedians who are working with GLAM institutions everywhere, I think this is an aspect we can work on conveying and offering guidance on. It’s a truly hard problem, and not one that any open platform has really solved; but being clear-eyed about what is likely and what is possible for participants to encounter (and how best to deal with it) is important.
  • Training matters. Because of all of the above, training for librarians and library staff in Wikipedia (and perhaps also training for forward-facing Wikipedians in how libraries work) is imperative. We discussed the various ways that such training could work: sponsored by a central library organization, peer-to-peer, or via a network of fellows and residents (as has already been done in many institutions). And we discussed working with other library organizations and initiatives as a way to reach even larger numbers. For instance, Merrilee Proffitt, Sharon Streams and collaborators at OCLC and WebJunction are already working on a training program aimed at public librarians, funded by a recent Knight Foundation grant.
  • There was also talk of building a cohort of Wikipedians-in-residence at similar institutions (and bringing them together to share ideas and training), which is an idea I love. An idea of my own: let’s go further and build in even more funded fellowships for new librarians and postdocs in our institutions, as we do in many other areas, and have them work on open content generally — identifying the ways that institutions can open up their resources of all kinds to the world, on projects including but not limited to Wikipedia.
  • Reaching out: training is not just internal but must be external too. Academic libraries see themselves as the gateway to and holder of information for the institution, but also as playing a central role in education. Given that, we should systematically improve how we train our patrons (our students, faculty and staff) in working with Wikipedia. The best idea on this front was to collaborate more with existing Wikipedia education initiatives, including the Wiki Education Foundation, to bring more information literacy and pedagogy that fits with the instruction that librarians do (from the one-shot to the libguide to the embedded model) into the mix.
  • Data matters too: libraries and library organizations steward a vast amount of data about the world, from data sets we curate to the bibliographic data that is our bread and butter, listings of books and articles and other material. Wikipedia and Wikidata, in comparison, are just getting started, and there are more citation and metadata problems than you can imagine (or at least than I can imagine without getting a stiff drink). There are already many projects large and small, on Wikidata and off, to work on this. Let’s team up.
  • Working on articles: for heaven’s sake, can we just improve the articles about libraries on Wikipedia? (This idea from the brilliant Merrilee Proffitt). Other professional organizations, from psychologists to microbiologists, have done concerted drives to improve content in their areas, and they’ve done it through a fairly simple formula: they have held trainings at professional meetings, sent out a call for volunteers through professional organization channels, perhaps gotten volunteers or Wikipedians in residence to help out, and set up an on-wiki project. We can absolutely do the same, and the best part is the project is already set up: WikiProject Libraries needs some love.
  • There was also a good deal of talk about focusing editing initiatives on diversity and inclusion, including articles about underrepresented groups and minorities (for instance, ideas included an editing campaign around First Nations people, a woefully under-covered area). This focus is also tied to Wikipedia’s gap in both gender and underrepresented minority contributors; libraries can help bring more people in.
  • Openness everywhere: Wikipedia rests on a fragile framework of technology, community and policy. And to make Wikipedia really the great information source it aspires to be, we need open information all the way through the publishing stack, from textbooks to specialized encyclopedias to scientific research to public data; it is not enough to just rely on the same closed publishing models, an idea Megan Wacha brought to the fore. In libraries, we need to recognize that open projects generally deserve our attention and support: not all collections are suited for Wikipedia, but they may be suited for something like DPLA, Europeana or another framework (and there does need to be another framework for institutions not in the US or Europe; this is one of the next great digitization challenges). In my opinion, openness everywhere, and the associated policy and legal challenges, need to be front and center in the minds of large research libraries, and Wikipedia helps teach us why it’s important. We all need to organize to preempt bad laws, and to advocate for users when it comes to the next great copyright act and beyond. It’s a natural partnership: library organizations have long experience and moral clout, and Wikipedia has hundreds of millions of readers who depend on it and who can be brought to care about these issues.
  • Do organizations matter, and how should we best organize to do this work? I feel like this was an unspoken sub-theme of the meeting, and it came to life in discussions about what kind of an initiative this is and how to fund it, from the lightweight model (see above) to the heavy-weight (Wiki Project Med’s specialized nonprofit). Most things, including organizations, are better off being lightweight to begin with, but more in-person gatherings, follow-up, information sharing, and a dedicated person to help organize the information about the many, many projects that are already happening are crucial. There are hundreds of inspiring projects already going on, and it’s not that there’s not a place to find them; there are simply too many places. Coordination and documentation is work that’s often either overlooked or simply not done by busy volunteers who want to get on with the work of holding the event or writing the articles, so this is surely an area where organization can help (and where we can learn lessons from other organizations, where having just a single coordinator can serve as a bottleneck or point of failure). But shared, up-to-date documentation is crucial so new libraries coming in can have a place to find ideas, inspiration and peers.

So to recap: we need shared librarian training, educational coordination initiatives, imaginative linked data projects to improve everyone’s metadata quality, improving our area of professional expertise on Wikipedia itself (as a testbed and an end in itself), working together on open advocacy and figuring out our shared policy challenges, and setting up a lightweight coordination framework to keep those doing this work in touch.

There are areas I wished we’d talked about more, public policy and multilingualism among them, but this is not too bad for a day and a half, I’d say!
