On Humane Tech: Saving Ukrainian Cultural Heritage Online with guest Quinn Dombrowski.
The Lincoln Center for Applied Ethics hosts a weekly conversation “On Humane Tech,” highlighting relevant news in a conversational format with our team. Each week the topic changes, but one thing stays the same — we want to hear from you. Respond to our conversation below.
This week’s topic: Saving Ukrainian Cultural Heritage Online with guest Quinn Dombrowski.
Our team is honored to speak with Quinn Dombrowski, an Academic Technology Specialist in the Division of Literatures, Cultures, and Languages, and in the Library, at Stanford University. Dombrowski is a part of a recent project titled SUCHO, or Saving Ukrainian Cultural Heritage Online. SUCHO is composed of more than 1,300 cultural heritage professionals — librarians, archivists, researchers, programmers — working together to identify and archive at-risk sites, digital content, and data in Ukrainian cultural heritage institutions while the country is under attack.
Elizabeth Grumbach, Program Manager for Digital Humanities and Research: Thank you for being here, Quinn. Could you talk to us about the work that the Saving Ukrainian Cultural Heritage Online (SUCHO) project is doing and why it’s so important to preserve cultural artifacts?
Quinn Dombrowski: This all started on Twitter, as these things sometimes do. Anna Kijas, who’s the music librarian at Tufts University, posted in the first weekend of the war about wanting to do a data rescue event around Ukrainian music collections. She realized that there are significant music collections in Ukrainian museums and libraries. It’s easy to forget the Internet is physical things: servers, cables, power, cooling. That means that the Internet is as much at risk as any physical object. Servers can be destroyed, power can be knocked out, and these servers may or may not come back once they go down.
Anna was focused on the music collections, because there was a musicology conference that she was attending the following week. But she posted the tweet and I saw it, and Sebastian Majstorovic from the Austrian Center for Digital Humanities and Cultural Heritage also saw it, and had some suggestions of tools we might use. We all wanted to move a little bit more quickly than Anna initially envisioned, because we didn’t really know what the state of things would be a week down the road. And so we and a couple other colleagues met on Monday, and on Tuesday, March 1 we launched SUCHO: making the website as we were writing the tutorials, as we were setting up the Slack, as we were putting out the call for volunteers. We had 400 people by the next day, and what those people have been working on ever since has been focused on web archiving for cultural heritage institutions. We use a couple different tools, including the Internet Archive’s Wayback Machine. Our volunteers make sure that the sites that the Wayback Machine has captured include content down to the sub-sub-pages where the content actually lives.
But we’re also using the free open source WebRecorder software that lets anyone make web archives from their own computers. And this is really important, because early in the project there was a power outage at the Internet Archive and that took down the Wayback Machine for a whole morning. Usually this wouldn’t be a big problem — go get lunch, have some coffee and get back to this work later, but when you’re in a war you can’t wait on Northern California power companies to get their act together.
By having a distributed approach the work never needs to stop. People are using the command-line Browsertrix crawler. Other people are using browser plugins for more complex interactive sites. There’s also a cloud version that makes it as easy as filling out a web form to archive a site. And that’s what everyone, from retirees to my eight-year-old, has been doing to be able to help with this effort.
We defined cultural heritage really broadly: not only libraries, archives, and museums, but we’ve expanded over time to include things like children’s after-school programs, and a children’s railroad where teenagers can learn how to be trained engineers with a beautiful 3D tour of the facility. Even things like fan fiction that are ultimately expressions of culture and creativity filtered through international multimedia franchises.
It’s been made clear through statements from the Kremlin that cultural heritage is perhaps at the heart of this war. This war is about who gets to have a country, who gets to have a language, who gets to have a national identity. It’s Russia’s assertion that the Ukrainian people cannot legitimately have those things, and so we want to make sure that what the Ukrainians have presented to the world as manifestations of Ukrainian cultural heritage can be kept safe and we can give the data back to people who work in cultural heritage, when the time comes to rebuild.
Gaymon Bennett, Associate Director: Could you say a little bit about what the method looks like and how you find what people are going to record, and then how does that work get parsed out to the network of people helping preserve these artifacts?
Dombrowski: It’s the world’s biggest spreadsheet. We live and die by our giant Google Doc spreadsheet that has just sagged under the weight of 80+ simultaneous editors at times. We started off with, and still have, a link submission form where anyone can submit a link to archive. That was really good in the beginning; it helped us quickly get the kind of sites that people are most familiar with.
But then over time we’ve branched out in different ways. We pulled in everything WikiData has for libraries, archives, and museums. That turned up a lot of links that were out of date, because there had been a one-time hackathon to upload things, and unmaintained WordPress sites get hacked and turn into casinos, things like that. We have actually improved the data, pushing updates back to WikiData.
We have a situation monitoring group that tracks the air raid alerts and other urgent warnings for people on the ground. We use that to dynamically reprioritize our work to focus on areas actively under attack. That process has also led to a group of people who virtually walk through the streets of cities under siege using Google Maps, looking for the cultural heritage icon. In that way they’re turning up museums that we didn’t know about, and when they check to see if those museums have a website, in many cases they do. We’ve also gone down rabbit holes like a platform for providing free websites to educational institutions (defined broadly), which is how we found some children’s music and dance schools, and children’s libraries with children’s poetry. There are also many churches and monasteries that have information-rich sites, showing that angle of cultural heritage. There are hundreds of those church sites, but unfortunately they’re CloudFlare protected and they blocked our cloud crawler, meaning that people had to capture them on their own laptops using the command-line Browsertrix software or manually using the WebRecorder plugin.
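The reprioritization described above (moving sites in areas under active alert to the front of the queue) can be illustrated with a minimal Python sketch. This is not SUCHO's actual tooling: the `SiteRow` fields, the `reprioritize` function, the region names used as keys, and the example URLs are all hypothetical, and the real workflow ran through a shared spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class SiteRow:
    """One hypothetical spreadsheet row; the field names are assumptions."""
    url: str
    region: str
    archived: bool


def reprioritize(rows, regions_under_alert):
    """Return unarchived rows, with sites in regions under alert first.

    sorted() is stable, so rows keep their spreadsheet order within each group.
    """
    pending = [row for row in rows if not row.archived]
    # False sorts before True, so rows whose region IS under alert come first.
    return sorted(pending, key=lambda row: row.region not in regions_under_alert)


# Placeholder rows; the URLs are example domains, not real institutions.
rows = [
    SiteRow("https://museum.example/", "Lviv", archived=False),
    SiteRow("https://library.example/", "Kharkiv", archived=False),
    SiteRow("https://archive.example/", "Kharkiv", archived=True),
    SiteRow("https://school.example/", "Odesa", archived=False),
]

queue = reprioritize(rows, regions_under_alert={"Kharkiv", "Odesa"})
print([row.url for row in queue])
```

In this sketch, already-archived sites drop out of the queue entirely, and the set of regions under alert would be refreshed each time the situation monitoring group reports a change.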
Grumbach: Can you talk about the importance of these smaller websites, which represent moments in time that are integral to national identity and cultural heritage? Why is it so scary that these sites are at risk?
Dombrowski: Those are my favorites. I’ll confess there are other people who are here for the digitized historic documents, and there are people who are here for the art. I am here for the fanfic and the children’s after-school programs and the random museums for things. It’s those things that capture everyday life that speak to me. You see this also in the museum sites and children’s library sites that have photos of events, things like the Christmas pageants, back-to-school day, and world embroidery day where all the kids show up in their embroidered shirts.
This is cultural heritage in a real, living way. This is how people engage with and build and contribute to the next generation of their culture. One of my favorite sites, which I just happened to crawl but have checked back on a few times, was updating their blog daily up until the war started. One of their last posts was a children’s art project called “I’m Ukrainian and that sounds proud” and the kids were painting sunflowers and hearts and maps of Ukraine. There were pictures of children with this art, or, at least, there were when I first archived it. As time went on, I kept checking back on the site. This was in the city that made the news because of an ammonia leak, maybe a month or two ago, and it was scary thinking about these kids in the pictures and hoping that they were safe somewhere. As I kept watching this site, I noticed that they deleted the posts with patriotic art, I think to try to not become more of a target. I hope they had a backup of the photos. These were born-digital images, so it’s not like there’s a physical photo out there for which the digital copy is a surrogate. Digital is all there is. But we have a snapshot of the site before the posts were deleted, which means we still have those photos. We’d like to give them back if the site owners don’t have a backup copy themselves.
Bennett: Does a certain amount of this work require interacting with people in Ukraine?
Dombrowski: You would imagine so, but we mostly work with public sites. So, if they’re online, we can capture them. We have been trying to be in touch with people on the ground in Ukraine. It’s been hard. Especially people on the ground who are in a position to preserve physical objects — they’re the only ones who can do that, and that tends to be their priority more than the digital.
We and our partners, including at the University of Alberta, have been able to offer free unlimited storage to Ukrainians, including cultural heritage organizations. Some of them are having people upload their internal documents and things that don’t exist on the public-facing sites.
But one of the upsides of this kind of public website work is that it can be done autonomously, without necessarily a lot of work on the other end. There was one morning when I woke up to an email from a sysadmin in Kyiv who had gone to our spreadsheet and seen that his site was down. He had gone in and gotten it back up, and he emailed us to ask: could we archive it quickly, please, before it goes down again? And we were able to.
Bennett: Can you say a little bit about what it feels like to do this work? I know that, like many others, I am an insatiable consumer of news around this stuff, but it feels removed in a certain way, so I wonder if you could say just a little bit about what it’s like to be involved.
Dombrowski: It’s great because it means I don’t have time to think or watch the news. I sort of watch the news through our situation monitoring, so hour by hour I’m vaguely aware of the air raid alerts, but before getting involved with this, it was just doom scrolling and feeling miserable. And I expect at some point, the weight of all of this is going to come crashing down, but I have been keeping busy enough that there’s been no time for that. We’re trying to do everything all the time and that has itself been an antidote. A lot of volunteers have mentioned they’ve donated to refugees but they still sort of had that doom scrolling impulse and that was sucking them down into a bad place. With archiving sites, there’s always more to do and that’s given them some kind of outlet.
We have a couple new sub-projects in the works. One of them is to get scanners and equipment to cultural heritage institutions that want to digitize more of their collections. Amazon Poland has expressed interest in being able to send organizations those things, and so we’re trying to coordinate that work. But one of our volunteers on the ground suggested we get a scanning station set up as part of refugee resource centers, places where people get food and medicine. What if there was also a scanning station where they could scan their own documents so they have a backup copy? And if they wanted to, they could share one of those things with us, something like “the first or last thing that I grabbed from my apartment before I left.” So that is still in the early stages, but something that we’re working on.
Erica O’Neil, Research and Project Manager: Would you please talk a little bit about how people can get involved and support the project?
Dombrowski: At this point we’re trying to balance the projects that we have versus the volunteers. For a while, we were largely at capacity for the archiving work itself, but there was still more work to be done on metadata for files that we had scraped from sites and uploaded to the Internet Archive. Then we got a list of every website registered in the Ukrainian namespace, which yielded thousands more cultural heritage sites. So now we’re looking to scale up the web archiving team again.
We’re starting this digitization project, and trying to put together Omeka galleries with people in Ukraine to highlight their collections and make it so that they can raise money for their own institutions. There’s also a Meme Team that is tracking and documenting memes about the war across multiple languages. Because our needs are always changing, we have a web form on our site for people to sign up to volunteer, which lists the current activities. We would love to just say, “Everyone, come help!” but the time to wrangle all of the new volunteers is also something that we’re juggling. If people sign up via the form (https://docs.google.com/forms/d/e/1FAIpQLSc6KbhtEOI8zKsQmKT_waE1XlYEF1E6t-HzJ7Gc1EBfMvMg_A/viewform), we’ll be adding people to projects as opportunities present themselves.
Grumbach: Thank you so much, Quinn. Do you have any final thoughts you’d like to share with us?
Dombrowski: SUCHO is inspiring in the way that people going out to the main plaza in Kyiv with beer bottles and hand sanitizer to make Molotov cocktails is inspiring. It’s a great story, but it also reflects a failure of larger infrastructure. And the message that we’re trying to send to cultural heritage institutions coming out of this is that there should not be any more SUCHOs, at least not at this scale. These institutions need to do more planning together around preemptive web archiving so that the next time there’s a war or natural disaster, there’s a basis of things that we know are safe and there will be less emergency labor that has to be thrown together at the last minute. We think it’s possible, even though these organizations move slowly. Maybe not in time for the next cultural heritage crisis, but perhaps the one after that.