Digitized archives and stateless people: New windows on those usually invisible

Ismee Tames
Jun 10, 2022

Taking up the challenge to get a deeper insight into the experiences of stateless people in the era of the world wars means I need to find ways to look deeply into the available sources.

Since my focus is on people who were recognized by the “international community” of that era (the League of Nations, big NGOs like the Red Cross, etc.), I’ve made it a little easier for myself: at least there is a paper trail, even if it was probably not primarily created by the stateless people themselves.

Still, looking closely, it is possible to see the stateless themselves through the framework of, for instance, the Nansen Office and its representatives, who acted as a kind of consular aide to this group, as became clear in the case of Peter Tumanoff.

Digging into the topic, it quickly becomes clear that the League of Nations Archives contain thousands of documents about this group of Nansenists, and the Arolsen Archives hold many thousands more.

Part of a scan from the League of Nations Archives with information on identity certificates for Armenian refugees

I cannot read all these documents closely; that would take a lifetime. I could of course sample, or select documents based on some pre-defined criteria (and perhaps miss an opportunity to discover something new or exciting). I could now also experiment with handwritten text recognition software, which allows for automated transcription into machine-readable (and thus computer-searchable) text.

In the ‘ideal world’ I would run the software on the huge digitized collections from the League of Nations Archives or the Arolsen Archives and then start playing around with the results: filtering, organizing in various ways, thus getting a feel for the data that otherwise would remain hidden. This could help me make clusters of topics for example, or think in a more advanced way about sampling. The possibilities seem endless!
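To make "playing around with the results" a bit more concrete, here is a minimal sketch in Python of what filtering and organizing transcribed text could look like. It assumes the transcriptions are already available as plain text per document. The document ids, the sample sentences and the function names are all invented for illustration; they are not quotations from the League of Nations Archives or the Arolsen Archives, and this is not how any particular transcription tool works.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, invented transcriptions standing in for text recognition
# output. These are NOT quotations from any archive.
transcriptions = {
    "doc-001": "Identity certificate issued to an Armenian refugee in Marseille",
    "doc-002": "Request for a visa, forwarded by the Nansen Office representative",
    "doc-003": "Certificate of identity renewed; refugee resident in Paris",
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def search(docs, term):
    """Return the sorted ids of documents that contain the given word."""
    term = term.lower()
    return sorted(doc_id for doc_id, text in docs.items() if term in tokenize(text))


def group_by_terms(docs, terms):
    """Map each term of interest to the documents that mention it."""
    groups = defaultdict(list)
    for doc_id in sorted(docs):
        words = set(tokenize(docs[doc_id]))
        for term in terms:
            if term.lower() in words:
                groups[term].append(doc_id)
    return dict(groups)


def word_counts(docs):
    """Count how often each word occurs across all documents."""
    counts = Counter()
    for text in docs.values():
        counts.update(tokenize(text))
    return counts


if __name__ == "__main__":
    print(search(transcriptions, "certificate"))
    print(group_by_terms(transcriptions, ["refugee", "visa"]))
    print(word_counts(transcriptions).most_common(5))
```

Even something this simple gives a first feel for a collection: which documents mention a term, which terms tend to appear together, and which words dominate. Real transcriptions would of course contain recognition errors and several languages, so exact word matching like this is only a starting point.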

However, although wonderful tools like READ-COOP’s Transkribus are developing quickly, this is not (yet) possible. It is already ‘relatively easy’ (which is not the same as quick, or free of setbacks along the way!) to train a model that automatically transcribes handwriting in a certain language.

But it is still difficult to transcribe forms with multiple languages written and printed on them, in various handwritings, in columns and in the margins, etc. The screenshot presented above may give a modest indication of the kind of material that historians find in 20th-century archives and the challenges these scans bring.

Last Tuesday we discussed these and many related challenges in a workshop with colleagues from WARLUX, the University of Tampere and NIOD. This was a continuation of an earlier workshop last January.

These conversations will continue, since it is only through trial and error as researchers, information specialists and archivists that we move this process forward.

Would you like to link up with us? Follow this blog or find me on LinkedIn.

Ismee Tames

I’m a researcher interested in meaning making in times of crisis and violence. I study the recent past to focus my lens and get a clearer picture.