#DataDeepDive: Scripts & Languages of the Geniza

In this series, we take a deep dive into the Talk board tags to look at how volunteers classify the fragments. You can read an overview of our Talk board tags in the Sorting Phase Data review.

In our project, we ask volunteers to sort fragments into Hebrew and Arabic script, the two scripts most frequently found in the Geniza. But as many volunteers have found in sorting the Cairo Geniza, just because something is written in a script does not mean it is written in that language. Through the Talk board tags, knowledgeable volunteers have worked together to identify some of the languages (and other scripts!) that appear in the Geniza. The number following each tag here is the number of subjects with that tag, not the number of times the tag was used: sometimes the same tag was applied more than once to the same subject.
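The counting rule described above (distinct subjects per tag, rather than total tag uses) can be sketched roughly as follows. This is only an illustration: the function name `count_subjects_per_tag`, the `(subject_id, tag)` record layout, and the sample values are assumptions made for this example, not the project's actual export format or data.

```python
from collections import defaultdict


def count_subjects_per_tag(tag_records):
    """Return {tag: number of distinct subjects carrying that tag}.

    tag_records is an iterable of (subject_id, tag) pairs. This layout is
    hypothetical and chosen only for illustration.
    """
    subjects_by_tag = defaultdict(set)
    for subject_id, tag in tag_records:
        # A set ignores repeat uses of the same tag on the same subject.
        subjects_by_tag[tag].add(subject_id)
    return {tag: len(subjects) for tag, subjects in subjects_by_tag.items()}


# Made-up records, not real project data: subject 101 is tagged twice.
sample_records = [
    (101, "judeo-arabic"),
    (101, "judeo-arabic"),
    (102, "judeo-arabic"),
    (103, "aramaic"),
]

counts = count_subjects_per_tag(sample_records)
print(counts)
```

In this made-up sample, "judeo-arabic" is used three times but counted for only two subjects, which is the same distinction the counts in this post make.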

As noted in previous posts, this doesn’t mean that the subjects are definitively written in the script or language of the tag. Because we are looking at these tags outside the context of their conversations, volunteers may have been guessing or suggesting the script or language of the subject. As our content specialists review this list, we hope to confirm these counts and provide detailed listings of these languages across Geniza collections.


Hebrew Script

Aramaic (37) is a Semitic language that is cognate with Hebrew. Hebrew script is derived from the Aramaic alphabet. (See Judeo-Aramaic for explanation.)

Subject 11583670: Halper 121, University of Pennsylvania, Herbert D. Katz Center for Advanced Judaic Studies Library, Cairo Genizah Collection

Judeo-Arabic (167) means the subject features Arabic text written in Hebrew script.

https://www.zooniverse.org/projects/judaicadh/scribes-of-the-cairo-geniza/talk/subjects/21708003

Judeo-Aramaic (20) means the subject features Aramaic text written in Hebrew script. (By the Middle Ages, Aramaic text would usually be written in Hebrew script — so scholars wouldn’t use this term. We’ve separated the two tags here to reflect volunteer input.)

Subject 12504187: ENA 2616, Library of the Jewish Theological Seminary

Judeo-Persian (15) means the subject features Persian text written in Hebrew script.

Subject 11609784: ENA 2627, Library of the Jewish Theological Seminary

Volunteers who could read Hebrew often tagged subjects as Judeo_Something (33), meaning a volunteer did not identify the language but suspected it was not Hebrew based on what they could translate.

Subject 21709071: MS T-S J1.50, Genizah Research Unit, Cambridge University Library

Ladino (1), also known as Judeo-Spanish, is a language popular among Sephardic Jews. The tag was used three times in our project, but in context, the conversations made it clear that the subject was not Ladino. However, we did find at least one example of Ladino in the project so far that was not tagged:

Subject 12602955: ENA NS 37, Library of the Jewish Theological Seminary

Other Scripts

While Hebrew script tags were frequent, volunteers also found non-Hebrew scripts in the project.

The Coptic (2) alphabet was first used for the Egyptian language; it’s still used in Coptic liturgy today.

Subject 11584410: Halper 470, University of Pennsylvania, Herbert D. Katz Center for Advanced Judaic Studies Library, Cairo Genizah Collection

While we often find Hebrew script used for other languages, we had at least one case where Arabic script was used for a different language, tagged nonarabiclanguage (1). One of our moderators identified the Arabic script in Subject 12511220 as Ottoman Turkish.

Subject 12511220: ENA 3904, Library of the Jewish Theological Seminary

In cases where multiple scripts or languages were found on a subject, volunteers used the tag mixed_languages (132). Subject 21953297 (below) has both Hebrew and Arabic script.

Subject 21953297: TS 16.135, Genizah Research Unit, Cambridge University Library

Volunteers have also found subjects that feature neither Hebrew nor Arabic script. Those subjects are out of scope for the transcription phase of this project, but they are important to note as evidence of the diversity of languages and scripts within Geniza collections.

Volunteers have found Roman/Latin script, using tags like English (21), Italian (8), and Latin_script (7).

From left to right: English text on Subject 11584256: Halper 408, University of Pennsylvania, Herbert D. Katz Center for Advanced Judaic Studies Library, Cairo Genizah Collection; Italian text on Subject 12502384: ENA 2420, Library of the Jewish Theological Seminary; Latin script on Subject 12602705: ENA NS 31, Library of the Jewish Theological Seminary

In the sorting phase, volunteers identified at least one fragment with Cyrillic script, tagged with Russian (1) and Georgian (1); more have been found in the University of Manchester Library’s collection. Subject 12602863 features Russian, Georgian, and Arabic text, as well as a striking scene of a judge on its verso.

Subject 12602863: ENA NS 35, Library of the Jewish Theological Seminary

Subjects written in the Greek alphabet were also tagged as Greek (6). For Subject 21952244, a volunteer suggested that the Greek characters are likely magical gibberish, used to impress amulet customers.

Subject 21952244: T-S 12.207, Genizah Research Unit, Cambridge University Library

👉 Read more Talk conversations or start your own by participating in Scribes of the Cairo Geniza on Zooniverse!