Hey, Check Out This Language! — #3, Romani, a Balkan but also Indian Language

Kevin Sun
Sun Language Theories
Jan 12, 2018
Distribution of Romani in Europe — wheel size=absolute population, shading=population in proportion to national population (source: Wikimedia Commons)

I’m currently making plans to attend the Polyglot Gathering in Bratislava, Slovakia, this June (as I mentioned after attending a similar event in Montreal last summer). As part of my preparation for the conference, I’ve been slowly familiarizing myself with Slovak, which isn’t too hard given the other Slavic languages I know.

I’ve also decided to brush up on my German and Hungarian, languages that were historically widely spoken in Bratislava, but not so much anymore.

Since it’s basically in the same region as well, I’ve decided that my Serbo-Croatian could use a bit of a tune-up too.

And I guess it wouldn’t hurt to brush up on some of the finer points of Russian either. And review Romanian. And a bit of Albanian. And… basically the Polyglot Gathering has become one big excuse for me to revisit all the languages of East, Central and Southeast Europe that I’ve ever studied or dabbled in before.

At the same time, this focus on Eastern Europe is also giving me a great reason to look at a brand new language that I hadn’t studied in the past, which will also be the subject of my next post in this series on less-studied languages: Romani, a Balkan language that also happens to be an Indian language.

A Balkan Indian language? How did that happen?

Well, it’s complicated. Many aspects of the history of the Roma people (the people who speak Romani, also known as “Gypsies”, though the term is falling out of favor) are still unclear, as is their migration through the Middle East to Europe. However, linguistic evidence clearly points to an origin somewhere in Northwest India. From there, the Roma first migrated to Iran, then Turkey, and eventually reached Europe in the late Middle Ages.

Other groups along this migration route, like the Lomavren-speaking Lom people in Armenia and the Domari-speaking Dom people in the Middle East and North Africa, also speak Indo-Aryan languages and seem to share a common origin with the Roma. (For one thing, the words Lom, Dom, and Roma clearly seem related to each other.)

The Roma have had a bit of a rough time ever since arriving in Europe. Their nomadic lifestyle has led to marginalization, ghettoization and criminalization in many European countries. They were enslaved for centuries in Romania, they were targeted for genocide by Nazi Germany, and many are stuck in impoverished slums throughout Eastern Europe to this day. Even the USA has had a bit of recent Romani-related controversy:

Screenshot of a Fox News segment from Summer 2017

What Does Romani Sound Like?

I’ll list more resources for listening to Romani further down, but first, here are a few songs:

The first song is the unofficial “Romani anthem” Gelem gelem (or Djelem djelem or Jelem jelem etc., depending on the dialect):

(The lyrics and an English translation can also be found on Wikipedia.)

Gelem means “I went”, and is interestingly very similar to the Bengali word for “I went” — gelam/গেলাম.

The second song is Sila kale bal, performed here by the Serbian-Romani singer Saban Bajramovic and the Bosnian folk band Mostar Sevdah Reunion:

(The full lyrics and translation can be found at GypsyLyrics.com)

Sila kale bal means “She has black hair”, and interestingly enough the Hindi for “black hair” is exactly the same — kaale baal/काले बाल.

So Romani Seems to Have a Lot in Common With Indian Languages, Right?

Yes, and as shown by Gelem gelem and Sila kale bal above, sometimes the similarities are really obvious. More frequently, Romani words and phrases will clearly have the same origin as ones in Indian languages, but with a noticeable divergence in pronunciation or grammar.

Out of all Indian languages, Romani seems to have the most in common with northwest Indian languages like Punjabi or Rajasthani, which makes sense given the historical origins of the Roma people. Since the Indian languages I know most about are Hindi/Urdu, Punjabi and Bengali (and a little bit of Marathi), I’ll mostly use examples from those languages.

One common feature of all Romani dialects is that they’ve simplified the Indian consonant system a bit — instead of a four-way distinction such as k/kh/g/gh, Romani has gotten rid of the voiced aspirated gh, changing it to kh while original kh often becomes x instead (like the “ch” in “Bach”).

Something similar happens in a few north-western Indo-Aryan languages. For example, the Hindi word for “house” — ghar — becomes kher in Romani, and it becomes khaar in Hindko, a “Punjabi dialect” spoken in and around otherwise Pashto-speaking areas of Pakistan. (Kashmiri also drops the aspiration but keeps the voicing, resulting in gara.)

(I’ve described the Hindi consonant system in detail previously, in my article about Shanghainese.)

Romani has kept two grammatical genders, masculine and feminine, which is fewer than Sanskrit, Marathi, Gujarati or Sindhi (three), but more than Bengali (just one) and the same number as Punjabi and Hindi/Urdu. The common Romani masculine ending is -o, which is the same as in Sindhi and Gujarati but unlike the -a ending of Punjabi or Hindi.

Romani also has a two-case declension system, with direct and oblique cases (and a less important vocative case), which seems most similar to that of Hindi/Urdu and Punjabi.

Example of Romani case system from “A Handbook of Vlax Romani”

In some other ways, Romani seems to have a bit more in common with Eastern Indo-Aryan languages like Bengali too, although it might just be that both languages simplified in similar ways independently. I’ve already mentioned the verb form gelem/gelam (“I went”) above, but more generally, past-tense forms marked by an “L” sound are common to both Romani and Bengali (and Marathi) but don’t appear in Hindi/Urdu or Punjabi. (At the same time, Romani makes a distinction between transitive and intransitive verbs in the past tense (a remnant of Hindi-style split ergativity?), which Bengali no longer has.)

Other similarities between Bengali and Romani include the fact that both have lost vowel length distinctions, as well as the distinction between indicative and subjunctive verb forms (although Romani did pick up a new Balkan-style subjunctive in Europe later on).

How Have European Languages Influenced Romani?

Just look at this chart for a second:

This chart shows that in Slavic areas, Romani borrowed words for “then”, “still” and “already” etc. from Slavic languages, while in Romanian areas they borrowed them from Romanian, and likewise for Hungarian, Turkish and German.

As you might expect, Romani has picked up lots of words like these from the local languages in countries where the Roma have settled, and every dialect has picked up different local influences. Sometimes these local influences also extend to grammar and pronunciation; for example, some Polish and Russian Romani dialects have picked up the “y” or “ы” sound from those languages, and dialects in some German and Slovak areas have developed a new infinitive-like verb form, while most Balkan dialects don’t have one.

Two European languages in particular have had a major influence on Romani from early on.

First of all, the Roma apparently hung around in Greek-speaking areas (modern Greece and/or historical Byzantine lands in modern Turkey) for a significant period of time before dispersing across Europe. This is why you’ll find words like drom for “road” and kokalo for “bone” that are of Greek origin but appear in Romani dialects all over Europe. Another striking sign of early Greek influence is that all Romani dialects now use the Greek words efta, oxto, and enya for the numbers seven, eight and nine! (English speakers might recognize these numbers better in their ancient forms: hepta-, octo-, and ennea-.) And grammatically, the fact that Romani now has definite articles (“the”, o/e/la/le in the noun declension chart above) is likely due to Greek influence as well.

Secondly, because of the long period of enslavement in Romania, many Balkan Romani dialects have been heavily influenced by Romanian, even if they are now spoken in non-Romanian areas like Serbia and Hungary. One noticeable result of this is that a lot of Romani words now form their plurals with the ending -urya, which comes from the Romanian plural ending -uri.

How Did You Learn Romani? Are There Textbooks?

Southeast Europe and South Asia are probably my two favorite linguistic areas in the world, so it’s really no surprise that I was going to check out Romani when I got a chance, although it did take a bit of digging to find good resources to learn the language with.

The first book I used to get started in Romani was Learn Romani: Das-duma Rromanes by Ronald Lee, which focuses on the Kalderash dialect of the northern Balkans (and especially variants spoken in North America, since the author is from Canada). This textbook does a good job of explaining the grammar of the language, and also includes lots of examples from Romani folk songs and folk tales.

The second book I used was A Handbook of Vlax Romani by Ian Hancock, which is more of an academic reference than a textbook. Vlax is a larger dialect grouping that is spoken across the Balkans and includes Kalderash, so many but not all of the forms in this book were the same as in the first book. The term “Vlax” comes from an old word for Romania (“Wallachia”), and is used for this group of dialects because they were subject to the greatest amount of Romanian influence.

(Sidenote: “Vlax” or “Vlach” is also related to the English words “Wales” and “walnut”, from back when the word just meant “non-Germanic (especially Celtic or Roman) foreigner”, and is also the source of the Polish and Hungarian words for Italy: Włochy and Olaszország.)

(Sidenote 2: Despite the historical connections between the two communities, the similarity between “Romani” and “Romania” is pure coincidence. “Romania” is named after the Roman Empire, while “Romani”, like “Lomavren” and “Domari”, most likely goes back to an old Indian tribal/caste name, ḍom/डोम.)

Beyond those two books, I’ve also started looking into the resources provided by the Romani Project at the University of Graz in Austria. In particular, the teaching materials they provide for various Romani dialects should help me get a better understanding of other Romani dialects outside of the Vlax group, like the ones spoken in Slovakia for example.

A sample from the A2-level teaching materials for East Slovak Romani, from the Romani Project.

What Else Is There to Read or Listen to in Romani?

My single favorite resource for listening to real Romani speech is Sweden’s “Radio Romano”, which releases 30 minutes of (mainly) Romani language content every day, occasionally with some interviews done in Swedish. There’s a bit of diversity in the dialects represented, which sometimes leads to difficulties in communication when they interview Roma from, say, Spain and France, though the primary dialect used by the presenters seems to be Vlax/Kalderash with a good amount of Serbian vocabulary mixed in, and a bit of more recent Swedish influence sprinkled on top.

Another interesting resource is Austria’s public broadcaster ORF, which puts out new Romani-language content on a weekly basis (and content in the languages of other local “Volksgruppen”, i.e. the neighboring Slavic languages and Hungarian, as well). I haven’t listened to this as much, but it seems to use a variant of the Burgenland Romani dialect with lots of German and Hungarian influence. Since this is probably more similar to the dialect of Slovakia (especially Bratislava, which is right next to Austria), I might look into this a bit more later.

In terms of written content, the best I’ve found so far is the quarterly magazine dROMa from Roma Service, also based in Austria. The magazine publishes each article in Burgenland Romani alongside a German translation, which is helpful for fully understanding the text (and also good German practice). The standard writing system for Burgenland Romani is quite different from Vlax, due to German influence: for example, the “sh” sound is written as “sch”, while Vlax would use “š”.

And of course, there’s a lot of Romani music out there to listen to. “Gypsy music” is a rich genre with worldwide popularity at this point (and often conflated with Balkan music more generally, on which it has had a major influence), and while a lot of it is now produced in other languages like Hungarian or Serbian as well, there are still plenty of songs with Romani lyrics out there. Gypsylyrics.net is a good starting point for exploring this.

I’ve been putting most of my Eastern Europe language prep on standby for the last month or so, since I was focusing on Indonesian for the 30-day language challenge in December, as well as a few other Asian languages prior to a trip back to China… (*checks calendar*) tomorrow.

That means I haven’t spent much time researching Romani lately either, except to listen to Radio Romano a few times a week. But once I get back from China, I’m definitely planning to get back to work on Romani. Who knows, I might even try to do a presentation on it at the Polyglot Gathering!
