The Art of the Qur’an: A Basic Text Analysis

The weekend before the 45th Presidential Inauguration, I decided to take a trip to Washington, D.C. I live in the Northeast, so it’s an easy trip to make, and there is plenty to do. As a Muslim American, I made sure to stop at the Diyanet Center of America, the D.C. area’s mosque and Turkish cultural center. It’s modeled after the Blue Mosque in Istanbul, Turkey, and if you’re in the area, I highly recommend visiting.

Photos I took at the Diyanet Center of America.

Later that day, I walked the National Mall. I began on the west end, visiting the monuments and memorials. The Lincoln Memorial, as cliché as this may sound, was my absolute favorite.

After finishing up on the west end, I made my way over to the east side of the National Mall. This portion is lined with museums and ends at the U.S. Capitol. On my way to the Capitol, I stumbled upon the Freer and Sackler Galleries. Admission is free and, just my luck, the current exhibit was The Art of the Qur’an.

The exhibit was full of history: everything from calligraphy, to a Qur’an with pages as big as a queen-sized bed, to a rahle (a book stand for reciting the Qur’an) that belonged to Sultan Ahmed I.

Left: Rahle of Sultan Ahmed I — Right: A Qur’an opened to Surah Al-Alaq (Chapter 96).

Seeing the many Qur’ans on display, along with their history, sparked some thoughts about the text itself. I have read and recited the Qur’an before, but I wanted to go deeper: beyond the meaning and into the numbers. How many words are there? Which words are repeated most often? How long are the words on average? What is the most common letter?

I made it to the Capitol some hours later.

Since I work in Big Data, I used a few data analysis and visualization tools (Excel, R, and Tableau) to analyze the text. Here are my findings.

A few brave people, long before analytics and computer software existed, did some of the hard labor for us and counted different aspects of the Qur’an by hand. According to those counts, the Qur’an has 114 surahs (chapters), 6,236 ayas (verses), 77,449 words, and 320,015 letters.
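For anyone who wants to compare a text file against those figures, here is a minimal Python sketch (not the R code used for this post). It assumes the file has one verse per line and that any lines starting with "#" are notes rather than verses. The function name `summarize` and the file name in the usage note are hypothetical. Totals from a script like this can differ from the hand counts, depending on which edition is used and what counts as a word or a letter.

```python
def summarize(lines):
    """Count verses, words, and letters in an iterable of text lines.

    Assumptions (not from the original post): one verse per line,
    blank lines and lines starting with '#' are skipped, words are
    separated by whitespace, and every character in a word is a letter
    (true only for text without diacritics or punctuation).
    """
    verses = [ln.strip() for ln in lines
              if ln.strip() and not ln.lstrip().startswith("#")]
    words = [w for verse in verses for w in verse.split()]
    letters = sum(len(w) for w in words)
    return {
        "verses": len(verses),
        "words": len(words),
        "letters": letters,
        # Average word length in letters; 0.0 for empty input.
        "avg_word_length": letters / len(words) if words else 0.0,
    }
```

To run it on a file, something like `summarize(open("quran-simple-clean.txt", encoding="utf-8"))` should work, where the file name is a placeholder for wherever the text was saved.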

To see which words are repeated most often, I made this word cloud in Tableau.

Word Cloud Created by Me in Tableau; the larger the word the more it appears in the Qur’an.

This visualization does not show every word in the Qur’an, just the words that appear 10 or more times. That’s 947 words. It should not be a surprise that the word Allah (الله) appears the most. If there were a simple way to translate this exact word cloud into English and visualize it, I would. The words appearing the most are Allah (الله), Except (الا), Nor (ولا), Said (قال), Earth (الارض), and a few others.
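The filtering step behind a word cloud like this can be sketched in a few lines of Python. This is an illustration, not the Tableau workflow used for the post: it treats every whitespace-separated token as a word and keeps those that reach a minimum count. The name `frequent_words` and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def frequent_words(text, min_count=10):
    """Return (word, count) pairs for words seen at least min_count times.

    Words are whitespace-separated tokens, compared exactly as written,
    so two spellings of the same word are counted separately.
    Results are ordered from most to least frequent.
    """
    counts = Counter(text.split())
    return [(word, n) for word, n in counts.most_common() if n >= min_count]
```

Writing the returned pairs to a CSV file gives a two-column table (word, count) that a visualization tool can size the words by.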

Having found the most common words, I wanted to look at the breakdown of letters.

Arabic Letters and their representation in the Qur’an. Mined data in R and visualized in Excel.

Alif, usually transliterated as “A” in English, is the most common letter. This makes sense, since the most common word, Allah, begins with Alif, and so does the Arabic definite article “al-” (ال).
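A letter breakdown like the one in the chart can be approximated with the Python sketch below (again an illustration, not the R code used for the post). It counts only characters in the basic Arabic letter range of Unicode, U+0621 to U+064A, and skips the tatweel (U+0640), which is a stretching mark rather than a letter. Variant forms such as alif with hamza are counted as separate letters unless the text has already been normalized. The name `letter_shares` is hypothetical.

```python
from collections import Counter

TATWEEL = "\u0640"  # elongation mark, not a letter


def letter_shares(text):
    """Return (letter, count, share) tuples, most common letter first.

    Only characters from U+0621 to U+064A (excluding the tatweel) are
    counted; spaces, digits, punctuation, and diacritics are ignored.
    The share is the letter's count divided by the total letter count.
    """
    letters = [ch for ch in text
               if "\u0621" <= ch <= "\u064A" and ch != TATWEEL]
    total = len(letters)
    counts = Counter(letters)
    return [(ch, n, n / total) for ch, n in counts.most_common()]
```

The share column is what a percentage chart would plot; multiplying it by 100 gives each letter's percentage of the text.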

I wanted to do a more in-depth analysis, but dealing with the Arabic language in analysis software is kind of painful.

If anyone is interested in my code, please feel free to reach out. The text file I used can be found here (I chose “Simple Clean”, unchecked all boxes, and chose “Text”).