Photo by João Silas on Unsplash

Visual Exploratory Data Analysis (EDA) Part 2

3 min read · May 6, 2019

Continuation of Part 1

In Part 1, we explored the ‘key abbreviations’ dataset to learn how a Bible dataset is organized.

We are going to continue the visual exploration for the rest of the datasets below.

Fig 1. Recall the datasets in addition to the key_abbrev explored in Part 1.

Exploring english_keys: english_keys.head()

Fig 2. Briefly scan the head

And the tail: english_keys.tail()

Fig 3. Briefly scan the tail
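Since the screenshots are not reproduced here, the two calls can be sketched on a small stand-in table. The column names b (book ID), n (book name), t (testament) and g (genre ID) are assumptions, and the rows are illustrative placeholders rather than the real file:

```python
import pandas as pd

# Hypothetical stand-in for english_keys. Column names and values are
# assumptions for illustration, not the actual dataset.
english_keys = pd.DataFrame({
    "b": [1, 2, 19, 40, 44, 45, 66],
    "n": ["Genesis", "Exodus", "Psalms", "Matthew", "Acts", "Romans", "Revelation"],
    "t": ["OT", "OT", "OT", "NT", "NT", "NT", "NT"],
    "g": [1, 1, 3, 5, 6, 7, 8],
})

first_rows = english_keys.head(3)  # first 3 rows (the default is 5)
last_rows = english_keys.tail(3)   # last 3 rows (the default is 5)

print(first_rows)
print(last_rows)
```

Scanning both ends is a quick check that the file loaded completely and that the last rows look like the first ones.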

Time to graph. First up, features ‘g’ and ‘n’.

Fig 4. The code above aims to display the Book Names by the assigned Genre ID

Fig 5. This graph clearly shows the names of the books and their Genre IDs. It looks like a lot of books fall under Genre ID 7, whereas ‘Acts’ and ‘Revelation’ are the only books that fall under IDs 6 and 8.
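A plot along the lines of Fig 4 and Fig 5 could be sketched roughly as follows. This is not necessarily the original code: the stand-in rows, the column names g and n, and the choice of a scatter plot are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for english_keys (illustrative rows only).
english_keys = pd.DataFrame({
    "n": ["Genesis", "Exodus", "Joshua", "Psalms", "Matthew",
          "Acts", "Romans", "James", "Revelation"],
    "g": [1, 1, 2, 3, 5, 6, 7, 7, 8],
})

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(english_keys["g"], english_keys["n"])  # Genre ID on x, book name on y
ax.set_xlabel("Genre ID")
ax.set_ylabel("Book Name")
ax.set_title("Book Names by Genre ID")
fig.tight_layout()

# A numeric companion to the picture: how many books per Genre ID.
books_per_genre = english_keys.groupby("g")["n"].count()
print(books_per_genre)
```

The count per genre is a handy cross-check for anything read off the plot by eye.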

Exploring the Old and New Testaments

Fig 6. Clear labels for the graph displayed below

Fig 7. Looks like the OT Genre IDs range from 1–3 while the NT contains pretty much all genres
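One way to check a reading like this without relying on the picture alone is to cross-tabulate testament against Genre ID. The sketch below uses a hypothetical stand-in table; the column names t and g and every row are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import pandas as pd

# Hypothetical stand-in for english_keys (illustrative rows only).
english_keys = pd.DataFrame({
    "n": ["Genesis", "Joshua", "Psalms", "Matthew", "Acts", "Romans", "Revelation"],
    "t": ["OT", "OT", "OT", "NT", "NT", "NT", "NT"],
    "g": [1, 2, 3, 5, 6, 7, 8],
})

# Rows are testaments, columns are Genre IDs, cells are book counts.
genre_by_testament = pd.crosstab(english_keys["t"], english_keys["g"])
print(genre_by_testament)

# Stacked bars: one bar per testament, one segment per Genre ID.
ax = genre_by_testament.plot(kind="bar", stacked=True)
ax.set_xlabel("Testament")
ax.set_ylabel("Number of books")
```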

Now I am curious to find out what each Genre ID represents: genre_keys

Fig 8. Shows the 8 Genre IDs and what they stand for

Now that we have a brief understanding of the key descriptions for the Bible versions, let's explore a version.

Let's start with the King James Version: kjv.head()

Fig 9. It helps to read the data description to comprehend the column names

It might help to look at the shape of the KJV dataset before plotting: kjv.shape

Fig 10. Over 3000 rows by 5 columns. I am going to have to plot carefully

Let's try plotting the chapters and verses represented in the table above:

Fig 11. Results in the plot below

Fig 12. This graph is worth exploring further, but it is one way to quickly get an idea of the chapter and verse counts for the KJV
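For readers following along without the screenshots, here is a minimal sketch of the shape check and a chapter-versus-verse plot. The tiny kjv table is a hypothetical stand-in, and the column names id, b (book), c (chapter), v (verse) and t (text) are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import pandas as pd

# Hypothetical stand-in for the kjv table (placeholder rows only).
kjv = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "b": [1, 1, 1, 1, 2, 2],
    "c": [1, 1, 2, 2, 1, 1],
    "v": [1, 2, 1, 2, 1, 2],
    "t": ["placeholder text"] * 6,
})

print(kjv.shape)  # (number of rows, number of columns)

# Chapter number on x, verse number on y.
ax = kjv.plot(kind="scatter", x="c", y="v")
ax.set_xlabel("Chapter")
ax.set_ylabel("Verse")
```

Swapping the x and y arguments is all it takes to flip the axes of the same plot.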

I am curious to see what it would look like if I switched the x and y axes:

Fig 13. I switched the x and y from Fig 11

Fig 14. Doesn’t hurt to explore different angles of the same picture

Time to explore the Books and Chapters of the KJV:

Fig 15. Note that I am using Pyplot to plot the graph shown below

Fig 16. The Book ID ranges from 1 to 66
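A Pyplot version of a book-versus-chapter plot might look something like this. The stand-in kjv rows and the column names b (book) and c (chapter) are assumptions, and the scatter plot is just one plausible choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the kjv table (placeholder rows only).
kjv = pd.DataFrame({
    "b": [1, 1, 1, 2, 2, 3],
    "c": [1, 2, 3, 1, 2, 1],
    "v": [1, 1, 1, 1, 1, 1],
})

fig, ax = plt.subplots()
ax.scatter(kjv["b"], kjv["c"])  # Book ID on x, chapter number on y
ax.set_xlabel("Book ID")
ax.set_ylabel("Chapter")

# Number of distinct chapters in each book, as a numeric summary.
chapters_per_book = kjv.groupby("b")["c"].nunique()
print(chapters_per_book)
```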

Using a more descriptive plot to expand on Fig 16:

Fig 17. Note that I am using Seaborn to plot the graph shown below.

Fig 18. Here, we can better see which Book number each Chapter falls under
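A rough Seaborn counterpart, again on a hypothetical stand-in table with assumed column names b (book) and c (chapter), could be sketched as below. The original figure may well have used a different Seaborn function; scatterplot is only one option.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the kjv table (placeholder rows only).
kjv = pd.DataFrame({
    "b": [1, 1, 1, 2, 2, 3],
    "c": [1, 2, 3, 1, 2, 1],
})

# Seaborn takes the DataFrame plus column names and returns a Matplotlib Axes.
ax = sns.scatterplot(data=kjv, x="b", y="c")
ax.set_xlabel("Book ID")
ax.set_ylabel("Chapter")
```

Because the return value is a regular Matplotlib Axes, titles and labels are adjusted the same way as in a Pyplot figure.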

Part 2 of the Visual EDA series briefly shows which plotting libraries to use given a particular dataset.

I will keep adding information as I learn more about this dataset. For now, I hope you found this somewhat informative.

Written by M L

A former teacher. A lifelong learner.
