A multicoloured area chart of the genres of books I’ve read through my life

Mapping My Literature Landscape

Using data to unlock—and understand—my reading history

Published in Nightingale
8 min read · Sep 3, 2020

Back when I was a kid, I used to read A LOT of books. Then, over the last couple of years, movies and TV series somehow stole the thunder, and with it, my attention. I did read a few odd books here and there, but not with the same ferocity as before. I could also feel my attention span dwindling so I had trouble reading longer, slower-paced books. It was always easier to consume media to satiate my curiosity and experience something new, especially with the rise of beautiful video essays.

Everything changed this year. I learned to let go of the expectation of finishing a book and go back to why I enjoyed reading them in the first place. As a consequence, I’ve read more books in the first six months of this year than I have in the last four years combined. This led me to questions about my reading patterns. How have they evolved over time? What are my favourite genres?

I already knew these answers for movies. I’ve been on Letterboxd for a long time. For the uninitiated, Letterboxd is the better version of IMDb. It’s a site with great design, an awesome rating system, and a great community of people who love movies and make brilliant lists. Above all of these though, the thing I love most is their stats page. A quick glance shows me that I’ve watched movies from 47 countries and that my most-watched director, to my absolute dismay, is David Dhawan.

I wanted something similar for books. My first stop was Goodreads, but its stats page was nowhere near as good as Letterboxd’s. I did some more research and came across The StoryGraph. It’s a new site that’s still in beta but shows much promise. Each book is tagged with parameters like “mood” and “pace”, which the site considers when giving personalised recommendations. Think of Spotify’s Echo Nest algorithm, but for books. The nerd in me was delighted.

Logging all the books I’ve read was an arduous task. I had logged some books on Goodreads over the years, but not my entire reading history. The remaining books I logged from memory, with the help of social media posts, notes, letters, old photos, birthday gifts, etc. By the end of it, I had logged a paltry 168 books and imported them into The StoryGraph. The site has its own stats page, which in turn gave me ideas for new visualisations. I reached out to Nadia from The StoryGraph and she sent me a dump of all the data I had logged, even though the feature is not open to the public yet (thanks Nadia!).

A quick word about the data. I have not logged any comics I’ve read yet. Two reasons why:

  1. I’ve read too many (at least 800 issues of Batman, 300 issues of The Flash, 100 issues of Tinkle, etc.) and logging them is going to take some time.
  2. They would massively skew my data, so I’ve not added them yet.

Apart from comics, I logged almost every book I could remember reading. Even so, this is not the entirety of what I’ve read: a lot of old books are not present in the Goodreads/The StoryGraph database, so I haven’t been able to log those. Eventually I managed to tag all the ones I imported and came up with some rudimentary visualisations. Let’s dive in!

The Surface Dive

Pages

For the first one, I bucketed books into three groups by length (under 300 pages, 300–500 pages, and 500-plus pages) to see how they progressed over time. I started out with a lot of small books and then, when I finished my board exams, I read a LOT of large books for the next couple of years. Then I started working and still read a lot of medium-sized books. Over the last couple of years, I’ve gone back to reading small and medium-sized books, and I hope to read a large book soon.

An area chart mapping the number of pages of each of the books I’ve read through my life.
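
For anyone who wants to try the same bucketing on their own log, here is a minimal Python sketch. It is not the code behind the chart above: the sample rows are made up, the `page_bucket` helper is a hypothetical name, and I am assuming a book of exactly 500 pages lands in the middle bucket.

```python
from collections import Counter


def page_bucket(pages: int) -> str:
    """Assign a book to one of three size buckets by page count."""
    if pages < 300:
        return "< 300"
    if pages <= 500:  # assumption: exactly 500 pages counts as medium
        return "300-500"
    return "500+"


# Hypothetical sample rows: (year read, page count). Not the real log.
books = [(1998, 120), (1998, 250), (2006, 640), (2006, 310), (2019, 500)]

# Books per (year, bucket): the series an area chart would stack.
counts = Counter((year, page_bucket(pages)) for year, pages in books)
```

Each `(year, bucket)` count becomes one point in a stacked area series, one series per bucket.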

Pace

I then charted the pace of the books I’ve read, as characterized by The StoryGraph. I generally prefer fast-paced books because I need something that holds my attention. The data backs that up, showing that fast-paced books do make up the majority of what I’ve read. I think (I could be wrong) an author has to be particularly skilled to hold someone’s attention when it’s a slower-paced story.

This made me wonder: Do we have more fast-paced books now to go along with people’s dwindling attention spans? Mapping the publishing landscape would be a worthy subject for a future post, but in the meantime it’s made me consider how the evolution of publishing might impact my own reading patterns.

Anyway, despite my preference for faster pace, I do enjoy reading slow-paced books as well, and I was really surprised to find that almost all the books I’ve read this year are slow-paced. That wasn’t a conscious decision on my part, but could be due to another shift I uncovered in my reading habits:

An area chart mapping pace of each of the books I’ve read through my life.

Fiction

Going in, I already knew the fiction/non-fiction divide. I’ve always preferred fiction and even scoffed most times at non-fiction. When I was young, non-fiction books were basically reference or self-help books. Now that we have very accessible online tutorials for anything you may want to learn, I didn’t really have the need for them. That was until I read Meggs’ History of Graphic Design in 2016. That book was a catalyst for me to see what other golden knowledge lay hidden in these books. I knew I’d read a lot of non-fiction recently but was surprised to find that all the books I’ve read this year are non-fiction, including the book I’m currently reading. I think I got tired of watching too many video essays, so I went on a knowledge rampage when the lockdown began, hence the spike.

An area chart mapping the fiction/non-fiction split of all the books I’ve read through my life

The Entire History

The breakouts were cool but I wanted to see my entire history using a single visualisation. Since a novel can belong to multiple genres, I’ve tagged each book with a single “main” genre for this visualisation. This took some time, and a few books gave me sleepless nights. Is Audrey Niffenegger’s The Time Traveller’s Wife romance, or sci-fi? How about Vikas Swarup’s Q&A? What even is The Old Man and the Sea?

Enter the below visualisation. It looks like a Sankey diagram but is spiritually a parallel sets plot (which deals with categorical data and how it is classified). The genres on the right are ordered by when I first read them.

A parallel sets diagram which looks like a Sankey chart. Maps each year to type, pages, pace, and main genre of all the books
An interactive version of this visualisation is available on my website
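
For the curious, here is a rough Python sketch of the kind of data a chart like this needs. It is not the Observable code behind the chart: the rows are made up, `ribbon_counts` and `first_read_order` are hypothetical helper names, and counting books between adjacent axes is just one simple way to size the ribbons.

```python
from collections import Counter

# Hypothetical rows, one per book, with a single "main" genre each.
books = [
    {"year": 1998, "type": "fiction", "pages": "< 300", "pace": "fast", "genre": "adventure"},
    {"year": 1998, "type": "fiction", "pages": "< 300", "pace": "fast", "genre": "mystery"},
    {"year": 2000, "type": "fiction", "pages": "500+", "pace": "medium", "genre": "fantasy"},
    {"year": 2020, "type": "non-fiction", "pages": "300-500", "pace": "slow", "genre": "design"},
    {"year": 2020, "type": "fiction", "pages": "< 300", "pace": "fast", "genre": "mystery"},
]

# Axes from left to right, as in the caption above.
dimensions = ["year", "type", "pages", "pace", "genre"]


def ribbon_counts(rows, dims):
    """Count the books flowing between each pair of adjacent axes."""
    ribbons = Counter()
    for row in rows:
        for left, right in zip(dims, dims[1:]):
            ribbons[(left, row[left], right, row[right])] += 1
    return ribbons


def first_read_order(rows, key="genre"):
    """Order categories by the earliest year in which each was read."""
    first_year = {}
    for row in sorted(rows, key=lambda r: r["year"]):
        first_year.setdefault(row[key], row["year"])
    return list(first_year)  # dicts keep insertion order


ribbons = ribbon_counts(books, dimensions)
genre_order = first_read_order(books)
```

With these sample rows, `genre_order` starts with adventure and ends with design, which is how the right-hand axis ends up sorted by first read.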

I’ve always been a huge fan of fantasy but was very surprised that I hadn’t picked up a single fantasy book until fairly late in my reading years.

The same chart as above, but with the Year 2000 highlighted to show when I first started reading Fantasy

Once I did, I was unstoppable. 2004 was entirely thriller and fantasy!

The same chart as above, but with the Year 2004 highlighted to show that all books were Thriller & Fantasy books.

The Landscape

I still wanted to see all the genres I’ve read across the years so I made the titular “Landscape of Literature” for genres. The graph at the beginning of the article is exactly that, but while it looks pretty, it’s not the best for analysis. A ridgeline plot, which gives each genre its own baseline, is a better option. I’ve accounted for multiple genres of the same book in this visualisation.

Adventure was a huge theme when I began (Journey to the Center of the Earth, Treasure Island), after which I read a lot of mystery books (Secret Seven, Famous Five, Hardy Boys, Nancy Drew). Then I graduated to fantasy (LOTR, Harry Potter). The spike of mystery, history, and thrillers starting from 2003 can be attributed to The Da Vinci Code and all the similar books I read after it. Fantasy then makes a comeback because of A Song of Ice and Fire and continues almost through the whole decade. A big twist was Sci-Fi; for a genre that I love so much, I’ve read surprisingly little of it. In the last couple of years, non-fiction genres like design and business have been surging, owing to a lot of reference books I’ve been reading and catching up on.

A ridgeline plot mapping all genres of all the books I’ve read. Coloured and shaded to look like hills
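
If you want to build a ridgeline like this from your own list, the data behind it is one count-per-year series for each genre. Here is a minimal Python sketch of that step. It is not the Observable code: the rows are made up, `ridgeline_series` is a hypothetical helper, and I am assuming a book tagged with several genres counts once towards each of them.

```python
from collections import defaultdict

# Hypothetical rows: (year read, all genres tagged on the book).
books = [
    (2003, ["mystery", "thriller", "history"]),
    (2003, ["thriller"]),
    (2004, ["fantasy"]),
    (2005, ["fantasy", "adventure"]),
]


def ridgeline_series(rows):
    """Return the year axis and, per genre, a list of counts per year."""
    years = list(range(min(y for y, _ in rows), max(y for y, _ in rows) + 1))
    counts = defaultdict(lambda: dict.fromkeys(years, 0))
    for year, genres in rows:
        for genre in genres:  # a multi-genre book counts once per genre
            counts[genre][year] += 1
    series = {g: [by_year[y] for y in years] for g, by_year in counts.items()}
    return years, series


years, series = ridgeline_series(books)
```

Each list in `series` is one ridge, drawn as its own small area chart and offset vertically from the others. Years with no books stay in the list as zeros so every ridge shares the same x-axis.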

My last visualisation deals with the moods associated with the books I’ve read. Although adventure as a genre tapered out in my childhood, almost all the books I’ve read are adventurous in spirit. The second-most prominent set is dark, mysterious, and tense books, which increase in frequency almost on cue as lighthearted books taper off.

A ridgeline plot mapping all moods of all the books I’ve read. Coloured and shaded to look like hills

That’s a walk through the history of my reading so far. Feel free to linger and dive into the data. If you have any suggestions, please drop them in! You may well spot insights different from mine, so please share those too.

This was a great eye-opener for me. Some of my pre-conceived notions were broken, and I picked up on some very interesting patterns I would’ve never seen if I hadn’t visualized the data. Why haven’t I read adventure books since I was a kid? Is it because there are very few true adventure books out there now, or have I not put enough effort into finding them? This is what is great about data and why I love it. Data is a mirror you can use to reflect upon yourself and confront your biases. What you do with it is then up to you.

I still have multiple improvements planned for this and more ways to dig into data — author diversity (how eurocentric is my author list?), book formats (when did I start embracing eBooks?), the difference between my “Read” and “Want to Read” lists, etc. I also want to make this a real-time dynamic page à la Letterboxd. Let’s see how I get along with that a month or two from now.

My name is Nimit Shah and I’m a designer, coder, and data guy who loves integrating the three in interesting ways. This article first appeared on my website (supports dark mode and has an interactive chart!). The code for the ridgeline plot and the parallel sets diagram can be found on Observable.
