Judge a NYT bestseller by its cover

Hannah Yan Han
2 min read · Aug 26, 2017

The cover of a book makes an important first impression. I wondered which colors tend to appear on the covers of NYT bestsellers.

As a proof of concept, I collected the cover image of every book that made any of the NYT bestseller lists in the past 2 months, and plotted the three main colors used in each cover with the rPlotter package.

The underlying concept of color extraction is strikingly simple: use K-means clustering to group the RGB values of the pixels, then take the average of each cluster. I used 3 clusters here, as people tend to use a few main hues and their neighboring or similar colors rather than 5 or 6 distinct hues.
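To make the idea concrete, here is a minimal sketch of K-means color extraction in plain Python. It is not the rPlotter implementation (which is in R) and not the code used for this post. The function name `dominant_colors`, its parameters, and the deterministic farthest-point initialization are all assumptions chosen to keep the example short and reproducible.

```python
def dominant_colors(pixels, k=3, iterations=20):
    """Return k cluster-average colors for a list of (r, g, b) tuples.

    Hypothetical helper for illustration; not part of rPlotter.
    """
    def dist2(a, b):
        # Squared Euclidean distance in RGB space.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Assumed initialization: start from the first pixel, then repeatedly
    # add the pixel farthest from the centers chosen so far.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(max(pixels, key=lambda p: min(dist2(p, c) for c in centers)))

    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: each center moves to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(channel) / len(c) for channel in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Converged: no center moved.
        centers = new_centers

    # Round to whole RGB values for display.
    return [tuple(round(v) for v in c) for c in centers]
```

In practice the `pixels` list would come from reading a cover image with an image library, and a large cover would usually be downscaled first so the loop stays fast.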

A few possible next steps:

  • One can extract more book cover images and do a cross-genre (fiction vs. non-fiction, adult vs. kids, etc.) or cross-country (Amazon bestsellers by country) comparison.
  • The rPlotter package only returns the colors, not the proportion of each color used in the image. Other APIs can return the proportions too.
  • One can convert the resulting main colors into RGB and cluster them again to observe which hues appear on more bestsellers, and in which genres.
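As a rough illustration of the second point, color proportions can be estimated by assigning every pixel to its nearest palette color and counting. This is a sketch under assumptions: `color_proportions` is a hypothetical helper, not an rPlotter function, and it expects a non-empty list of `(r, g, b)` tuples plus a palette of tuples obtained from any extraction step.

```python
from collections import Counter

def color_proportions(pixels, palette):
    """Share of pixels whose nearest palette color is each palette entry.

    Hypothetical helper for illustration; assumes pixels is non-empty.
    """
    def dist2(a, b):
        # Squared Euclidean distance in RGB space.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Count how many pixels fall closest to each palette color.
    counts = Counter(min(palette, key=lambda c: dist2(p, c)) for p in pixels)
    total = len(pixels)
    return {color: counts[color] / total for color in palette}
```

For example, a cover with three near-red pixels and one near-blue pixel, measured against a red and blue palette, would come out as 75% red and 25% blue.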

This is #day52 of my #100dayprojects on data science and visual storytelling. The full code is on my GitHub. Thanks for reading. Suggestions for new topics and feedback are always welcome.

Hannah Yan Han

#100daysproject on data science and visual storytelling ✈️🗺️ https://www.hannahyan.com/