Judge a NYT bestseller by its cover

Hannah Yan Han
Aug 26, 2017 · 2 min read

The cover of a book makes an important first impression. I wondered which colors tend to appear on the covers of NYT bestsellers.

As a proof of concept, I collected the cover image of every book that made any of the NYT bestseller lists in the past 2 months, and plotted the three main colors used in each cover with the rPlotter package.

The underlying concept of color extraction is strikingly simple: use K-means clustering to group the RGB values of the pixels, then take the average of each cluster. I used 3 clusters here, as people tend to use a few main hues and their neighboring/similar colors rather than 5 or 6 distinct hues.
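To make the idea concrete, here is a minimal sketch of K-means on RGB pixels in plain Python. It is not the rPlotter implementation, just an illustration of the same concept; the function name `dominant_colors`, the toy pixel list, and the fixed seed are all my own assumptions for the example.

```python
import random

def dominant_colors(pixels, k=3, iterations=20, seed=0):
    """Return k cluster-average RGB colors for a list of (r, g, b) tuples."""
    rng = random.Random(seed)
    # Start from k distinct pixel values picked at random
    # (assumes the image has at least k distinct colors).
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iterations):
        # Assignment step: put each pixel in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                mean = tuple(sum(ch) / len(members) for ch in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(center)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # nothing moved, so the clustering has converged
        centers = new_centers
    return [tuple(round(c) for c in center) for center in centers]

# Toy "cover" (made-up data): two shades of red, some blue, a little off-white.
toy_pixels = (
    [(200, 30, 40)] * 50
    + [(210, 20, 30)] * 50
    + [(20, 40, 180)] * 30
    + [(250, 250, 245)] * 10
)
palette = dominant_colors(toy_pixels, k=3)
print(palette)
```

On a real cover, `pixels` would come from the decoded image instead of a hand-made list. With 3 clusters the two similar reds end up merged into one averaged red, which is the "main hue plus its neighbors" behavior described above.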

A few possible next steps:

  • One can collect more book cover images and do a cross-genre (fiction vs non-fiction, adult vs kid, etc.) or cross-country (Amazon best-sellers by country) comparison.
  • The rPlotter package only returns the colors, not the proportion of each color used in the image. Other APIs can return the proportions too.
  • One can convert the resultant main colors into RGB and cluster them again to observe which hues make more bestsellers, and in which genres.
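For the proportion point, one possible approach (a sketch only, not something rPlotter or any particular API does) is to assign every pixel to its nearest palette color and count the shares. The helper name `color_shares` and the toy data below are hypothetical.

```python
from collections import Counter

def color_shares(pixels, palette):
    """Return the fraction of pixels closest to each palette color."""
    def nearest(p):
        # Squared Euclidean distance in RGB space.
        return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(p, c)))

    counts = Counter(nearest(p) for p in pixels)
    return {color: counts[color] / len(pixels) for color in palette}

# Toy data (made up): 60 red, 30 blue and 10 off-white pixels.
toy_pixels = [(200, 30, 40)] * 60 + [(20, 40, 180)] * 30 + [(250, 250, 245)] * 10
toy_palette = [(200, 30, 40), (20, 40, 180), (250, 250, 245)]
shares = color_shares(toy_pixels, toy_palette)
print(shares)
```

The palette here would be the main colors already extracted for a cover, so the shares tell you how dominant each one is rather than just that it appears.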

This is #day52 of my #100dayprojects on data science and visual storytelling. Full code on my GitHub. Thanks for reading. Suggestions for new topics and feedback are always welcome.

