peanuts.today | Exploring the Peanuts in the Cloud with Machine Learning (Part 1)

Craig Burdulis
2 min read · Nov 1, 2019

Generally my side projects outside of work veer away from software development, but every now and again I get drawn into developing a solution that 1) I’m very surprised does not exist in the wild and 2) piques my interest in one of my other hobbies/interests.

I recently built (and am continuing to add new features to) a website, peanuts.today, to serve up daily Peanuts comics. It’s my hope that this website will become a useful destination for fans of the comic to explore the strips, and that it will make new fans of many denizens of the internet.

Original Intent

I had recently been looking for a simple RSS feed/daily newsletter that I could sign up for to receive a Peanuts comic as a notification/email. I have a similar setup for new xkcd posts using Pushbullet (which works great), but I could not find anything similar for Peanuts comic strips. That led me down the road of designing my own.

New Possibilities

In researching exactly how to build such a solution, my mind started going in various directions on other interesting kinds of solutions I could build on top of a relatively large data set of images (~17,000 comic strips). In looking at the Charles Schulz Museum’s Online Collections Database (which I encourage people to explore), I found that it was very difficult to find all strips, let alone strips that match certain criteria (e.g. “show me strips with Peppermint Patty about Christmas”). As a result, my ultimate goal is to develop a platform for exploring the archive of Peanuts strips in a user-friendly and intuitive fashion, such that both researchers and the general public can find comics that match their interests. Some possible criteria for searching might include:

  • Publication Date
  • Characters that appear in the strip
  • The text that appears in the strip (e.g. “good grief”)
  • Objects that appear in the strip (e.g. a football)
  • Overall comic sentiment (e.g. happy, melancholy, angry, etc.)

Several of these criteria are fairly simple to compute or search on, while others would require a large-scale effort to manually label the comics. To avoid that manual labeling and make the process as efficient as possible (and to make it more interesting), I’ll be training a deep learning model that will hopefully be able to accurately perform many of the tasks above (specifically identifying characters).
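To make the search criteria listed above a bit more concrete, here is a minimal Python sketch of how a strip's metadata and a simple filter might look. Everything in it is an assumption for illustration: the `Strip` record, its field names, the `search` helper, and the two sample entries are hypothetical placeholders, not the actual peanuts.today data model or real strip data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class Strip:
    """Hypothetical metadata record for a single comic strip."""
    published: date
    characters: List[str] = field(default_factory=list)
    text: str = ""
    objects: List[str] = field(default_factory=list)
    sentiment: Optional[str] = None


def search(
    strips: Iterable[Strip],
    character: Optional[str] = None,
    phrase: Optional[str] = None,
    obj: Optional[str] = None,
    sentiment: Optional[str] = None,
    year: Optional[int] = None,
) -> List[Strip]:
    """Return the strips that match every criterion that was supplied."""
    results = []
    for strip in strips:
        if character is not None and character not in strip.characters:
            continue
        # Phrase matching is a case-insensitive substring check.
        if phrase is not None and phrase.lower() not in strip.text.lower():
            continue
        if obj is not None and obj not in strip.objects:
            continue
        if sentiment is not None and strip.sentiment != sentiment:
            continue
        if year is not None and strip.published.year != year:
            continue
        results.append(strip)
    return results


# Placeholder entries, invented for illustration only (not real strip data).
SAMPLE = [
    Strip(date(1970, 12, 24), ["Peppermint Patty", "Charlie Brown"],
          "Merry Christmas, Chuck!", ["tree"], "happy"),
    Strip(date(1970, 10, 1), ["Lucy", "Charlie Brown"],
          "Good grief!", ["football"], "melancholy"),
]

# "Show me strips with Peppermint Patty about Christmas"
matches = search(SAMPLE, character="Peppermint Patty", phrase="christmas")
```

In this sketch the publication date is the only field that comes for free with the data; the character, text, object, and sentiment fields are the ones that would need to be filled in by hand or by a model.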

Posts

I’ll be documenting my progress with this effort over the course of several blog posts, roughly broken up into the following parts:

  1. Introduction (what you’re reading now…)
  2. Cleaning/Formatting Dataset
  3. Ingesting Data & Publishing on Website
  4. Enabling Email Subscription/Delivery
  5. Exploring Machine Learning
  6. and possibly more…

Craig Burdulis
Software Engineer | Bread Baker | Peanuts Fan