The Joy of Parsing

Fernando Becerra
May 29, 2019 · 7 min read

“Let’s draw some happy little trees.”

“Just beat the devil out of it.”

“We don’t make mistakes, just happy accidents.”

These are just a handful of the iconic quotes that Bob Ross said during the 31 seasons of his show “The Joy of Painting” and that have become ingrained in our brains. But how often did he actually say them? When did he add the phrase “beating the brush” to his vocabulary? More broadly, how did these phrases evolve over time? And how are they related to the paintings he made? To find those answers and more, we analyzed the transcripts from all 403 episodes of his show and created an interactive tool to quickly explore and filter them. Get your brushes and your paints, and parse along with us!

It wasn’t until a few months ago that I discovered Bob Ross. His name came up in a conversation with friends, but I didn’t pay too much attention the first time it was mentioned. A couple of weeks later, while browsing shows to watch on Netflix, I saw “The Joy of Painting with Bob Ross” and told my girlfriend about the coincidence. She couldn’t believe that I didn’t know Bob Ross (his show wasn’t aired in Chile, where I grew up), so I started playing one of the episodes. I can’t explain how — although I’m sure you would understand — but I was totally mesmerized by the show: the paintings, the phrases, and the character. In a rush of motivation, I even painted my own canvas for the first time ever!

My first canvas, inspired by “Silent Forest”, Season 22, Episode 13.

All of this occurred while at Fathom we were developing tools to explore and understand large archives. Over the past few years we’ve taken our efforts in many different directions: to see what archives look like, to guide new users and help them learn about the collection, and to engage a viewer without distracting them with the sheer quantity of information. But while most of our office was building tools for serious document sets such as Facebook ads, Wikipedia articles, testimony transcripts, and reports, I grabbed the source code and decided to use those same tools to explore the transcripts of each episode of The Joy of Painting.

First, to get the transcripts, I used the YouTube API to scrape each episode’s subtitles from the official Bob Ross YouTube channel, where the people from Bob Ross Inc. kindly uploaded all of the seasons of the show. Once I had downloaded and cleaned them, my approach was to find the most common words or phrases in the transcripts by counting tokens, n-grams, and other meaningful text elements, for which I used Python’s nltk module. Unfortunately, the results were not very exciting, since the most common phrases were introductory ones like “run all the colors across the screen that you need to paint along with”; painting instructions such as “a little bit of that”, “let’s go right up”, “let’s go up here”, “let’s take a little”, and “maybe there’s a little”; and the typical Bob Ross filler “tell you what”. That made me realize that I wasn’t really interested in those phrases, but in the quotes that made him the pop culture icon he is now, so I used the same Natural Language Processing techniques to find out when and how many times he said them.
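As a rough illustration of the n-gram counting described above, here is a minimal sketch using only Python’s standard library. The original analysis used nltk; the helper names (`tokenize`, `ngrams`, `most_common_phrases`) and the two sample snippets are made up for this example and are not taken from the real transcripts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def most_common_phrases(transcripts, n, top=5):
    """Count n-grams across all transcripts and return the most frequent ones."""
    counts = Counter()
    for text in transcripts:
        counts.update(ngrams(tokenize(text), n))
    return [(" ".join(gram), count) for gram, count in counts.most_common(top)]


# Made-up snippets standing in for real episode transcripts.
sample_transcripts = [
    "Tell you what, let's go right up here. A little bit of that.",
    "Let's go right up in here. Tell you what, a little bit of that.",
]

top_trigrams = most_common_phrases(sample_transcripts, n=3)
for phrase, count in top_trigrams:
    print(f"{count}  {phrase}")
```

Even on this toy input, the filler and instruction phrases float to the top, which is the same effect that made the raw counts on the full transcripts less interesting than expected.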

Number of episodes per season in which the quote is mentioned

He encouraged us to remember the Golden Rule (“a thin paint sticks to a thick paint”) from the first season to the last. In Season 2 he started convincing us that we can do anything we want, anything, as long as we believe. It wasn’t until the third season that we started making big decisions, but with big decisions came happy accidents. (Remember: we don’t make mistakes.)

One season later things got wild and we started having some fun, like we just don’t care. In Season 6, Bob Ross granted us unlimited power to move mountains and bend rivers, at the same time that we started getting crazy and beating the devil out of the brush. Only in Season 9 did we start making those little noises, shwooop, hehe, or it just doesn’t work. He introduced us to Agony City in the 10th season but, luckily, we never had the pleasure of visiting it. Finally, Season 12 was the first time we took a bravery test, and the tests quickly became more and more common in later seasons, but we passed them all with highest honors.
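The per-season tallies behind a chart like the one above could be computed along these lines. This is only a sketch: the `(season, episode, transcript)` tuple layout, the function names, and the sample rows are assumptions for illustration, not the actual data or code used for the analysis.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, straighten curly apostrophes, and drop punctuation."""
    text = text.lower().replace("\u2019", "'")
    return " ".join(re.findall(r"[a-z']+", text))


def episodes_per_season(episodes, quote):
    """Count, per season, the episodes whose transcript contains the quote.

    `episodes` is an iterable of (season, episode, transcript) tuples.
    An episode counts once, no matter how often the quote is repeated in it.
    """
    # Pad with spaces so the quote only matches on whole-word boundaries.
    needle = f" {normalize(quote)} "
    counts = defaultdict(int)
    for season, _episode, transcript in episodes:
        if needle in f" {normalize(transcript)} ":
            counts[season] += 1
    return dict(counts)


# Made-up rows; the season numbers here are placeholders, not findings.
sample_episodes = [
    (3, 1, "We don\u2019t make mistakes, just happy accidents."),
    (3, 2, "Let's build a happy little tree."),
    (4, 1, "Happy accidents! We have happy accidents all the time."),
]

accidents = episodes_per_season(sample_episodes, "happy accidents")
little = episodes_per_season(sample_episodes, "happy little")
print(accidents, little)
```

Counting episodes rather than raw mentions keeps a single chatty episode from dominating a season.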

Analyzing Bob Ross’ quotes was fun and revealing in terms of how he shaped his language over time, but I wanted to dig deeper into this document set and started looking for different ways to gain insights from it. I got inspiration from a topic modeling project we did with Wikipedia articles, and decided to apply the same technique to the Bob Ross transcripts to group the episodes and try to find trends. To do this, I used a dimensionality reduction technique called t-distributed stochastic neighbor embedding (t-SNE), which takes documents represented as points in a high-dimensional vector space and projects them onto a two-dimensional plane, keeping similar documents close together. In other words: t-SNE allows us to represent a document as a dot at a certain location. This wouldn’t be very useful without a visual representation, so I grabbed the pictures of the final paintings for each episode and put each one at the location t-SNE calculated for its transcript. The result is a landscape of Bob Ross paintings.
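Before t-SNE can place anything, each transcript has to become a vector. The post does not say how those vectors were built, so the sketch below assumes a simple TF-IDF weighting and uses cosine similarity to show that transcripts with similar vocabulary end up close together. The projection to two dimensions itself is not shown; in practice it would be left to a library implementation such as scikit-learn’s `sklearn.manifold.TSNE`. All names and sample sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Turn each document into a TF-IDF vector over the shared vocabulary."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(documents)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        vectors.append([
            (term_freq[word] / total) * math.log(n_docs / doc_freq[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up sentences standing in for full episode transcripts.
docs = [
    "waves crash on the beach under a palm tree",
    "a palm tree leans over the beach and the waves",
    "snow covers the cabin below the mountain",
]
vocabulary, vectors = tfidf_vectors(docs)
ocean_vs_ocean = cosine_similarity(vectors[0], vectors[1])
ocean_vs_winter = cosine_similarity(vectors[0], vectors[2])
print(ocean_vs_ocean, ocean_vs_winter)
```

The two beach sentences score as more similar to each other than to the winter one, which is the kind of signal that lets an embedding pull ocean episodes into their own corner.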

The Bob Ross Landscape

Did you recognize the groups? Me neither. At first sight they are not so obvious, so I had to start digging into the details. Analyzing the clustering algorithm results revealed keywords for some groups, but it didn’t explain everything, so I decided to try a more visual approach. For that, I created a tool that gave me a quick way to find patterns in the data by filtering episodes based on the colors Bob Ross used, the elements he painted, and the words he said in each transcript. The text and visual analyses complemented each other to help me identify the common themes in the clusters.
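The filtering idea can be sketched in a few lines. The `Episode` record, its fields, and the sample entries below are hypothetical and only illustrate the logic of narrowing episodes by colors, elements, and spoken words; they do not describe how the actual tool stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """Hypothetical record for one episode."""
    title: str
    colors: set = field(default_factory=set)
    elements: set = field(default_factory=set)
    transcript: str = ""


def filter_episodes(episodes, colors=(), elements=(), words=()):
    """Keep the episodes that match every requested color, element, and word."""
    wanted_colors = set(colors)
    wanted_elements = set(elements)
    wanted_words = [word.lower() for word in words]
    return [
        ep for ep in episodes
        if wanted_colors <= ep.colors            # all colors were used
        and wanted_elements <= ep.elements       # all elements appear
        and all(word in ep.transcript.lower() for word in wanted_words)
    ]


# Made-up entries; titles and contents are placeholders.
catalog = [
    Episode("Example A", {"Burnt Umber", "Titanium White"}, {"mountain", "cabin"},
            "Let's build a happy little cabin."),
    Episode("Example B", {"Dark Sienna", "Titanium White"}, {"ocean", "palm tree"},
            "Let's beat the devil out of it."),
    Episode("Example C", {"Dark Sienna"}, {"mountain"},
            "Maybe there's a happy little tree right here."),
]

sienna_mountains = filter_episodes(catalog, colors=["Dark Sienna"], elements=["mountain"])
happy_talk = filter_episodes(catalog, words=["happy little"])
print([ep.title for ep in sienna_mountains], [ep.title for ep in happy_talk])
```

Requiring every criterion to match means each added filter can only shrink the result, which makes it easy to narrow down to a single cluster.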

Isolated at the top right we find the ocean paintings, characterized by palm trees, waves, and beaches; the mountain paintings by Bob’s son, Steve Ross, are grouped in the middle top, while portraits painted by other guests are on the bottom left. To its left we see paintings that contain a bridge; the strange group on the top left corresponds to paintings that use the Burnt Umber color, which, coincidentally, appears only in episodes from the first six seasons (he then switched to Dark Sienna); and, finally, there is the great bulk of Bob Ross’ paintings in the middle, which smoothly transitions between winter, mountain, and forest landscapes. Going one level deeper, we can even distinguish those winter landscapes that have a cabin in them, those mountain landscapes with snowy peaks, those forests with waterfalls, and paintings with barns. What amazes me about this image is that the paintings are not grouped based on how they look, but on the words Bob Ross used in each episode. This approach naturally led to the formation of those clusters of paintings, and, by combining text and visual information, we can make sense of them!

The groups from the Bob Ross landscape

Once I was done identifying the groups, I realized how useful the tool I had created was, and the potential it had as an interactive way of exploring Bob Ross’ paintings. So I spent some time refining it, adding more functionality, redesigning it with help from Paul, and making it public so everyone can have some fun with it. The final piece is for those who want to paint one of his paintings but don’t know where to start: just choose some colors you want to use, some elements you want to paint, find your favorite canvas, and paint along!

Choose some colors, some elements, and paint along!

Dealing with very large document sets can be overwhelming, but we’re dedicated to making this process as easy as possible by developing tools to explore and understand them: to see what they look like, to learn about the collection, and to engage a viewer without too many distractions. Counting words, finding phrases, grouping documents, and revealing hidden patterns are just a few different ways of discovering and making sense of text.

I really hope you’ve enjoyed this little analysis. We’d love to hear from you if you have time to drop us a line at hello@fathom.info. Until then, from all of us here, I’d like to wish you happy reading, and God bless, my friend.

fathominfo

We build platforms and products for understanding data.

Fernando Becerra

Written by

Data visualization developer and data scientist. PhD in Astronomy & Astrophysics, Harvard University. www.fernandobecerra.com
