Making a Whimsical Data Visualization

Data Visualization of Randy Rainbow Videos

A few months ago, I saw Randy Rainbow’s live comedy show, in which he incorporates some of his song parody videos. It was wonderful to laugh so much, and with a whole theater full of people. I left feeling a little lighter, and curious about how such a great show came to be. In particular, I wondered how Randy Rainbow developed his video style.

He couldn’t have just appeared out of the blue. How did his style and technique evolve over time? How many of his videos could I find?

A quick search brought up several of his most recent, and most popular. To find more, I explored the ‘up next’ list. But to find older videos, I had to dig around.

It took a while, but eventually I’d found around 100 videos by Randy Rainbow on YouTube. To keep track of them all, I created a simple spreadsheet, and whenever I found a video I’d add its link.

Starting out, my data set was just that list of links. Once I had the list, I collected some additional information for each video: length, views, transcript, publish date, and so on.

Simple spreadsheet, for recording and updating data on each video.

For my initial visualizations, I started out looking at how popular each video was, using each video’s view count as the metric. In 2016, he started releasing videos more frequently. I’m quite fond of bee-swarm plots, and it made sense to use one with such clumped data.

First iterations, showing how popularity of videos started to grow in late 2015 and 2016.
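For anyone unfamiliar with bee-swarm plots, here is a minimal sketch of the idea in Python. It is not the code behind the chart above (which was written in JavaScript), and the function name `beeswarm` and its parameters are made up for illustration. Each bubble keeps its position along the main axis, and a simple greedy rule nudges it sideways just far enough to avoid the bubbles already placed.

```python
import math

def beeswarm(xs, radius):
    """Greedy bee-swarm layout (illustrative, not the author's code).

    Every circle keeps its x value; y is chosen as close to the axis
    as possible without overlapping circles placed earlier.
    """
    min_dist = 2 * radius
    placed = []                      # centers positioned so far
    result = [None] * len(xs)
    for i in sorted(range(len(xs)), key=lambda k: xs[k]):
        x = xs[i]
        # Only circles within one diameter along x can collide.
        neighbors = [(px, py) for px, py in placed if abs(px - x) < min_dist]
        # Candidates: the axis itself, plus spots just touching each neighbor.
        candidates = [0.0]
        for px, py in neighbors:
            dy = math.sqrt(min_dist ** 2 - (px - x) ** 2)
            candidates.extend([py + dy, py - dy])
        # Take the candidate nearest the axis that overlaps nothing.
        # The outermost candidate is always free, so the loop always breaks.
        for y in sorted(candidates, key=abs):
            if all(math.hypot(px - x, py - y) >= min_dist - 1e-9
                   for px, py in neighbors):
                break
        placed.append((x, y))
        result[i] = (x, y)
    return result
```

Calling `beeswarm([0, 0, 0, 10], 1)` stacks the three bubbles that share a position and leaves the isolated one on the axis. The inputs here are placeholder positions, not real view counts or dates.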

This was when I noticed that most of his recent and popular videos were song parodies, and that these really started to take off in late 2015 and early 2016. Maybe I could draw that out in the visualization.

Then I found his press-kit, with these fun images. The colors were perfect! It gave me the idea of color coding which videos were song parodies and which had different formats.

The image files in the press-kit are square. So I had to experiment with using a circular clip-path, in the visualization’s JavaScript, if I wanted to keep the bubbles. I worked out the clip-path code in a simpler chart, before integrating it into my bee-swarm plot (with its more complicated code).

Experimenting with clip-path and color-coding video type.
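The underlying SVG idea can be sketched without any charting code. The Python helper below only builds the markup that a circular clip needs: a `clipPath` containing a `circle`, and an `image` that refers to it through its `clip-path` attribute. The function name, the ids, and the file name in the usage note are all hypothetical, and the actual chart generates its elements from JavaScript rather than from a script like this.

```python
from xml.sax.saxutils import quoteattr

def circular_image_svg(image_url, cx, cy, r, clip_id):
    """Return SVG markup showing a square image cropped to a circle.

    Illustrative only: names and structure are assumptions, not the
    code used in the visualization.
    """
    clip_ref = f"url(#{clip_id})"    # how the image points at its clip shape
    size = 2 * r                     # the square image spans the circle's diameter
    return (
        "<defs>"
        f"<clipPath id={quoteattr(clip_id)}>"
        f'<circle cx="{cx}" cy="{cy}" r="{r}"/>'
        "</clipPath>"
        "</defs>"
        f"<image href={quoteattr(image_url)} "
        f'x="{cx - r}" y="{cy - r}" width="{size}" height="{size}" '
        f"clip-path={quoteattr(clip_ref)}/>"
    )
```

For example, `circular_image_svg("portrait.png", 50, 50, 20, "bubble-clip-0")` returns a fragment that can be placed inside an `<svg>` element. Each bubble needs its own clip id, since the circle's position is part of the clip definition.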

Then I noticed that a video I was watching had a transcript. Oh! That gave me ideas! Did they all have transcripts?

Well, a lot of them did. But not all. I’d have to figure something out later, for the missing transcripts. Enough of the videos had transcripts that it was worth the effort of exploring them.

It had been a while since I’d done any NLP coding, so I spent some time exploring basics in Python and NLTK. Pretty quickly, I had a little script that looped over the transcript files, and generated a simple JSON-formatted report with the most common words in each video.

Console output showing most common words in each video transcript.
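A script along these lines might look roughly like the sketch below. To stay self-contained it swaps NLTK for the standard library's `re` and `collections.Counter`, uses a tiny hand-written stopword list, and takes transcripts from a dictionary instead of looping over files. All of the names and the sample text are made up for illustration.

```python
import json
import re
from collections import Counter

# Tiny illustrative stopword list; NLTK provides a much fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "it", "i", "you", "that"}

def most_common_words(text, n=5):
    """Return the n most frequent non-stopwords in a transcript."""
    # Lowercase, then keep runs of letters (allowing one inner apostrophe).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

def build_report(transcripts, n=5):
    """Map each video id (hypothetical keys) to its most common words."""
    return {video_id: most_common_words(text, n)
            for video_id, text in transcripts.items()}

if __name__ == "__main__":
    # Made-up sample text, not a real transcript.
    sample = {"example-video": "la la la the song goes la and the song ends"}
    print(json.dumps(build_report(sample, n=2), indent=2))
```

With NLTK installed, its tokenizer, stopword corpus, and frequency tools could replace the regular expression, the hand-written list, and the `Counter`.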

So now I could start playing with using these words in the visualization. My first attempts were very basic. But they helped me decide how to deal with the missing transcripts…by showing “no transcript” instead of the video’s most frequent words.

Showing most common words in the videos, and indicating ‘no transcript’ for some.
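The fallback itself is a small decision that fits in a few lines. This is a hypothetical Python helper rather than the chart's own code: it returns the text to display for a video, falling back to "no transcript" when there are no words to show.

```python
def bubble_label(top_words, fallback="no transcript"):
    """Text to show for a video: its top words, or a fallback if none exist.

    Illustrative helper; the name and signature are assumptions.
    """
    if not top_words:        # None or an empty list both mean nothing to show
        return fallback
    return " ".join(top_words)
```

Treating a missing transcript as its own visible state, rather than leaving the label blank, makes it clear that the data is absent and not simply empty.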

While technically interesting, this was visually boring. So I started playing with some other ideas. I ended up creating little word-burst animations. If the video was a song parody, there would also be little musical notes, in addition to words.

Musical notes with most common words in the video’s transcript.

The annotation, on the side, shows a little information on each video. It’s triggered by a mouse event on each bubble, but I’ve also coded it to have a default annotation when the page loads. I think it looks best when the default annotation is pointing to a bubble on the left side of the chart.

Recently, I added thumbnail images (from the videos) to the annotation. I’m still working on that, but the current online version has it.

Video thumbnail in ‘details’ annotation.

The current version of this whimsical data visualization is online! Mouse over the bubbles to trigger the word-bursts.

I’m working on a major addition to the visualization, but it’s not quite ready yet. In the meantime, here are some out-takes and pretty bugs I’ve created so far.


Out-takes and Bugs

Some pretty bugs.
An early version that looked a bit too much like a mushroom cloud. It led me to make the chart longer.

I’ve also written a little about my data collection, from a more technical perspective, if you’re interested. It’s more for my own reference, when I need to update the data set. Be warned: it’s a bit clunky and a work-in-progress.