Cloudy with a Chance of Words: Part 2

Enhancing Your Word Clouds With More Unique Visualizations

Payson Chadrow
6 min read · Aug 14, 2020
Image by author

In this post we’ll build on what we covered in the first part and continue to use the same example text, Moby Dick, from the nltk corpora. This is where things get interesting and more artistic than informational. If you just want a quick and easy word cloud, then Part 1 may be all you need, but if you really want to make your presentation stand out with something more unique, then you’ve come to the right post! For any parameters used within the WordCloud object that you’re unsure about, please refer back to Part 1 of this blog or the wordcloud library documentation.

Using Image Masks to Shape your Cloud

Reminder of the libraries we’ll be using

One of my favorite ways to use word clouds is by masking them with an image that relates to the text. To start, we’ll talk about images that already have a transparent background before moving on to manipulating our own images for more unique masks.

Since we’re using the text of Moby Dick, let’s use the below image, which I’ve already modified, for our mask:

Source: https://www.kindpng.com/imgv/Tiihwoo_transparent-whale-chicken-cartoon-whale-transparent-background-hd/

We can verify the image by opening it with the following command:
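One way to do this, assuming the image is saved locally (the filename in the usage note is a placeholder, and preview_image is a name I made up for this sketch):

```python
from PIL import Image


def preview_image(path, show=True):
    """Open an image, print its size and mode, and optionally display it."""
    img = Image.open(path)
    print(f"size={img.size}, mode={img.mode}")
    if show:
        # Opens the system image viewer; in a notebook you can simply
        # evaluate the returned image instead.
        img.show()
    return img
```

For example, whale = preview_image("whale.png"). A PNG with a transparent background will usually report its mode as RGBA.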

What we need to do next is convert our image into an array so that it can be appropriately interpreted by our WordCloud object.
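A minimal sketch of that conversion with NumPy and Pillow (image_to_mask is a hypothetical helper name, not part of either library):

```python
import numpy as np
from PIL import Image


def image_to_mask(path):
    """Load an image file and return its pixels as a NumPy array."""
    return np.array(Image.open(path))
```

For an RGBA image the resulting array has the shape (height, width, 4), so printing it is a quick way to see what values the background pixels hold.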

Because this image has a transparent background, its “empty” pixels are filled with zeros, which equate to black. Masking in this library maps text onto every pixel that isn’t pure white, so we’ll need to change these zeros to 255 so that they are read as white.

The following function should work for images of all sizes:
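Here is one possible version, written as a vectorized NumPy sketch rather than a value-by-value loop. Note that it replaces every zero it finds, including a single zero-valued channel inside an otherwise colored pixel.

```python
import numpy as np


def transform_zeros(mask):
    """Return a copy of mask with every 0 replaced by 255."""
    out = np.array(mask, copy=True)  # leave the caller's array untouched
    out[out == 0] = 255
    return out
```

Because the comparison runs over the whole array at once, the shape of the image doesn’t matter.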

(Note that this will turn all pure black into pure white, so if you’re trying to use an image that actually has black already in it, you may have to find another method for this transformation. Otherwise you may see blank spaces appear inside your mask.)

As we can see, all of our zeros are now 255, which means we’re ready to create our clouds!

Well… that’s not quite what we wanted, is it? Not all images are ideal for masking. Ideally we want a simple image of the desired shape. Too many colors, or image colors that match the background (or, in this instance, areas of pure black), can cause our masks to turn out like the one above.

Let’s use a simpler image to see the difference.

Source: https://www.etsy.com/sg-en/listing/674584357/whale-style-2-stencil-made-from-4-ply

It looks like our background is already pure white and the shape we want to draw on is pure black. This means that we don’t need to transform this image at all.
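If you’d rather check this than eyeball it, a small inspection helper can list the values in the array and the share of pixels that aren’t pure white. This is only a sketch, and summarize_mask is a name I made up. Keep in mind that JPEG compression can leave near-white values such as 254 in the background, and those would count as drawable.

```python
import numpy as np


def summarize_mask(mask):
    """Return the distinct pixel values and the share of non-white pixels."""
    mask = np.asarray(mask)
    values = np.unique(mask)
    if mask.ndim == 3:
        # A pixel counts as white only when all color channels are 255.
        white = np.all(mask[..., :3] == 255, axis=-1)
    else:
        white = mask == 255
    drawable = 1.0 - white.mean()
    return values, drawable
```

A clean stencil should report only the values 0 and 255, with the drawable share roughly matching how much of the picture the shape fills.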

That looks better!

If you notice, I left out the contour on this mask. Setting a contour_width effectively gives your mask an outline and you can determine how thick of an outline it is. You also have the option to set the color of the contour with contour_color.

Finding Decent Images for Masks

There are a few things to keep in mind when looking for a good mask image:

  • The image has a solid background color
  • Preferably, the background color doesn’t match any part of the shape itself
  • The exact colors are unimportant as long as the above are met

Now if you find an image that doesn’t have a transparent background but has a solid background color, you can proceed in two different ways.

First, if you view the array of your image and identify the color value of your background, you can simply substitute that value for 0 in our above function.
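As a sketch of that idea, the helper below turns one specific background color white. It is a slight variation on a plain value swap: for color images it matches whole pixels rather than individual channel values, which avoids touching pixels that merely share one channel with the background. The name background_to_white and its arguments are my own.

```python
import numpy as np


def background_to_white(mask, background):
    """Return a copy of mask with every background-colored pixel set to white.

    background is a single value for 2-D (grayscale) masks, or an
    (R, G, B) tuple for 3-D (color) masks.
    """
    out = np.array(mask, copy=True)
    if out.ndim == 2:
        out[out == background] = 255
    else:
        rgb = np.asarray(background)[:3]
        # True where all three color channels equal the background color.
        match = np.all(out[..., :3] == rgb, axis=-1)
        out[match, :3] = 255
    return out
```

For example, if printing the array shows the background as [10, 20, 30], you would call background_to_white(mask, (10, 20, 30)).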

You can also use the following site to make the background in your image transparent and then apply the same functions and code as our first example.

That can take a little getting used to, but it does open the door to more options for use as an image mask.

Artistic Word Clouds

We’re going to veer even further from informational and into artistic territory with our final example.

You can map a word cloud onto any image, but this is where we’ll start to consider the color of the image. Sticking with our theme, consider the image below:

Source: https://fineartamerica.com/featured/moby-dick-1-jerry-lofaro.html

Would it surprise you to know that we can turn this into a word cloud? If you’re still reading, I’m guessing not because you’ve already come this far.

We essentially follow the same process for creating our word cloud; the only difference is that we recolor it afterward. In this library, all coloring is performed after the word cloud is generated, so obtaining the image’s colors and applying them to our cloud takes only a few extra lines of code. To improve the overall image, the WordCloud parameters change significantly here, and if you’re applying this method to your own use case you may need to experiment with them further.

Zoom out if the image is difficult to discern.

Whoa! That’s pretty cool! Granted, reading the actual words and recognizing frequency is almost impossible, and the colors in this specific image don’t pop quite as much as we’d like. However, if the artistic side of things interests you, hopefully this sparks some ideas for applying these methods to other images and texts. If so, you may also enjoy the following link, where a boundary map is used to reduce the amount of color washout.

Is This All Word Clouds Have to Offer?

Hopefully, I’ve given you a pretty good idea of what the wordcloud library has to offer. The library does still have other features and abilities that I haven’t covered. If you want to further explore the options within wordcloud then I highly encourage looking into their documentation and experimenting with other parameters. The options end where your creativity does!
