WordCloud Based On Image

Creating wordcloud using different images

Himanshu Sharma
Apr 20 · 3 min read
Wordcloud (Source: By Author)

A wordcloud is a visual representation of the words in a dataset, where each word's size reflects how often it occurs: the more often a word appears, the larger it is drawn. It helps in spotting the dominant words in a dataset at a glance, which can hint at its overall sentiment or themes.
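
To make the frequency-to-size idea concrete, here is a minimal sketch using only the standard library. The function word_sizes and its min_size and max_size parameters are hypothetical names chosen for illustration, and the linear scaling is an assumption; it is not the algorithm the wordcloud library itself uses.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=40):
    """Map each word to a font size that grows with its frequency (illustrative only)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # Scale linearly so that the most frequent word gets max_size.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


sizes = word_sizes("data science data cloud data cloud")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```

In this toy input, "data" appears most often and so gets the largest size, followed by "cloud" and then "science".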

wordcloud is also the name of an open-source Python library that is used to create wordclouds. In this article, we will use an image to give a wordcloud its shape and colors, and also see how to create a wordcloud from the command line.

Let’s get started…

Installing required libraries

We will start by installing wordcloud using pip. The command given below will install it.

pip install wordcloud

Importing required libraries

In this step, we will import the required libraries and functions to create a wordcloud.

import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_gradient_magnitude
from wordcloud import WordCloud, ImageColorGenerator

Creating wordcloud image

Now we will create the wordcloud. You can use any textual dataset and image to make it. I am using the image of a parrot, so the wordcloud takes the shape of the parrot.
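
Before the full listing, it may help to see the mask convention on its own: the library treats pure white (255) pixels in the mask as "masked out", so no words are drawn there. The sketch below uses a tiny made-up array instead of a real image, and the variable name drawable is only for illustration.

```python
import numpy as np

# A tiny 2x3 RGB "image": black pixels stand for the background here.
image = np.array(
    [[[0, 0, 0], [200, 30, 30], [0, 0, 0]],
     [[10, 120, 40], [0, 0, 0], [30, 30, 220]]],
    dtype=np.uint8,
)

mask = image.copy()
# Pixels whose channels sum to 0 are pure black; paint them white so they are masked out.
mask[mask.sum(axis=2) == 0] = 255

# True where words may be drawn (at least one channel differs from 255).
drawable = (mask != 255).any(axis=2)
print(drawable)
```

Only the three colored pixels remain available for words. This assumes the source image has a pure black background, as the parrot image does.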

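# directory containing the text and image files
# (assumed here to be the current working directory)
d = os.getcwd()
# load wikipedia text
text = open(os.path.join(d, 'wiki_rainbow.txt'), encoding="utf-8").read()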
# load image. This has been modified in gimp to be brighter and have more saturation.
parrot_color = np.array(Image.open(os.path.join(d, "parrot1.jpg")))
# subsample by factor of 3. Very lossy but for a wordcloud we don't really care.
parrot_color = parrot_color[::3, ::3]
# create mask: white is "masked out"
parrot_mask = parrot_color.copy()
parrot_mask[parrot_mask.sum(axis=2) == 0] = 255
# some finesse: we enforce boundaries between colors so they get less washed out.
# For that we do some edge detection in the image
edges = np.mean([gaussian_gradient_magnitude(parrot_color[:, :, i] / 255., 2) for i in range(3)], axis=0)
parrot_mask[edges > .08] = 255
# create wordcloud. A bit sluggish, you can subsample more strongly for quicker rendering
# relative_scaling=0 means the frequencies in the data are reflected less
# accurately but it makes a better picture
wc = WordCloud(max_words=2000, mask=parrot_mask, max_font_size=40, random_state=42, relative_scaling=0)
# generate word cloud
wc.generate(text)
plt.imshow(wc)
# create coloring from image
image_colors = ImageColorGenerator(parrot_color)
wc.recolor(color_func=image_colors)
plt.figure(figsize=(10, 10))
plt.imshow(wc, interpolation="bilinear")
wc.to_file("parrot_new.png")
plt.figure(figsize=(10, 10))
plt.title("Original Image")
plt.imshow(parrot_color)
plt.figure(figsize=(10, 10))
plt.title("Edge map")
plt.imshow(edges)
plt.show()
Wordcloud (Source: By Author)
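
The recoloring step takes each word's color from the image pixels that lie beneath it. As a rough sketch of that idea, the snippet below averages the RGB values over a rectangular patch. Note that mean_color is a hypothetical helper written for this illustration, not part of the wordcloud API, and the library's own implementation may differ in detail.

```python
import numpy as np


def mean_color(image, top, left, height, width):
    """Average RGB color of a rectangular patch, as a tuple of ints."""
    patch = image[top:top + height, left:left + width]
    return tuple(int(c) for c in patch.reshape(-1, 3).mean(axis=0))


# Toy 4x4 image: left half red, right half blue.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, :2] = (255, 0, 0)
image[:, 2:] = (0, 0, 255)

print(mean_color(image, 0, 0, 4, 2))  # patch entirely inside the red half
print(mean_color(image, 0, 1, 2, 2))  # patch straddling both halves
```

A word sitting entirely on the red half comes out red, while one that straddles both halves gets a blend of the two. This is why enforcing boundaries between colors in the mask, as done above with the edge map, keeps the result from looking washed out.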

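Here you can clearly see how we created a wordcloud in the shape of a parrot. We can also create wordclouds from the command line using wordcloud_cli. In the example below, --text is the input text file, --mask is the image that defines the shape, and --imagefile is the output file to write (the output file name used here is only an example).

wordcloud_cli --text alice.txt --mask alice_mask.png --imagefile alice_wordcloud.png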
Wordcloud (Source: By Author)

This is how we can create a wordcloud image using the wordcloud library. Go ahead and try it, and let me know your comments in the response section.

This article is in collaboration with Piyush Ingale.

Before You Go

Thanks for reading! If you want to get in touch with me, feel free to reach me at hmix13@gmail.com or through my LinkedIn profile. You can view my GitHub profile for different data science projects and package tutorials. Also, feel free to explore my profile and read the other articles I have written on data science.

Written by Himanshu Sharma

An Aspiring Data Scientist passionate about Data Visualization with an Interest in Finance Domain.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
