“Data Visualization”: A Beaten-Up Topic with a Twist - Ep3 | Word Cloud

Mainak Mitra
3 min read · Nov 3, 2023

Photo by Natalia Y. on Unsplash

Level: Beginner to Intermediate

Background and Context: This is episode three of the series. We will focus on word clouds and how different charting packages stand up to the test. In this episode, I will skip the code for generic package imports. For more background and context, please refer to Episode 1.

When generating word clouds, all three libraries (Matplotlib, Seaborn, and Plotly) rely on the external wordcloud package for the core generation. Matplotlib offers a straightforward canvas to display the resulting image. Seaborn, built atop Matplotlib, has no word cloud function of its own; its built-in styles affect only the surrounding figure, not the cloud itself. Plotly, renowned for its interactivity, doesn’t natively support word cloud creation either, but it can show the generated image in an interactive figure with zoom and pan. In essence, the generation step is the same in all three cases, and the choice between these libraries hinges on the desired level of styling and interactivity.

# Sample text data
text = """
In a realm where numbers reign and data flows,
Where insights hide and complexity grows,
Emerges a craft, both ancient and wise,
The artful dance of data viz, before our eyes.
Columns and rows, in spreadsheets confined,
Seek liberation, a canvas to find.
With brushes of color, shapes, and light,
Data takes flight, dazzling, bright.
Bar graphs rise like city skylines tall,
Pie charts spin, a colorful ball.
Heatmaps glow with intensity deep,
While scatter plots secrets softly keep.
Tales of growth, decline, and trend,
Narratives of beginnings and end.
In this dance, data finds its voice,
Revealing stories, giving us choice.
For in this union of art and math,
Lies the power to illuminate the path.
To question, to learn, to understand, to act,
Data visualization, a pact.
To bridge the chasm, wide and vast,
Between present, future, and the past.
So here we stand, in awe and amazement,
Witnessing the dance of data visualization's statement.
"""

from wordcloud import WordCloud
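A word cloud draws frequent words larger, so it can help to look at the raw counts first. The snippet below is a rough, standard-library-only approximation of that counting step, using a short excerpt of the poem. The wordcloud package does its own tokenizing (it also drops common stopwords and merges some plurals by default), so treat this as a sketch of the idea, not a reproduction of its internals.

```python
import re
from collections import Counter

# A few lines borrowed from the sample poem
excerpt = """
In a realm where numbers reign and data flows,
In this dance, data finds its voice,
Data takes flight, dazzling, bright.
"""

# Lowercase the text, split it into words, and count occurrences
words = re.findall(r"[a-z']+", excerpt.lower())
counts = Counter(words)

# The most frequent words are the ones a word cloud would draw largest
print(counts.most_common(3))
```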

Matplotlib

Code

import matplotlib.pyplot as plt

# Build the word cloud image from the sample text
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)

# Show the image on a Matplotlib figure and hide the axes
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

Observations

  • Flexibility w.r.t Customization: Limited
  • OOTB Support: Yes
  • Code Complexity: Low
  • Interactive/Static Visualizations: Static

Seaborn

Code

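import seaborn as sns

# Apply a Seaborn style to the Matplotlib figure (the cloud image itself is unchanged)
sns.set_style("whitegrid")

# Build the word cloud image from the sample text
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)

# Display it with Matplotlib, as before
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()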

Observations

  • Flexibility w.r.t Customization: Limited
  • OOTB Support: Yes
  • Code Complexity: Low
  • Interactive/Static Visualizations: Static

Plotly

Code

import plotly.graph_objects as go

# Generate a word cloud
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)

# Convert to a pixel array and wrap it in a Plotly figure for interactive display
fig = go.Figure(go.Image(z=wordcloud.to_array()))
fig.update_layout(title="Word Cloud in Plotly")
fig.show()
# Export a static PNG (requires the kaleido package) and display it inline in a notebook
fig.write_image("figure.png")
from IPython.display import Image
Image(filename='figure.png', width=800, height=600)

Observations

  • Flexibility w.r.t Customization: Limited
  • OOTB Support: Yes
  • Code Complexity: Medium
  • Interactive/Static Visualizations: Interactive

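Bringing it all together: Comparison Matrix

Summarizing the observations above:

  • Flexibility w.r.t Customization: Matplotlib: Limited | Seaborn: Limited | Plotly: Limited
  • OOTB Support: Matplotlib: Yes | Seaborn: Yes | Plotly: Yes
  • Code Complexity: Matplotlib: Low | Seaborn: Low | Plotly: Medium
  • Interactive/Static Visualizations: Matplotlib: Static | Seaborn: Static | Plotly: Interactive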

Mainak Mitra

Technical leader | AI, Analytics, BI, Data Engineering (Ex Google, Deloitte, Cisco, IBM, Multiple Startups) | MIT, Berkeley, Stanford, PMP, CSPO certified