Visualization of customer feedback (reasons behind feedback): Word Cloud

Published in

IBM Data Science in Practice

3 min read · Apr 22, 2019

Problem statement: We collect feedback from customers, analyze it, and conclude by reporting that we received x% positive feedback, y% negative feedback, and so on.

What we really want to understand are the reasons behind the positive feedback and every other kind of feedback.

This article focuses on visualizing the feedback from customers who gave negative reviews.

On the technology side, I use the "wordcloud" Python library and an airline industry dataset from Kaggle.

In the airline industry, common feedback topics are flight delays, baggage issues, hospitality, etc.

For one airline, delays may be the major issue; for another, it may be baggage. Here, we try to represent those reasons in the form of a word cloud.

Sample data is shown in the image below.

screenshot of a Jupyter notebook looking at the airline dataset

As usual, data is pre-processed before applying analytics. I will skip data pre-processing details and focus on word cloud generation.
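The pre-processing is not covered here, but the later snippets rely on review text that has already been reduced to a single part of speech. As a rough sketch of that step, assuming the tokens have already been tagged with Penn Treebank tags (for example by NLTK's `pos_tag`), a filter could look like the following. The helper name `keep_tags` and the sample tokens are hypothetical and are not taken from the original notebook.

```python
from typing import Iterable, Tuple

# Penn Treebank tag prefixes: NN* = nouns, JJ* = adjectives, VB* = verbs
NOUNS_AND_ADJECTIVES = ("NN", "JJ")
VERBS = ("VB",)


def keep_tags(tagged: Iterable[Tuple[str, str]], prefixes: Tuple[str, ...]) -> str:
    """Return a space-joined string of the tokens whose tag starts with one of `prefixes`."""
    return " ".join(word for word, tag in tagged if tag.startswith(prefixes))


# Hypothetical tagged review; in practice the pairs would come from a POS tagger.
tagged_review = [
    ("flight", "NN"), ("was", "VBD"), ("delayed", "VBN"),
    ("and", "CC"), ("the", "DT"), ("staff", "NN"), ("rude", "JJ"),
]

nouns_adj_text = keep_tags(tagged_review, NOUNS_AND_ADJECTIVES)
verbs_text = keep_tags(tagged_review, VERBS)
print(nouns_adj_text)
print(verbs_text)
```

Strings like these, one per airline, are the kind of input that `WordCloud.generate` expects.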

Creating a word cloud in Python takes only a few lines of code:

from wordcloud import WordCloud

wc = WordCloud(stopwords=stop_words, background_color="white", colormap="Dark2",
               max_font_size=150, random_state=42)
wc.generate(data_nouns_adj.reviews[c])

I picked the words that are nouns and adjectives in the customer reviews. Let us generate a word cloud from those words, as shown below.

import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = [16, 6]
airline_name = ['virginamerica', 'united', 'southwest', 'delta', 'usairways']

# print(data.columns)  # from dtm pickle file
# print(data_clean)

# Create one subplot for each airline
for index, c in enumerate(data.columns):
    # print(c)
    wc.generate(data_nouns_adj.reviews[c])

    plt.subplot(3, 4, index + 1)
    plt.imshow(wc, interpolation="bilinear")
    plt.axis("off")
    plt.title(airline_name[index])

plt.show()

set of word clouds from each airline from the dataset

This did not give the insight my use case required: I wanted to understand the main pain points shared by customers.

As a next attempt, I picked the verbs instead and generated word clouds from them. This gave me better insight into the customer reviews. As the image below shows, for some airlines flight cancellation is the main pain point, while for others it is delay.

(We can club the keywords "delayed" and "waiting" together, since both tend to mean the same thing.)


import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = [16, 7]
airline_name = ['virginamerica', 'united', 'southwest', 'delta', 'usairways']

print(data.columns)  # from dtm pickle file
print(data_clean)

# Create one subplot for each airline
for index, c in enumerate(data.columns):
    print(c)
    wc.generate(data_verb.reviews[c])

    plt.subplot(3, 4, index + 1)
    plt.imshow(wc, interpolation="bilinear")
    plt.axis("off")
    plt.title(airline_name[index])

plt.show()

set of generated word clouds per airline in the dataset
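The note about clubbing "delayed" and "waiting" together can be sketched as a small counting step that runs before the cloud is drawn. This is only an illustration: the `SYNONYMS` mapping, the `merged_frequencies` helper, and the token list are made up for the example and are not part of the original notebook.

```python
from collections import Counter

# Hypothetical mapping of near-synonyms to one canonical keyword.
SYNONYMS = {"waiting": "delayed", "waited": "delayed", "delay": "delayed"}


def merged_frequencies(tokens, synonyms=SYNONYMS):
    """Count tokens after replacing each synonym with its canonical form."""
    return Counter(synonyms.get(token, token) for token in tokens)


# Made-up verb tokens standing in for one airline's reviews.
tokens = ["delayed", "waiting", "cancelled", "waiting", "delay", "lost"]
freqs = merged_frequencies(tokens)
print(freqs.most_common(2))
```

A dictionary of counts like `freqs` can be passed to the wordcloud library's `generate_from_frequencies` method instead of calling `generate` on raw text, so the merged keyword is drawn as a single, larger word.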

The word cloud is a good visualization technique for understanding text data. In this technique, the size of each word indicates how frequently it occurs in the text, and therefore its significance.
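To make the link between frequency and size concrete, here is a minimal sketch that counts words and scales each count against the most frequent word. It only illustrates the idea; the wordcloud library applies its own scaling and layout rules, and the `relative_weights` helper and sample text below are assumptions made for this example.

```python
from collections import Counter


def relative_weights(text, stop_words=frozenset()):
    """Map each non-stop word to its count divided by the highest count."""
    words = [w for w in text.lower().split() if w not in stop_words]
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}


# Made-up text standing in for one airline's reviews.
sample = "delayed delayed delayed cancelled cancelled the the the the lost"
weights = relative_weights(sample, stop_words={"the"})

# Print the words from heaviest to lightest.
for word, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{word:<10} {weight:.2f}")
```

The most frequent word gets a weight of 1.0 and would be drawn largest. Note that without the stop word set, "the" would dominate the cloud, which is why the `stopwords` argument matters.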

Business Use Cases for Word Clouds

1. Finding customer pain points — and opportunities to connect

2. Understanding how your employees feel about your company

3. Identifying new SEO terms to target

More details can be found here:

https://www.boostlabs.com/what-are-word-clouds-value-simple-visualizations/

IBM Code Content for Developers

Home

Let's code something amazing. More than 100 open source programs, a library of knowledge resources, Developer Advocates…

developer.ibm.com

My other article

sms analysis to extract offers given by merchandise

Written by Rajesh Gudikoti