Brand Reputation Measurement using Social Media
Use a codeless approach to track in real time what your stakeholders have to say about you
Author: Konstantin Pikal
“Your brand is what others say about you when you are not in the room”.
This famous quote, attributed to Jeff Bezos, Amazon’s founder, is one of our favorites. It is not without criticism, but it is an apt way to approach the measurement of branding.
Building a brand is part of every marketer’s job description. But how do you do it? And above all, how do you measure what you have built (and hopefully are still building)?
How do you analyze what your stakeholders (e.g., customers, media, competitors) have to say about you? One place to look is social media, one of the channels where people “are talking about us”.
The Brand Reputation Tracker
A research team led by Prof. Ronald Rust of the University of Maryland has developed a new way to analyze social media to understand a brand’s reputation. Unlike other well-known brand metrics, such as BAV or Interbrand, this tracker uses real-time social media data and measures the occurrence of three important drivers of brand reputation: Value, Brand and Relationships. Inspired by Rust et al.’s work, we will construct an interpretable tracker with a codeless approach using KNIME Analytics Platform. In this tutorial, we focus on the Brand driver and its sub-drivers: Cool, Exciting, Innovative, and Social Responsibility.
The original Real-Time Brand Reputation Tracker was explained in: Rust, R. T., Rand, W., Huang, M.-H., Stephen, A. T., Brooks, G., & Chabuk, T. (2021). Real-Time Brand Reputation Tracking Using Social Media. Journal of Marketing, 85(4), 21–43. https://doi.org/10.1177/0022242921995173.
Open KNIME and load the workflow — Where to find it
You can find the workflow in the Machine Learning and Marketing Space on the KNIME Community Hub along with other helpful workflows. This article will focus mostly on the “Text Mining Replication of Brand Reputation Tracker” workflow.
What the Brand Reputation workflow does
- Get tweets via Twitter API
- Clean Twitter data
- Prepare the tweets for text-processing
- Tag the text based on Brand Reputation dictionaries
- Calculate Brand Reputation scores
- Visualize Brand Reputation over time
1. Get tweets via Twitter API
First, we need some data to run the brand reputation tracker on. We suggest you get developer access to the Twitter API. It sounds complicated, but it really isn’t: it simply gives you access to the tweets that you will later text-mine.
How to get your Twitter API
Twitter offers an API (Application Programming Interface) from which we can retrieve the data. At the time of writing it is free, but it might become a paid service soon. To begin, sign up for a Twitter account; if you already have one, you can skip this step. You will be asked for some basic information and will need to verify your email address. Afterwards, you will be asked to configure a couple of things for your developer account: your country (in our case, Italy) and your use case (if you are following a course, select “Student”).
You will also need to verify your account with a phone number. Without this step, you will not have access to the Developer Tools at Twitter. Finally, you will be asked to agree to the “Developer agreement & policy”.
Set-up the Twitter API
In your Twitter developer portal, you will have to get four things: the API key, the API secret, the Access Token and the Access Token Secret. Those are the credentials that you will need to be able to connect to Twitter and retrieve tweets.
Twitter API Connector node
Right-click the node and select “Configure” to open its configuration window. Add your personal Twitter credentials (API key, API secret, Access Token and Access Token Secret) as they appear in your Twitter developer account.
Twitter Search node
Next, we need to get tweets that were written about a certain brand. We do this by searching for the brand’s Twitter handle. In our example, we use “@amazon”, which returns the tweets that mention Amazon. The number of rows that you can request depends on the access level of your Twitter API. For example, the Twitter v2 rate limit is 900 tweets per look-up, after which you will have to wait 15 minutes. In practice, the maximum number of tweets that you can get in one go ranges between 15 and 16 thousand. Note that we excluded user profile images, because they would slow down the execution. If you are interested in profile images, just add them to the fields selection.
2. Clean Twitter data
Before we start analyzing the data, some cleaning is needed, and the workflow does it for us. It excludes retweets and keeps only tweets in English (this is important because our text-mining dictionary is only in English).
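For readers who want to see the logic behind the cleaning step, here is a minimal Python sketch of the same two filters. It is not how the KNIME workflow is implemented; the record layout and the field names (`text`, `lang`, `is_retweet`) are assumptions made for illustration.

```python
# Hypothetical tweet records; the field names are assumptions for illustration,
# not the column names produced by the KNIME Twitter Search node.
tweets = [
    {"text": "Loving the new @amazon delivery lockers!", "lang": "en", "is_retweet": False},
    {"text": "RT @someone: @amazon is so innovative", "lang": "en", "is_retweet": True},
    {"text": "@amazon la consegna è in ritardo", "lang": "it", "is_retweet": False},
]


def clean_tweets(records):
    """Keep only original (non-retweet) tweets written in English."""
    return [r for r in records if not r["is_retweet"] and r["lang"] == "en"]


cleaned = clean_tweets(tweets)
print(len(cleaned))  # 1
```

Only the first record survives: the second is a retweet and the third is not in English.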
3. Prepare Tweets for Text Processing
Now that we have cleaned our data, the most important part begins. We are going to work with tweet texts and extract insights using the KNIME Textprocessing extension.
Preprocessing
We start off with pre-processing. For this, we first must convert Strings (e.g., the text of the tweets) into the Document data type.
We need the Document data type to be able to perform text-mining operations in KNIME. In the first step, we stem all our words to make them easier to interpret for the machine. For example, “exciting” becomes “excit” and “inspiring” becomes “inspir”. After KNIME has done this for us, our documents are fed into the Dictionary Tagger node.
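To illustrate what stemming does, here is a deliberately simplified Python sketch. The function `toy_stem` is a hypothetical suffix stripper written for this example; it is not the stemming algorithm used by KNIME and will get many words wrong, but it reproduces the examples mentioned above.

```python
import re


def toy_stem(word):
    """Very rough suffix stripping for illustration only.

    This is NOT a real stemming algorithm such as Porter's; it only
    handles a few common English endings.
    """
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip the suffix if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    if word.endswith("y") and len(word) > 3:
        return word[:-1] + "i"
    return word


def preprocess(text):
    """Lowercase the text, split it into alphabetic tokens, and stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [toy_stem(t) for t in tokens]


print(preprocess("Exciting and inspiring!"))  # ['excit', 'and', 'inspir']
print(toy_stem("trendy"))                     # trendi
```

Because the dictionaries contain stemmed entries, the tweets have to be stemmed in the same way before they can be matched.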
4. Tag the text based on Brand Reputation dictionaries
The Dictionary Tagger takes two inputs: the documents to be tagged and a dictionary containing all the relevant words. In its configuration, we specify which tag applies to the words of each dictionary. For example, “trendi” and “hip” are part of the positive “Cool-dictionary”, whereas “ancient” and “lame” are part of the negative “Cool-dictionary”. When the tagger finds a word in the document that is also in a dictionary, e.g., “modern”, it tags that word with the corresponding tag (FTB-A).
You might have noticed that in the paragraph above we used a tag type called “FTB” and one of its values (A). This is because KNIME does not have a custom tag type for brand reputation drivers, so we reuse an existing tag type and assign one of its values to each dictionary. Next, we create a “bag of words”. A bag of words is simply a list of all single words occurring in the dataset.
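The tagging logic can be sketched in a few lines of Python. This is only an illustration of dictionary-based tagging, not the Dictionary Tagger node itself. The word lists contain just the examples mentioned above, and the assignment of the tag value “ADV” to the negative Cool dictionary is an assumption made for this example.

```python
# Example dictionary entries taken from the text (already stemmed).
COOL_POSITIVE = {"trendi", "hip", "modern"}
COOL_NEGATIVE = {"ancient", "lame"}

# Tag value -> dictionary. "A" for the positive Cool dictionary follows the
# text; using "ADV" for the negative one is an assumption for illustration.
DICTIONARIES = {"A": COOL_POSITIVE, "ADV": COOL_NEGATIVE}


def tag_terms(stemmed_tokens):
    """Return (term, tag) pairs for every token found in one of the dictionaries."""
    tagged = []
    for token in stemmed_tokens:
        for tag, words in DICTIONARIES.items():
            if token in words:
                tagged.append((token, tag))
    return tagged


tokens = ["their", "app", "look", "modern", "but", "support", "is", "lame"]
tagged = tag_terms(tokens)
print(tagged)  # [('modern', 'A'), ('lame', 'ADV')]
```

Tokens that appear in no dictionary simply receive no tag, which is what allows the untagged words to be filtered out afterwards.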
5. Calculate Brand Reputation scores
We now convert our tags back to strings and keep only the words in our documents that have been tagged by our dictionaries, so that we have less data to process. After filtering out the words that do not have any tags using the Row Filter node, we use the TF node to count the occurrences of each term in each document. This gives us a table showing the count of a specific term in any tweet. If you look closely, you will see that every tag/tweet combination has its own row. This means that if we have two tags in a tweet, we will have two rows. We will later use a Pivoting node to sum up tweets and tags.
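The resulting table layout, with one row per tweet/tag combination, can be sketched in Python as follows. This is a simplification that counts tagged terms per tweet and tag rather than reproducing the TF node; the tweet IDs and the `(tweet_id, term, tag)` layout are hypothetical.

```python
from collections import Counter

# Hypothetical tagged terms as (tweet_id, term, tag) triples.
tagged_terms = [
    (1, "modern", "A"),
    (1, "hip", "A"),
    (1, "lame", "ADV"),
    (2, "ancient", "ADV"),
]


def count_tags_per_tweet(rows):
    """Count tagged terms for every (tweet, tag) combination, one output row each."""
    counts = Counter((tweet_id, tag) for tweet_id, _term, tag in rows)
    return [
        {"tweet_id": tweet_id, "tag": tag, "count": n}
        for (tweet_id, tag), n in sorted(counts.items())
    ]


tag_counts = count_tags_per_tweet(tagged_terms)
for row in tag_counts:
    print(row)
# {'tweet_id': 1, 'tag': 'A', 'count': 2}
# {'tweet_id': 1, 'tag': 'ADV', 'count': 1}
# {'tweet_id': 2, 'tag': 'ADV', 'count': 1}
```

Tweet 1 contains terms with two different tags, so it appears in two rows.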
To group our data by time, we must extract date and time fields. In our case we group by month, but the right granularity depends on the data that you have collected; the workflow on the Hub aggregates data by day and hour. We then reshape the data so that we end up with the different tag frequencies in the columns and the time information in the rows.
After this, we also handle any missing values by replacing them with 0.
It is worth mentioning that the column names correspond to the tag values that we used during the dictionary tagging process (e.g., A, ADV, etc.). Therefore, we rename the columns with the names of the brand sub-drivers (cool, exciting, innovative, social responsibility).
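The grouping, pivoting, missing-value handling and renaming described above can be sketched together in Python. The input rows, the timestamp format and the tag-to-column mapping in `TAG_NAMES` are assumptions for illustration; the workflow performs these steps with KNIME nodes.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical per-tweet tag counts with the tweet's creation timestamp.
rows = [
    {"created_at": "2022-01-15T10:30:00", "tag": "A", "count": 2},
    {"created_at": "2022-01-20T08:00:00", "tag": "ADV", "count": 1},
    {"created_at": "2022-02-03T17:45:00", "tag": "A", "count": 1},
]

# Assumed mapping from tag values to readable sub-driver column names.
TAG_NAMES = {"A": "Cool_Positive", "ADV": "Cool_Negative"}


def pivot_by_month(records):
    """Sum tag counts per month: time in the rows, tags in the columns.

    Every month starts with 0 in each column, so combinations that never
    occur end up as 0 instead of missing values.
    """
    table = defaultdict(lambda: {name: 0 for name in TAG_NAMES.values()})
    for r in records:
        month = datetime.fromisoformat(r["created_at"]).strftime("%Y-%m")
        table[month][TAG_NAMES[r["tag"]]] += r["count"]
    return dict(sorted(table.items()))


monthly = pivot_by_month(rows)
print(monthly)
# {'2022-01': {'Cool_Positive': 2, 'Cool_Negative': 1},
#  '2022-02': {'Cool_Positive': 1, 'Cool_Negative': 0}}
```

Changing the `strftime` pattern (for example to `"%Y-%m-%d"`) would switch the aggregation from months to days.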
Net and average scores
When you look at the table, you will see that there are positive and negative columns for each sub-driver. Using a series of Math Formula nodes, we subtract the negative column from the positive column for each sub-driver to obtain the net scores. We then average the net scores over the four brand sub-drivers to obtain the “Brand Driver” average. If we inspect the output table, we will now have five columns: Cool Net (obtained by subtracting Cool_Negative from Cool_Positive), Innovative Net, Exciting Net, Soc. Resp Net, and Brand Driver, which is the average of the four net scores.
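The arithmetic behind the net and average scores is simple enough to show in a short Python sketch. The column names and the example counts are made up for illustration; in the workflow, the same calculation is done with Math Formula nodes.

```python
# Assumed column-name prefixes for the four brand sub-drivers.
SUB_DRIVERS = ["Cool", "Exciting", "Innovative", "Soc_Resp"]


def add_net_and_brand_scores(row):
    """Add a net score per sub-driver and the average 'Brand_Driver' score."""
    scored = dict(row)
    for driver in SUB_DRIVERS:
        scored[f"{driver}_Net"] = row[f"{driver}_Positive"] - row[f"{driver}_Negative"]
    net_scores = [scored[f"{driver}_Net"] for driver in SUB_DRIVERS]
    scored["Brand_Driver"] = sum(net_scores) / len(SUB_DRIVERS)
    return scored


# Made-up counts for a single time period.
example = {
    "Cool_Positive": 5, "Cool_Negative": 2,
    "Exciting_Positive": 3, "Exciting_Negative": 4,
    "Innovative_Positive": 6, "Innovative_Negative": 1,
    "Soc_Resp_Positive": 2, "Soc_Resp_Negative": 2,
}

scored = add_net_and_brand_scores(example)
print(scored["Cool_Net"], scored["Exciting_Net"],
      scored["Innovative_Net"], scored["Soc_Resp_Net"])  # 3 -1 5 0
print(scored["Brand_Driver"])  # 1.75
```

A negative net score, as for Exciting in this example, means that negative terms outnumbered positive ones in that period.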
6. Visualize Brand Reputation over time
Finally, we normalize all our values to make it easier to compare changes over time and across drivers (in case you add new drivers, such as the “relationship driver” and its sub-drivers). To visualize the evolution of the selected sub-drivers over time, we use the Line Plot node. We need to make sure to choose the time dimension on the x-axis and our drivers (depending on how detailed we want our analysis to be) on the y-axis. If you look at our example, you can see that Amazon’s brand has been perceived as less innovative and exciting throughout the year, while the overall perception of its social responsibility seems to have improved throughout 2022.
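The text does not state which normalization the workflow applies. As one common option, here is a Python sketch of min-max normalization, which rescales a series to the range from 0 to 1. Treat the choice of method and the example values as assumptions.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0.

    Min-max scaling is an assumption here; the workflow may use a
    different normalization method.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no variation; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up monthly net scores for one sub-driver.
cool_net_by_month = [3, -1, 5, 1]
normalized = min_max_normalize(cool_net_by_month)
print([round(v, 2) for v in normalized])  # [0.67, 0.0, 1.0, 0.33]
```

After rescaling, sub-drivers with very different raw counts can be drawn on the same y-axis of a line plot.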
Automate brand reputation analysis with no-code
Measuring brand reputation is not a trivial task. While most well-established metrics get the job done, they usually fail to do so in real time. In this article, we introduced a brand reputation tracker that uses real-time social media data to measure the occurrence and trends of one major driver of brand reputation, Brand, and its four sub-drivers: Cool, Exciting, Innovative, and Social Responsibility.
To ensure a fully automated and transparent process, we relied on KNIME capabilities to connect, search and retrieve Twitter data around a chosen brand without a single line of code. Likewise, the no-code steps to process tweet texts, assign tags and visualize how brand reputation changes over time can be reused and extended conveniently beyond the scope of our example.
So now it’s your turn! Get your Twitter API credentials and start your journey into real-world brand research. We are very curious to see your results!