Freeman Goja
WhatsApp Sentiment Analysis
5 min read · Dec 26, 2019

Mining social media conversations with Natural Language Processing techniques to extract useful insights from subjective information is fast becoming popular. On a large scale, businesses use it to understand customers’ sentiments towards their products or services. On a smaller scale, this type of analysis can provide insight into the social behavior and engagement of a small group of people, such as a department or class. In this article, you will learn how this analytical approach can be used to uncover hidden knowledge about the members of a WhatsApp group.

R has great packages for statistical and sentiment analysis. Packages such as gganimate offer compelling visualizations that are self-explanatory and appealing. A full analysis of the WhatsApp group on which this article is based can be found here.

Below is an example of how you can gain useful insights into a WhatsApp group’s chat.

I exported the group’s chats from my phone to my email, downloaded the text file and imported it into RStudio. To protect the identity of the group members, I replaced their names with pseudonyms. As expected, the dataset required a fair amount of cleaning, which was done in three phases. However, I discuss only the cleaning steps relevant to the analysis in this article. The rest of the steps are in the full code on GitHub.

Data cleaning is a major step in every data science project, so I started by removing missing values and deleted messages so that they do not add to a member’s message count. Another important preprocessing step was feature engineering: extracting the hour, day, weekday and month of each chat from the time feature. With this information I was able to visualize the chat distribution over hours, weekdays and months.
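To make the cleaning and feature engineering steps concrete, here is a minimal sketch. It is written in Python with only the standard library and is not the R code used for this analysis. The line format, the deleted-message placeholders and the names parse_chat, LINE_RE and TIME_FORMAT are all assumptions for illustration. Real WhatsApp exports differ by phone, locale and app version, so the pattern would need adjusting.

```python
import re
from datetime import datetime

# Assumed export line shape: "12/26/19, 9:05 PM - Member A: Hello everyone".
# This is an assumption; adjust LINE_RE and TIME_FORMAT to match your own export.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2}, \d{1,2}:\d{2} [AP]M) - ([^:]+): (.*)$")
TIME_FORMAT = "%m/%d/%y, %I:%M %p"

# Assumed placeholder texts that stand in for deleted messages.
DELETED = {"This message was deleted", "You deleted this message"}

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def parse_chat(lines):
    """Return one dict per usable message, with time features added."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            # System notices and continuation lines are skipped in this sketch.
            continue
        stamp, author, text = match.groups()
        text = text.strip()
        if not text or text in DELETED:
            # Empty and deleted messages should not count towards a member.
            continue
        when = datetime.strptime(stamp, TIME_FORMAT)
        rows.append({
            "author": author,
            "text": text,
            "time": when,
            "hour": when.hour,
            "day": when.day,
            "weekday": WEEKDAYS[when.weekday()],
            "month": MONTHS[when.month - 1],
        })
    return rows


# Made-up sample lines, not taken from the real group.
sample = [
    "12/26/19, 9:05 PM - Member A: Hello everyone",
    "12/26/19, 9:06 PM - Member B: This message was deleted",
    "12/27/19, 8:15 AM - Member B: Good morning",
    "12/27/19, 8:20 AM - Member C joined using this group's invite link",
]
rows = parse_chat(sample)
for row in rows:
    print(row["author"], row["hour"], row["weekday"], row["month"])
```

Multi-line messages are dropped here for brevity. A fuller version would append continuation lines to the previous message instead of skipping them.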

One of the things I wanted to find out was how the group’s chat activity has varied over the months since the WhatsApp group started.

Monthly Distribution of Chats

Over the six-month period, you can see that some days were far more engaging than others. Chat activity consistently peaked every 2–3 weeks, which might indicate that the group had highly engaging tasks in those periods. It can also be seen that in every month except October and December, the group went some days without any chat activity.
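Days without any chat activity can be found by walking the calendar between the first and last message and listing the dates that never appear. The following is a rough Python sketch of that idea, not the R code behind the plot. The function name silent_days and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def silent_days(message_dates):
    """List every date between the first and last message with no chats."""
    counts = Counter(message_dates)
    day, last = min(counts), max(counts)
    missing = []
    while day <= last:
        if day not in counts:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Made-up sample: one date per message.
message_dates = [
    date(2019, 12, 1), date(2019, 12, 1),
    date(2019, 12, 2), date(2019, 12, 5),
]
missing = silent_days(message_dates)

# Number of silent days per month, keyed as "YYYY-MM".
silent_by_month = Counter(d.strftime("%Y-%m") for d in missing)
print(missing)
print(dict(silent_by_month))
```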

The essence of forming a social media group is to keep people with a common interest engaged. Therefore, I thought it would be helpful to investigate every member’s contribution to conversations on the platform. The result was as shown below:

Member Contribution to Chat

You can see that although each of the 39 members posted a chat or comment at least once, the top 10 contributed more than 70% of the total chats in the group. Generally, while about a third of the group was very active on the platform, another third was very passive, contributing significantly less than the rest.
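A figure such as "the top 10 contributed more than 70%" comes from counting messages per member and dividing the top counts by the total. Here is a small Python sketch of that calculation, using made-up data and a hypothetical helper named top_share. It is not the R code used for the chart.

```python
from collections import Counter


def top_share(authors, n):
    """Fraction of all messages posted by the n most active members."""
    counts = Counter(authors)
    total = sum(counts.values())
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / total


# Made-up sample: one author entry per message.
authors = ["Member A"] * 6 + ["Member B"] * 3 + ["Member C"] * 1
share = top_share(authors, 2)
print(f"Top 2 members posted {share:.0%} of the messages")
```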

Another thing I thought would be interesting to know was the daily chat pattern, that is, the group’s chat activity over the hours of the day. The next two graphs illustrate that.

Static Plot of Chats Per Hour
Animated Plot of Chats Per Hour

The above graphs show that while the group is active around the clock, members understandably chat less between 12:00 midnight and 8:00 AM. Chats peak sharply at 9:00 AM, corresponding to the start of most office hours, and stay high with marginal fluctuations until 2:00 PM before dipping and then rising, with fluctuations, to another peak at 9:00 PM. A member who rarely checks WhatsApp would do well to check the chats at the times of highest activity to catch up on any trending topic of discussion or updates.
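The hourly picture boils down to a count of messages for each of the 24 hours, from which the busiest hours can be read off. Below is a minimal Python sketch under that assumption, with made-up hours and hypothetical function names. The plots above were produced in R, not with this code.

```python
from collections import Counter


def chats_per_hour(hours):
    """Message count for each hour 0..23, including hours with no chats."""
    counts = Counter(hours)
    return [counts.get(hour, 0) for hour in range(24)]


def busiest_hours(per_hour, k=2):
    """The k hours with the most messages, busiest first."""
    return sorted(range(24), key=lambda hour: per_hour[hour], reverse=True)[:k]


# Made-up sample: the hour (0-23) at which each message was sent.
hours = [9, 9, 9, 21, 21, 14, 3]
per_hour = chats_per_hour(hours)
peaks = busiest_hours(per_hour)
print(per_hour)
print(peaks)
```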

Also worthy of note is the chat activity over the days of the week, which may vary from Monday to Sunday. Below is an illustration of the chats per weekday.

Weekday Distribution of Chats

Some days of the week are more engaging than others, and this could be associated with the group’s schedule. Knowing the activity distribution over weekdays could help in choosing when to start a conversation, depending on whether you want many or few contributions. If the group is planning to introduce advertisements on its platform on a particular day of the week, Thursday would be an ideal choice as it appeared to be the busiest day. It can also be seen that weekends are relatively quiet, which may suggest that this group’s activities happen mostly on weekdays.
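Picking the busiest day and comparing weekdays with weekends can be done from the message dates alone. Here is a short Python sketch of that, with made-up dates and a hypothetical chats_per_weekday helper. It illustrates the idea only and is not the R code behind the chart.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def chats_per_weekday(message_dates):
    """Message count per weekday name, in Monday-to-Sunday order."""
    counts = Counter(d.weekday() for d in message_dates)
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}


# Made-up sample: one date per message.
message_dates = (
    [date(2019, 12, 23)] * 2    # a Monday
    + [date(2019, 12, 26)] * 3  # a Thursday
    + [date(2019, 12, 28)] * 1  # a Saturday
)
per_weekday = chats_per_weekday(message_dates)
busiest = max(per_weekday, key=per_weekday.get)
weekend_total = per_weekday["Saturday"] + per_weekday["Sunday"]
print(per_weekday)
print(busiest, weekend_total)
```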

On a lighter note, people love to use emojis to express themselves on social media, WhatsApp included. I found each member’s favorite emoji, as shown below.

Most Frequently Used Emojis by Every Member
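Finding each member’s favorite emoji amounts to counting emoji characters per author and keeping the most frequent one. The Python sketch below is an approximation and not the R code used here. The character ranges in EMOJI_RE are an assumed rough cover of common emoji rather than a complete definition, skin-tone modifiers are ignored, and multi-character emoji sequences are counted by their individual parts.

```python
import re
from collections import Counter, defaultdict

# Assumption: a rough range of common emoji code points, not a full definition.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Skin-tone modifiers (U+1F3FB to U+1F3FF) are not emoji on their own.
SKIN_TONES = {chr(code) for code in range(0x1F3FB, 0x1F400)}


def favorite_emojis(messages):
    """Map each author to the emoji they used most often."""
    per_member = defaultdict(Counter)
    for author, text in messages:
        for char in EMOJI_RE.findall(text):
            if char not in SKIN_TONES:
                per_member[author][char] += 1
    return {author: counts.most_common(1)[0][0]
            for author, counts in per_member.items()}


# Made-up sample of (author, text) pairs.
messages = [
    ("Member A", "Great job \U0001F602\U0001F602"),    # two "tears of joy"
    ("Member A", "Thanks \U0001F44D"),                 # one "thumbs up"
    ("Member B", "See you \U0001F44D\U0001F3FD"),      # thumbs up with skin tone
    ("Member C", "No emoji here"),
]
favorites = favorite_emojis(messages)
print(favorites)
```

Members who never used an emoji, like Member C in the sample, simply do not appear in the result.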

Now that you know a thing or two about this group, you can use that knowledge to make decisions concerning the group or do further analysis to gain more insights.

I hope you enjoyed this article and learnt something from it. You can find the full code and results on GitHub here.

LinkedIn: https://linkedin.com/in/freemangoja
