Exploratory Data Analysis on Facebook Page Analytics

An example of using Tukey’s EDA Method

J.Lo
Information Visualization

--

“The approach of exploratory data analysis is described as being detective in character. It is a search for clues. Some of the clues may be misleading, but some will lead to discoveries.” — R. Church’s Review on John W. Tukey’s EDA.

Exploratory data analysis (EDA) is an approach to data analysis that applies a variety of largely graphical techniques to discover insights from data. Tukey’s approach is highly visual and draws on the strengths of graphs: their ability to store quantitative data, communicate findings, and reveal new information. It is a powerful method, and in this post I walk through an example of how it can be used.

Data Source: Facebook page data

I selected this data set because it is fairly complex, with a variety of metrics, and because of my previous work experience. Many marketers are obsessed with magic numbers such as ‘number of likes’ and ‘number of followers.’ I would like to use this opportunity to break down some of these misconceptions through the process of EDA.

Tools: Tableau

I imported the data into a single table and conducted my analysis in Tableau.

Process: Hypothesis Statements

Create a set of hypothesis statements that are specific, yet open-ended enough to give your exploration direction.

Hypothesis 1: Most reach is obtained through paid advertising.

I noted the metrics that could be used to test this hypothesis: total reach, organic reach, paid reach, and reach among people who like your page.

Total reach: The number of unique people who saw your posts, regardless of where they saw them. If your post reaches a person both organically and through an ad, that person counts as one for organic reach, one for paid reach, and one for total reach.
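To make the counting rule concrete, here is a minimal sketch using Python sets. The viewer names are invented for illustration and are not part of the page data.

```python
# Hypothetical viewers of a single post (illustrative names, not real data).
organic_viewers = {"ana", "ben", "cho"}  # saw the post in News Feed or on the Page
paid_viewers = {"cho", "dev"}            # saw the post through an ad

organic_reach = len(organic_viewers)
paid_reach = len(paid_viewers)

# Total reach counts each unique person once, however they saw the post,
# so "cho" adds one to organic, one to paid, and only one to total.
total_reach = len(organic_viewers | paid_viewers)

print(organic_reach, paid_reach, total_reach)  # 3 2 4
```

Because of this overlap, total reach can be smaller than organic reach plus paid reach.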

  1. Overall Trend
    (Top Left) I started with a simple line chart of the number of posts over time to get a sense of the overall trend.
  2. Adding another metric
    (Bottom Left) Next, I added a line chart of the sum of reach over time.
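I built these two charts in Tableau, but the aggregation behind them can be sketched in a few lines of Python. The post dates and reach values below are made up, and starting weeks on Sunday is my assumption (Sept 1, 2013 fell on a Sunday).

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical (post date, post reach) rows; not the real page export.
posts = [
    (date(2013, 9, 1), 1200),
    (date(2013, 9, 3), 800),
    (date(2013, 9, 8), 100),
    (date(2013, 9, 10), 150),
    (date(2013, 9, 14), 50),
]

def week_start(day):
    """Return the Sunday that begins the week containing `day` (assumed convention)."""
    return day - timedelta(days=(day.weekday() + 1) % 7)

# One bucket per week: number of posts and summed per-post reach.
weekly = defaultdict(lambda: {"posts": 0, "reach": 0})
for day, reach in posts:
    bucket = weekly[week_start(day)]
    bucket["posts"] += 1
    bucket["reach"] += reach

for start in sorted(weekly):
    print(start, weekly[start])
```

Note that summing per-post reach can count the same person more than once if they saw several posts in the same week, so the weekly sum is an indicator of scale rather than a count of unique people.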

Interestingly, I saw that in one particular week (Sept 1, 2013), a batch of around 19 posts reached around 37,004 users. The following week, the Facebook page published 42 posts, but their reach was very low, at around 5,149 users. I began to wonder what had caused the spike in reach that week, so I went on to investigate whether the posts during that week were paid or organic.
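A quick way to compare the two weeks is to divide reach by the number of posts. This is a rough sketch using the approximate figures quoted above; the dictionary keys are just labels I chose.

```python
# Approximate weekly figures quoted in the text.
weeks = {
    "week of Sept 1, 2013": {"posts": 19, "reach": 37_004},
    "following week": {"posts": 42, "reach": 5_149},
}

# Average reach per post for each week.
reach_per_post = {name: w["reach"] / w["posts"] for name, w in weeks.items()}

for name, value in reach_per_post.items():
    print(f"{name}: about {value:,.0f} reach per post")
```

That works out to roughly 1,950 reach per post in the first week against roughly 120 in the second, a gap of more than fifteen times even though the second week had over twice as many posts.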

Dashboard to address my first hypothesis.

Organic reach: The number of unique people who saw your post in News Feed or on your Page, including people who saw it from a story shared by a friend when they liked, commented on or shared your post, answered a question or responded to an event.
Paid reach: The number of unique people who saw your post through an ad.

3. Dividing it Up
(Top Right) I plotted reach for all posts, broken down into total reach, organic reach, paid reach, and reach among people who like your page. The chart showed that the week of the massive spike in reach was also the week the page owner started using ‘paid advertising’ to promote their posts. That is why the pink line overtook the blue and the purple, showing a massive spike in value.
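The same check can be sketched outside Tableau: for each week, compare paid reach with organic reach and flag the weeks where paid dominates. The weekly totals below are hypothetical and only mimic the shape of the spike described above.

```python
# Hypothetical weekly reach totals (illustrative, not the real workbook values).
weekly_reach = [
    {"week": "2013-08-25", "organic": 900, "paid": 0},
    {"week": "2013-09-01", "organic": 1500, "paid": 6000},
    {"week": "2013-09-08", "organic": 700, "paid": 0},
]

for row in weekly_reach:
    combined = row["organic"] + row["paid"]
    # Share of organic-plus-paid reach that came from ads. The combined figure
    # can exceed total reach because one person may be counted in both.
    row["paid_share"] = row["paid"] / combined if combined else 0.0

# Weeks in which paid reach overtook organic reach.
paid_dominant_weeks = [r["week"] for r in weekly_reach if r["paid"] > r["organic"]]
print(paid_dominant_weeks)  # ['2013-09-01']
```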

But what kind of content achieved such an effect? Although paid promotion must have helped, I still believed the content itself had to be fairly decent.

4. Extending Your Question
(Bottom Right) For this chart, I selected message type and reach. I wanted to understand how much of the paid vs. organic reach came from links, photos, shares, status updates, or videos. It turns out that it was mostly link content, so let’s find out more.
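As a sketch of the breakdown behind this chart, the snippet below groups reach by message type and splits it into organic and paid. The posts and their values are invented; only the category names come from the text.

```python
from collections import defaultdict

# Hypothetical posts with made-up reach values.
posts = [
    {"type": "link", "organic": 400, "paid": 3000},
    {"type": "photo", "organic": 300, "paid": 0},
    {"type": "link", "organic": 250, "paid": 1500},
    {"type": "status", "organic": 120, "paid": 0},
    {"type": "video", "organic": 90, "paid": 200},
]

# Sum organic and paid reach separately for each message type.
reach_by_type = defaultdict(lambda: {"organic": 0, "paid": 0})
for post in posts:
    totals = reach_by_type[post["type"]]
    totals["organic"] += post["organic"]
    totals["paid"] += post["paid"]

# The message type that received the most paid reach.
top_paid_type = max(reach_by_type, key=lambda t: reach_by_type[t]["paid"])

for msg_type, totals in reach_by_type.items():
    print(msg_type, totals)
```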

Hypothesis 2: Dynamic multimedia content attracts the most engagement from users.

My second hypothesis is based on the assumption that people are attracted to interactive content and are more willing to ‘engage’ with it by clicking, liking, or commenting.

Engagement: People engaged is the number of unique people who’ve clicked, liked, commented on or shared your Page posts.
Impressions: The number of times a post from your Page is displayed, whether the post is clicked or not. People may see multiple impressions of the same post. For example, someone might see a Page update in News Feed once, and then a second time if their friend shares it.
Reach: The number of unique people who received impressions of a Page post. The reach number might be less than the impressions number since one person can see multiple impressions.
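The difference between impressions and reach is easy to see with a tiny example. The impression log below is hypothetical: one entry for each time the post was displayed to someone.

```python
# Hypothetical impression log for one post (illustrative names).
impressions_log = ["ana", "ben", "ana", "cho", "ana"]

impressions = len(impressions_log)  # every display counts
reach = len(set(impressions_log))   # each unique person counts once

# Average number of times each reached person saw the post.
frequency = impressions / reach

print(impressions, reach)  # 5 3
```

Here "ana" saw the post three times, so she contributes three impressions but only one to reach.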

  1. Variables
    I used ‘Lifetime of Engaged Users’ and ‘Post Message,’ and color-coded the values by message type. This chart gives an overall idea of which messages are the most popular, that is, which have the top engagement scores.
  2. Content
    The post that had the most engagement said: “Visualizing the countries that request the most information from Facebook. Get the word out…share, tweet, re-tweet.” It was categorized as a ‘share,’ which contradicted my second hypothesis. To understand why, I investigated further by unpacking the meaning of engagement.
Number of Engaged Users for each post. Color coded by message type.

3. Unpacking Metrics — this can only get worse
What is Engagement? It measures the number of unique users who clicked anywhere in the post.
What is ‘Lifetime Talking about This’? It is the number of unique people who created a story by interacting with your Page post. According to Facebook, “Lifetime Talking about This” indicates how popular your content is and can be used to identify trends.

Lifetime Talking about This vs. Number of Engaged Users

In our example, the highest “Lifetime Talking about This” score belongs to a post that is a link, whereas the post with the highest engagement score has a lower “Lifetime Talking about This” score.
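To illustrate how the two metrics can rank posts differently, here is a small sketch. The posts and numbers are hypothetical and were chosen only to mirror the pattern described above; they are not the actual scores from the page.

```python
# Hypothetical posts with made-up metric values.
posts = [
    {"message": "post A", "type": "share", "engaged_users": 320, "talking_about": 25},
    {"message": "post B", "type": "link", "engaged_users": 180, "talking_about": 60},
    {"message": "post C", "type": "photo", "engaged_users": 90, "talking_about": 12},
]

# The top post under each metric need not be the same post.
top_engaged = max(posts, key=lambda p: p["engaged_users"])
top_talked = max(posts, key=lambda p: p["talking_about"])

for p in posts:
    # Talking-about count relative to engaged users, as a rough comparison ratio.
    p["talking_ratio"] = p["talking_about"] / p["engaged_users"]

print(top_engaged["message"], top_talked["message"])  # post A post B
```

Looking at the ratio alongside the raw counts helps separate posts that people merely click on from posts that people go on to create stories about.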

Conclusion:
I can see how marketers can use this metric to discover how some users are recirculating content rather than engaging with it directly on the post.
This figure tells you how popular your content is with your fans.

Evaluate it and see if there’s a trend appearing!

If your ‘Talking About’ figure is decreasing, then you can compare this against your content plan (make sure you have one!) and see if there’s anything you’ve changed recently. For example, you might have started posting videos which aren’t getting much engagement compared to the photos you used to post, which could indicate that it’s time to ditch the videos…
