Analyzing the Lebanese Blogosphere

I have always liked quantifying things and looking at them from a numbers perspective. I also believe that good design is a core part of any display of information, be it a pitch deck, an analysis or just a piece of info you want to communicate: beauty always conveys your message better.

Recently, I got the idea of quantifying and visualizing the Lebanese blogosphere (bear with me).

For the few techies who might read this: I built a little scraper that parsed the last 10,000 posts on LebaneseBlogs.com and dealt beautifully with its pesky infinite scrolling.
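For anyone curious about the infinite scrolling part, here is a minimal sketch of the general idea, not the exact scraper. It assumes the endless feed is really a series of numbered pages loaded in the background, which is a common pattern but an assumption here. The names collect_posts and fake_fetch_page, as well as the post fields, are hypothetical, and the fake fetcher stands in for a real HTTP request so the loop can run offline.

```python
from typing import Callable, Dict, List

Post = Dict[str, object]


def collect_posts(fetch_page: Callable[[int], List[Post]], limit: int) -> List[Post]:
    """Request successive pages until `limit` posts are collected or the feed runs dry."""
    posts: List[Post] = []
    page = 1
    while len(posts) < limit:
        batch = fetch_page(page)
        if not batch:  # an empty page means there is nothing left to load
            break
        posts.extend(batch)
        page += 1
    return posts[:limit]


def fake_fetch_page(page: int) -> List[Post]:
    """Stand-in for a real HTTP request: serves 3 pages of 4 made-up posts each."""
    if page > 3:
        return []
    return [
        {"blog": f"Blog {page}", "title": f"Post {page}-{i}",
         "month": "2015-01", "virality": 10 * i}
        for i in range(4)
    ]


posts = collect_posts(fake_fetch_page, limit=10)
print(len(posts))
```

Passing the fetcher in as a function keeps the stopping logic (enough posts, or an empty page) separate from however the pages are actually downloaded.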

An Overview

I ended up with a data set of over 10k posts, spanning mid-October 2014 to mid-June 2015, with some info attached to each one. I managed to extract the “Virality Score” that LB’s cool algorithm calculates, as well as each post’s title, publishing date (month only) and blog name.

Since an Excel sheet with 10k+ rows doesn't really make sense, I used Tableau (absolutely the best data viz tool) to beautify everything I collected. I ended up with 3 visualizations (vizs) that I thought were insightful, which I will go over and share below.

Beautiful Data

Most Viral Posts for Lebanese Blogs

I first thought about visualizing the whole Lebanese blogosphere, representing each blog by a circle whose size depends on the cumulative virality it achieved during that period.

This means that blogs pushing the highest number of relatively viral posts get the biggest representation.

Also, inside each circle is the name of the blog and its Average virality score (over 50).
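The cumulative figure behind each circle is just a per-blog sum. Here is a small sketch of that calculation, assuming each scraped post is a record with a blog name and a virality score. The field names and the sample rows are made up for illustration and are not taken from the real data set.

```python
from collections import defaultdict

# Made-up sample rows; the real data set has 10k+ of these.
posts = [
    {"blog": "Blog A", "virality": 40},
    {"blog": "Blog A", "virality": 30},
    {"blog": "Blog A", "virality": 20},
    {"blog": "Blog B", "virality": 45},
    {"blog": "Blog C", "virality": 10},
    {"blog": "Blog C", "virality": 5},
]


def cumulative_virality(posts):
    """Sum the post-level virality scores of each blog."""
    totals = defaultdict(int)
    for post in posts:
        totals[post["blog"]] += post["virality"]
    return dict(totals)


totals = cumulative_virality(posts)

# Biggest circle first.
for blog, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(blog, total)
```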

Here’s the link to the actual viz, which is very interactive and dynamic; I really advise you to check it out. Would've loved to embed it, but Medium…

Most Viral Posts for Lebanese Blogs

Highest Virality avg. for Lebanese Blogs

Summing all the post-level virality scores gives us an aggregate picture and ignores individual performances.

So, for my second viz, I represented the average virality of each blog in the set. I ended up with a curve that resembles Mustapha’s ideal logarithmic function for virality scoring.

Here, each blog is an entry on the X axis, and the Y axis shows its average post virality (simply the sum of its post-level virality scores divided by its number of posts).
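As a rough sketch of that calculation, again on made-up rows with hypothetical field names: group the scores by blog, take the mean of each group, then sort from highest to lowest to get the order of the curve.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows, for illustration only.
posts = [
    {"blog": "Blog A", "virality": 40},
    {"blog": "Blog A", "virality": 30},
    {"blog": "Blog A", "virality": 20},
    {"blog": "Blog B", "virality": 45},
    {"blog": "Blog C", "virality": 10},
    {"blog": "Blog C", "virality": 5},
]


def average_virality(posts):
    """Average post virality per blog: sum of scores divided by number of posts."""
    scores = defaultdict(list)
    for post in posts:
        scores[post["blog"]].append(post["virality"])
    return {blog: mean(values) for blog, values in scores.items()}


averages = average_virality(posts)

# Highest average first, which is the left-to-right order of the curve.
ranked = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
for blog, avg in ranked:
    print(blog, avg)
```

In this toy sample, Blog B comes first on average with a single strong post, even though Blog A has the larger total, which is exactly why the two metrics tell different stories.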

Highest Virality avg. for Lebanese Blogs

Visualizing both Metrics

By now you're probably thinking that both metrics are needed to correctly assess a blog. One measures a blog’s ability to publish a high number of successful posts, while the other looks at the blogger’s ability to write highly sharable, high-quality posts (whatever that may mean).

Lucky for us, Tableau is an INSANE piece of software.

My third viz puts both of these metrics into perspective for each blog. The X axis shows cumulative virality and the Y axis shows average virality; each blog is a circle in that matrix.

I’ll let each one of you interpret the data as you see fit, as each person would have a different definition of what a successful blog is. But one thing that I’m sure about is that

Data Does Not Lie.

Wrap-up

I'll be sharing more visualizations relating to this data set and a few other blogs that I scraped/will be scraping.

There are a few things that could make these vizs even more insightful, but unfortunately, the data they need isn't something I can scrape. For example, if I had access to each blog’s categories, I could segment all the vizs by category and look at their monthly trends.

I'd be more than happy to share the tools I used, and even the data set itself, if you're interested. And if you have an idea for a cool graph or an interesting data source I could scrape, feel free to reach out to me on Twitter anytime.

All this data is publicly available; people just have to know where to look and how to get it (at scale/programmatically).

Like what you read? Give Sandro Jazzar a round of applause.
