A Better Way to Identify High-Value Content

Derek Gleason
Nov 13, 2017

Definitions of “high-value” content usually rely on four metrics: website visits, social shares, backlinks, and comments. These metrics, however, live in a vacuum.

That limitation can yield competitive research or content audits that draw few conclusions beyond, “These posts earned the most links” or “Those articles got the most shares.”

Worse still, the content atop research-tool rankings tends to reflect exceptional websites rather than exceptional content.

Showing a client that Vogue, Men’s Fitness, and Glamour own some of the most-shared content on “winter coat trends” speaks mainly to the power of those sites’ brands and distribution networks — not the quality of the content.

Indeed, how many shares would that same content earn on another website? Even if the content quality were high, it would be difficult to separate the influence of the content from that of the brand.

Further, those posts could be the least-shared articles on Vogue or Glamour that week, month, or year, but they would still top content lists. That ambiguity can lead to recommendations as misguided as suggesting Snakes on a Plane holds the clues to replicating Samuel L. Jackson’s $200 million career.

Rather than simply identifying the most viewed, shared, commented, or linked content, it’s far more instructive to identify the content that built — or continues to build — an online brand.

To do that — to identify real high-value content — you need to measure which articles outperformed each site’s average at that point in time.

This is how to do it.

Using Change-Point Analysis to Identify High-Value, Brand-Building Content

Change-point analysis is a statistical method for identifying key shifts in time-ordered data. (See Appendix A for the technical details.)

Change-point analysis can help answer questions like: Did a change occur? How many changes? When did they occur? How certain are we?

In a change-point analysis chart, the data between the minimum and maximum values constitutes a critical growth period. During that time, successive data points consistently outperformed the overall average.

For content marketers, the growth period represents the key brand-building time for a website — when content consistently led to more views, shares, links, or comments.

In absolute terms, the growth-period content may not be the most viewed, shared, linked, or commented upon in site history. For example, a site may currently average 50 comments per post due to an audience built in prior years with posts that earned 10 then 20 then 30 comments.

From a research perspective, the content that created the 50-comment community is more valuable than the content that now exploits it.

This is even truer if a site that once averaged 50 comments now receives only 40.

Change-point analysis is especially useful to dampen the impact of high-performing outliers that fail to spark an enduring trend. It also rewards a subtle undercurrent of progress, like a small but steady stream of comments that flows in without a “viral” catalyst.

A granular analysis of the most successful content during the growth period reveals the themes and tactics that played the greatest role in building an online brand.

That content is a better framework for developing a marketing plan that mimics the strategies that grow a website — not those employed after a website has already grown large.


Case Study: Audience Engagement in SEO Blogs

Here’s the type of question change-point analysis can answer:

What content has built engaged communities in the search engine optimization (SEO) industry?

SEO professionals have a lot to gain from an engaged online community. It’s difficult to differentiate an SEO agency based on website copy, which typically vacillates between platitudes any agency can (and does) claim and technical jargon unfamiliar to outsiders.

A successful blog, then, is often the most influential marketing effort. It has the potential to earn industry-wide recognition that, in turn, attracts clients who use industry notoriety as a proxy for agency quality.

There are many ways to measure website engagement, all of them imperfect. This example assesses blog comments alone. As a single metric, comments have a bit more grip than traffic or social shares, and they’re also tied loosely to the publication date — most comments flow to articles within a few days or weeks of publication.

Posts that perform well in organic search may work in reverse, gaining comments only after months of progress toward Page 1. That’s a limitation of this simplistic approach.

Other limitations include the potential for promotion strategies to drive traffic (and comments) to certain posts; website reorganization that muddles historical comment data; incomplete site crawls or data extraction; and offline factors, such as speaking engagements, that may influence engagement more than the content itself.

Process

I made a semi-random selection of twenty-five blogs from well-known SEO-focused agencies and websites. Here’s the list:

Aleyda Solis, Annielytics, Blind Five Year Old, Bruce Clay, Builtvisible, Distilled, Ghergich & Co., G-Squared Interactive (Glenn Gabe), IloveSEO.net (Gianluca Fiorelli), iPullRank, Local SEO Guide, LunaMetrics, Marie Haynes Consulting, Moz, Portent, Search Engine Roundtable, Seer Interactive, SEO by the Sea, SEO Smarty (Ann Smarty), SEO Theory, Sterling Sky, Stone Temple Consulting, Sugarrae (Rae [Hoffman] Dolan), The SEM Post, and ViperChill.

I avoided SEO software providers (with the exception of Moz) because tool adoption was a potential confounder for community engagement. I also included two media sites — The SEM Post and Search Engine Roundtable — because they tend to cover the same topics as many SEO agencies.

I crawled each site’s blog and pulled comment counts and publication dates for every post with a comments section. The result was more than 30,000 posts spanning 13 years.

I performed a change-point analysis on all posts, plotted a chart for each site, and combined the values from all sites into a single sheet. (That data is available here; for full details of the process, see Appendix B.)

Results

There are plenty of ways to slice this data. I cover only a few high-level opportunities here.

Still, a brief look at the content that rose to the top of the change-point analysis goes much further than traditional content research to answer questions clients actually care about.

Here are five questions that change-point analysis answered:

1. What content was critical for building audience engagement?

Filter by Growth Period “Yes”; Filter Slope “>0”; Sort Slope highest to lowest

At the highest level, this provided a weighted list of the content that had the largest role in creating an engaged online community. (A non-weighted version of this same list would order posts by total comment count, not slope.)
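To make the recipe concrete, here is a rough sketch of the same filter-and-sort in pandas, assuming the combined sheet were exported to a CSV. The file name and column names (publisher, title, growth_period, slope, comments) are placeholders, not the study’s actual headers.

import pandas as pd

# Hypothetical export of the combined sheet; adjust file and column names to your data.
posts = pd.read_csv("all_posts.csv", parse_dates=["publish_date"])

# Growth-period posts with a positive slope, ordered by slope (the weighted list).
top_weighted = (
    posts[(posts["growth_period"] == "Yes") & (posts["slope"] > 0)]
    .sort_values("slope", ascending=False)
    .head(100)
)

# The non-weighted variant: same filter, ordered by raw comment count instead.
top_by_comments = (
    posts[posts["growth_period"] == "Yes"]
    .sort_values("comments", ascending=False)
    .head(100)
)

print(top_weighted[["publisher", "title", "slope", "comments"]])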

These were the top 100 posts:

Some posts were less relevant (like those on Google’s 2012 holiday doodle and Moz’s rebranding), but there were other takeaways:

  • Highly commented posts often question Google’s benevolence or the ethics of black-hat SEO tactics.
  • Foundational resources on tactical execution draw questions from users seeking to solve related problems.
  • Google algorithm updates perform well, allowing users to share their “This is what I’m seeing…” anecdotes (see the caveat on algorithm-update posts below).

A quick look at a single site, LunaMetrics, shows steady growth for several years, punctuated by a handful of posts that generated more than 100 comments each.

What content has led to their greatest successes? Core technical resources that users — and, presumably, potential clients — need to execute digital marketing strategies.

These highly commented LunaMetrics posts address common tracking issues, social media image sizing, and Facebook page management, among other topics.

2. What content may no longer work as well as it used to?

Filter by Growth Period “Yes”; Filter Slope “>50”; Sort by year

Filtering by year revealed industry-wide trends often hidden by the ebb and flow of a single site’s performance. For example, which topics built online engagement in 2010 or 2011 but failed to do so after 2015?

  • Historically, posts on Google algorithm updates generate hundreds of comments. However, they have been less central to brand-building in recent years, perhaps because major updates like Panda and Penguin have been replaced by smaller, more gradual algorithm changes.
  • Similarly, content that addresses how to avoid or manage Google penalties has played a lesser role in the growth of audience engagement.

For example, in April 2012, Google’s “Penguin” update initiated a growth period for Search Engine Roundtable. Industry watchers sought out the media site to gather the latest information — and share experiences — about the change.

That growth may now prove difficult to replicate, even as G-Squared Interactive has built engagement from more nuanced interpretations of recent updates. Still, the effort has required frequent revisions and a more cautious approach — a 2015 update retains the moniker “Phantom.”

3. Can a single piece of content generate lasting change?

Filter by Publisher; Filter by Growth Period “Yes”; Sort by CUSUM value; Check if the most-commented post comes after the lowest CUSUM value

This study uncovered only a single “game-change” post — when the most commented post occurred at the start of the growth period.

A game-change post is strong enough to catalyze a fundamental and long-term shift in community engagement.

Still, high-engagement posts that start a growth period require follow-up content. Otherwise, that engagement will be ephemeral.

  • Portent’s “The Digital Marketing List: 59 Things You Should Be Doing But Probably Aren’t” was the only game-change post in this case study. It’s a pithy blend of novel and (mildly) controversial recommendations. Nearly a decade old, it may offer limited instructive value today.
  • Annielytics’s “Hundreds of Tools for Marketers,” published in 2012, was the second post after the start of the site’s growth period. It has a similar appeal and age-related caveat. The bulk of its value is in a downloadable spreadsheet, which may have made it more useful to readers.

The biggest takeaway? You’re unlikely to build a sustainable brand from a single post: Only one of 25 publishers aligned their most-commented post with the change point that initiated the primary growth period.
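For anyone replicating the game-change check outside a spreadsheet, a minimal sketch follows, assuming a per-post table with publisher, publish_date, cusum, and comments columns (all assumed names). It flags a publisher when its most-commented post sits at the lowest CUSUM value — that is, at the start of the growth period.

import pandas as pd

posts = pd.read_csv("all_posts.csv", parse_dates=["publish_date"])

def is_game_change(site):
    # Order by publication date so row positions follow the timeline.
    site = site.sort_values("publish_date").reset_index(drop=True)
    change_point = site["cusum"].idxmin()      # lowest CUSUM = start of the growth period
    most_commented = site["comments"].idxmax()
    return most_commented == change_point

game_changers = posts.groupby("publisher").apply(is_game_change)
print(game_changers[game_changers].index.tolist())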

4. How are large SEO agencies currently building engagement?

Filter by Publisher, excluding agencies with fewer than 20 employees; Filter by Publication Date after 12/31/2015; Filter by Growth Period “Yes”

What recent content has been the most successful for large SEO agencies? This rules out not just one-person shops but also agencies that haven’t experienced their primary growth period in the last couple of years.

Content published by Stone Temple Consulting and Bruce Clay is the best example of what has worked for large agencies in 2016–17. Their primary growth periods, which began in August 2013 and May 2010, respectively, have continued to grow audience engagement through 2017:

Eight of Bruce Clay’s 12 most commented posts were published in 2016–17, leading to a rapid rise in audience engagement (and a nearly vertical line in their CUSUM graph).

For both sites, their most engaging content covers tactical solutions or research on current issues in SEO — like whether links still provide significant value or how to optimize for Google Home voice search.

5. How long does it take to build an engaged audience?

For each publisher, count posts between the first post and the minimum CUSUM value (=MATCH(MIN(D:D),D:D,0)); compile the values to obtain the mean and median

An understanding of how much work it takes to build an engaged audience is critical to set client expectations. It can also identify shortcuts.

In this study, sites averaged 441 posts before hitting the growth period; the median, a more balanced metric given the publishing frequency of SEO media sites and Moz, was 116.

Based on the median, an SEO agency publishing a post-per-week should expect to see traction — or begin to question its absence — after about two years.

Sites that needed fewer posts to reach their growth period merit further investigation to understand how they earned engagement more efficiently.
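A rough pandas equivalent of that count, assuming per-post rows with publisher, publish_date, and cusum columns (names assumed), might read:

import pandas as pd

posts = pd.read_csv("all_posts.csv", parse_dates=["publish_date"])

def posts_before_growth(site):
    # Mirrors =MATCH(MIN(D:D),D:D,0): the row position of the minimum CUSUM value.
    site = site.sort_values("publish_date").reset_index(drop=True)
    return int(site["cusum"].idxmin())

counts = posts.groupby("publisher").apply(posts_before_growth)
print("mean:", counts.mean(), "median:", counts.median())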


Other Uses for Change-Point Analysis

Change-point analysis has potential applications well beyond this limited case study, which serves primarily as a proof of concept.

A more robust measurement of “high-value” content could employ a rudimentary algorithm to weight all key metrics — traffic (by source), shares, backlinks, and comments — and return a hybrid value for analysis. Adding conversion metrics, if available, is a logical choice.
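As a sketch only, such a hybrid value could start as a linear combination of normalized metrics; the weights and column names below are placeholders to be tuned, not recommendations.

import pandas as pd

posts = pd.read_csv("all_posts.csv")

# Placeholder weights: adjust to reflect what "value" means for a given business.
weights = {"visits": 0.4, "shares": 0.2, "backlinks": 0.2, "comments": 0.2}

# Scale each metric to 0-1 so no single unit dominates, then combine.
for metric in weights:
    peak = posts[metric].max()
    posts[metric + "_scaled"] = posts[metric] / peak if peak else 0.0

posts["hybrid_value"] = sum(
    weight * posts[metric + "_scaled"] for metric, weight in weights.items()
)
# The hybrid value can then feed the same change-point analysis as any single metric.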

A more in-depth analysis would also benefit from annotation of key events in company life cycles — geographic expansions, prominent guest posts, speaking engagements, and other events that may influence metrics more than the content itself.

There are, of course, uses for change-point analysis throughout digital marketing. All help anchor raw metrics to a specific time and site — telling us what each data point meant right then and there.

For every business, that’s better.


Appendix A: How to Run a Change-Point Analysis

A change-point analysis plots a single variable over time by calculating the cumulative sums (CUSUM) of differences between data points and the average.

The resulting CUSUM series always starts and ends at 0, because the differences from the average sum to zero. The point of maximum distance from 0, whether above or below, represents the most significant change point.

A data set may contain multiple change points.

Each change of slope from negative to positive (or vice versa) represents a potential change point. Sharper and longer-lasting changes are more likely to be true change points.

A metric that is still trending upward will have its maximum CUSUM value (0) at the end of the graph, leading to a V- or U-shaped chart. The former represents a rapid rise and the latter more gradual progress.

Change-Point Analysis in a Spreadsheet

A change-point analysis can be performed easily in a spreadsheet. The base formula follows the pattern below, if cell A2 contains the first metric (e.g. social shares).

The CUSUM values are returned in Column B. Column B is then plotted over time to form a change-point analysis graph.

B1 = 0

B2 = B1 + (A2-AVERAGE(A:A))

B3 = B2 + (A3-AVERAGE(A:A))

B4 = B3 + (A4-AVERAGE(A:A))

This sheet has the change-point analysis formula pre-populated for 1,000 rows.
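For anyone working outside a spreadsheet, the same calculation takes a few lines of Python. This is a minimal sketch that assumes the metric is a plain list of numbers.

def cusum(values):
    # Cumulative sum of each value's difference from the overall average.
    # Starts at 0 and, like the spreadsheet version, ends at 0.
    average = sum(values) / len(values)
    totals = [0.0]
    for value in values:
        totals.append(totals[-1] + (value - average))
    return totals

# Example: comment counts for five posts.
print(cusum([10, 20, 30, 50, 50]))  # [0.0, -22.0, -34.0, -36.0, -18.0, 0.0]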

Bootstrapping to Test for Statistical Significance

A simple macro (I would share mine were it not so inelegant) can layer in a bootstrapping process to determine whether the minimum and maximum values represent statistically significant change points.

Bootstrapping randomly rearranges the data to check whether reordered samples generate a chart with greater amplitude than the data set.

For example, in 1,000 bootstrap samples, if the absolute value of the difference between the maximum and minimum CUSUM values is less than that of the data set in 974 tests, the confidence that a change occurred would be 97.4%.

Bootstrapping is an imperfect method because calculating all potential arrangements, even for small amounts of data, can quickly require an impossibly large number of tests. (You can run a bootstrap test multiple times and average the confidence levels.)

Bootstrapping is less necessary for analyses of thousands of data points. In those instances, the bootstrapping process is likely to return 100% certainty.
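A minimal sketch of that bootstrap test, reusing the cusum function from the previous snippet, might look like this:

import random

def bootstrap_confidence(values, samples=1000):
    # Requires the cusum() helper defined above.
    # Share of shuffled samples whose CUSUM range is smaller than the original's.
    original = cusum(values)
    original_range = max(original) - min(original)
    smaller = 0
    for _ in range(samples):
        shuffled = random.sample(values, len(values))  # one random reordering
        trial = cusum(shuffled)
        if max(trial) - min(trial) < original_range:
            smaller += 1
    return smaller / samples  # e.g., 0.974 means 97.4% confidence a change occurred

print(bootstrap_confidence([12, 8, 15, 11, 40, 55, 60, 48]))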

Back to the article


Appendix B: The Case Study Process

For each site, I calculated the CUSUM values and ran a bootstrap analysis. Fourteen sites registered a confidence level above 99.9%, and 19 surpassed the threshold of statistical significance (95%).

Those that fell below the threshold generally had few posts and lacked a pattern to website comments.

Plotting the CUSUM values made it easy to identify the key growth period for each publisher.

There were two ways to plot the change-point analysis charts. Plotting by date seemed more intuitive, but the variable time between posts risked distorting the graphs. (See below.)

Treating the interval between each post as a single unit provided an alternative chart that better reflected the rate of change from post to post. Neither choice affects the CUSUM value, which uses the publication date only to order the data.

I tagged posts between the minimum and maximum values as part of the growth period for each site. I now had a site-by-site list of the most valuable content for increasing blog comments, my proxy for community engagement.

The total comment count provided an absolute measure of value. That information helped set expectations for what a single post could achieve and identified the sites with the largest engaged audiences.

For a relative measure, I calculated the slope between each site’s CUSUM values. While CUSUM values were relative to each site, the slope offered an easy metric for cross-site comparison.
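A rough sketch of those two steps in pandas, assuming per-post rows with publisher, publish_date, and cusum columns (names assumed), and assuming, as in this study, that the minimum CUSUM precedes the maximum:

import pandas as pd

def tag_growth_and_slope(site):
    site = site.sort_values("publish_date").reset_index(drop=True)
    start = site["cusum"].idxmin()   # minimum CUSUM: start of the growth period
    end = site["cusum"].idxmax()     # maximum CUSUM: end of the growth period
    site["growth_period"] = ["Yes" if start <= i <= end else "No" for i in site.index]
    site["slope"] = site["cusum"].diff()   # change in CUSUM from the previous post
    return site

posts = pd.read_csv("all_posts.csv", parse_dates=["publish_date"])
tagged = posts.groupby("publisher", group_keys=False).apply(tag_growth_and_slope)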

Back to the article


Appendix C: CUSUM Charts

