Proof of Concept: Reach does not equal Readership

Nov 14, 2022

In God we trust, all others must bring data.

— W. Edwards Deming

When it comes to the sheer volume, speed and financial impacts of breaking news, the Wall Street desks operated by Dow Jones, Reuters and Bloomberg were far and away the most challenging pressure cookers in journalism.

When I sat on the Reuters hot seat desk, corporate press releases streamed out of dot-matrix printers connected to reams of continuous-feed paper stacked four feet high. Headlines were literally ripped from the press releases, and within seconds flashed up on the ticker scrolling continuously over the New York Stock Exchange trading floor.

At any given moment, those flash headlines moved stock prices by hundreds of millions of dollars as investors made snap buy and sell decisions. And behind those investment decisions were longer-term market bets about the impact of that news on customer acquisition and attrition, earnings, multiples and enterprise valuation.

Which raises the question: If news can inflict material financial impacts on national companies and leading brands, should companies covered by the press have reliable ways to monitor, measure, manage, and at times mitigate long-tail news impacts?

Of course, they should. But to a great extent, they don’t.

Why? Because for decades, as we saw in Turn down the Volume and Tone Deaf, PR has been relying on the volume and tonality of published news coverage to impute the reach and resonance of that news with people. And many monitoring firms, particularly in social media, will take that a step further, imputing brand equity from online chatter.

So, as a first step in reimagining business communication, let’s test the fundamental baseline assumption that imputed “reach” equates to readership. For the purposes of this study, the charts below were generated using data from two sources. News “reach,” the estimated readership of news coverage posted to online news sites, was generated by a leading media monitoring firm.

The corresponding news readership data was provided by Memo, a platform for brands to track the readership of news articles published about them. Memo acquires readership data through partnerships with publications, which in turn receive a percentage of platform subscription revenue. Readership is defined as the number of unique visitors to an article page online within the first seven days of publication.

Here’s the proof

As a starting point, we analyzed the daily volume of published news and the validated readership of that news for seven leading national brands over the course of a 12-month period ending on September 30th, 2022. On the charts below, the actual news “reach” and readership figures were stripped off, in part to protect the innocent but mainly because they are proprietary.

Correlation scores are included on the second set of charts for those of us who keep score. For the purposes of this blog, though, we also can base statistical significance on an inter-ocular traumatic test — just look for the patterns that hit you right between the eyes.

Here we go.

The first chart below shows the imputed reach of news about a Big Tech firm that is closely covered by the national press. In my charts, I tend to visualize news volume — again, what media monitoring calls reach — as gray columns. It reminds me of newspapers. So what are we looking at here? This is the news stream generated by traditional media monitoring for a Big Tech company over the course of a 12-month period. Each column represents the imputed reach for that news on a given day.

I set the vertical axis on this chart at 250 million. Why is 250 million significant? That’s roughly the entire U.S. adult population. Now check out the tallest spikes. According to the media monitoring estimates, on 33 days over the past year, news “reach” exceeded 250 million potential readers. On 15 of those days, the imputed reach based on media monitoring exceeded 300 million. On the biggest news day, imputed reach exceeded 600 million. According to the media monitoring data, not only did every American adult see news about this company on every one of those days, but for that data point to be accurate, we each saw the news coverage more than once.
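The sanity check above is simple arithmetic, and a minimal Python sketch makes it concrete. The 250 million ceiling and the 600 million peak come from the text; the function names and the sample daily figures are hypothetical, since the underlying data are proprietary.

```python
# Rough ceiling used in the text: approximately the U.S. adult population.
US_ADULTS = 250_000_000


def days_over_ceiling(daily_reach, ceiling=US_ADULTS):
    """Count the days on which imputed reach exceeds the plausible audience."""
    return sum(1 for reach in daily_reach if reach > ceiling)


def implied_views_per_adult(reach, ceiling=US_ADULTS):
    """Times each adult would need to see the coverage for the reach figure to hold."""
    return reach / ceiling


# Hypothetical daily reach figures, for illustration only.
sample_reach = [40_000_000, 260_000_000, 310_000_000, 600_000_000, 120_000_000]

print(days_over_ceiling(sample_reach))
print(implied_views_per_adult(600_000_000))
```

Read this way, a single-day reach above 600 million implies that every adult in the country saw the coverage more than twice that day.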

As a reminder, we are testing the conventional wisdom that published news “reach”, measured by circulation or page views, equals readership. We are going to perform two tests. The Top Ten test assesses whether the ten stories with the highest published volume also were among the ten most-read stories. Simple enough.
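For readers who want to run the Top Ten test on their own data, here is a minimal Python sketch. It assumes two equal-length lists of daily values, one for imputed reach and one for actual readership, and it compares days rather than individual stories. The function name and the toy numbers are hypothetical and are not the study's data.

```python
def top_n_error_rate(reach, readership, n=10):
    """Share of the top-n days by imputed reach that miss the top-n days by readership."""
    if len(reach) != len(readership):
        raise ValueError("reach and readership must cover the same days")
    if n < 1 or n > len(reach):
        raise ValueError("n must be between 1 and the number of days")

    days = range(len(reach))
    # Rank day indexes from highest to lowest value and keep the first n.
    top_by_reach = set(sorted(days, key=lambda d: reach[d], reverse=True)[:n])
    top_by_reads = set(sorted(days, key=lambda d: readership[d], reverse=True)[:n])

    hits = len(top_by_reach & top_by_reads)
    return (n - hits) / n


# Hypothetical six-day example, using a Top Three test to keep it short.
toy_reach = [90, 80, 70, 10, 20, 30]
toy_reads = [5, 100, 4, 90, 80, 3]

print(top_n_error_rate(toy_reach, toy_reads, n=3))
```

In the toy data only one of the three highest-reach days is also a top-three readership day, so the error rate is two out of three. With n=10, two matching days out of ten would give an error rate of 0.8.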

If news volume is indeed a proxy for readership, we also would expect strong correlations between published volume and actual readership, something on the order of r = 0.70 or higher. That’s our second test.
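The second test can be sketched in a few lines of Python. This assumes the r in question is the standard Pearson correlation coefficient computed over paired daily values, which the text implies but does not spell out. The 0.70 threshold comes from the text; the function names are hypothetical.

```python
import math

# Threshold named in the text for a "strong" correlation.
STRONG_R = 0.70


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length, with at least two points")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")

    return covariance / math.sqrt(spread_x * spread_y)


def clears_bar(r, threshold=STRONG_R):
    """True when the correlation meets or exceeds the chosen threshold."""
    return r >= threshold
```

Calling `pearson_r(daily_reach, daily_readership)` on a year of paired daily values, then `clears_bar` on the result, reproduces the pass or fail reading used in this post.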

How does imputed reach compare to actual readership? Here are side-by-side charts visualizing imputed reach, on the left, and the number of readers who clicked into the news coverage, on the right. What patterns emerge in the visualized data? For one, the spikes (the big news days) are clearly different.

The chart on the left visualizes daily imputed reach. What it really reflects is not so much what stories people read, but the volume of news the press published. In this case, three of the ten largest spikes correspond to company earnings, and three to new product launches. That makes sense. With Big Tech companies, beat reporters cover financials and products. That’s the news media’s agenda.

On the chart on the right — I tend to use blue columns to represent people — the tallest spike on the far right corresponds to a quirky story about the company’s employees that went viral. The spike in the center of the chart corresponds to a critical software update. Virtually every day, this ubiquitous brand was mentioned in general news coverage that readers clicked into.

Did reach equal readership? In the Top Ten test, of the ten news days with the highest imputed reach, only two ranked in the top ten in terms of actual readership. Both big stories related to product developments. So here’s the math. Using “reach” to impute readership of the biggest news days (in our analysis, the Top Ten news days in terms of imputed reach) resulted in an 80% measurement error: eight of the ten days missed. In the second test, the correlation coefficient comparing daily imputed reach and actual readership was statistically weak (r = 0.41 if you are keeping score).

Can these findings be replicated? Here are side-by-side visualizations of imputed reach and actual readership for a Healthcare company that was involved in the COVID-19 medical response. To give you a sense of scale, the largest spike on the far left corresponds to imputed reach at more than 1.3 billion (that’s billion) on a single day. In terms of actual readership, that spike corresponded to the 14th most-read coverage day of the year. On the right, the story with the highest actual readership ranked 7th in terms of news volume. In this case, measurement error in our Top Ten test was 70%. The daily correlation coefficient came in at r = 0.65. Better, but still not statistically strong enough to be a reliable business metric.

Finally, here’s a scorecard for all six national brands. Again, for this proof of concept study we tracked seven brands: six major national companies across six sectors, along with a nationally ranked university. The blue spikes visualize actual readership against a gray background of imputed reach of daily news coverage.

As you can see from the chart below, the Top Ten error rate ranged from 20% to 90%. The time-series correlations between imputed reach and actual readership for the six companies ranged from r = 0.40 for the grocery chain to r = 0.68 for our consulting services firm. Against an r = 0.70 threshold, the consulting firm came closest, but none of the six companies cleared the bar.

We can confidently conclude that imputed reach tracked by traditional media monitoring and actual readership tracked by Memo click-throughs into news coverage are two distinct metrics.

Let’s drill down into that Top Ten error rate for a moment. Best-case scenario, for the cable network, of the ten stories that generated the most news volume, eight were also among the ten most widely read. On the other end of the spectrum, for the grocery chain, only one of the ten stories with the highest volume ranked as a Top Ten read. Put another way, volume data triggered false positive readings at least 20% of the time for the cable network, and as often as 90% of the time for the grocery chain.

Why does that matter? In later blogs, we will start assessing news in financial and economic terms. We will see that, for large companies, the news narrative is different than the consumer narrative. And in the most severe cases — what typically would be called crisis events — news narratives are always the catalysts that trigger business shocks. Because we can now measure news catalysts with a reliable, validated and replicated data point — daily news readership — we will be better able to identify and forecast the severity and duration of business shocks, both negative and positive.

But we are getting ahead of ourselves. Shifting back to our proof of concept, we also analyzed imputed reach and actual readership for a nationally ranked university. Similar results. The correlation was weak (r = 0.39) and measurement error on the Top Ten test was 40%. Based on these findings, there is no reason to think our baseline communication metrics — tracking published news coverage to identify the news narratives, and gauging actual readership to isolate consumer engagement in the news — won’t work for every company, cause and campaign covered by the press.

A three-dimensional framework

Bottom line, our proof of concept validated a working assumption that news that is published and news that people read are two separate and distinct variables in the media equation.

We need to know what was published about the company in the press. No question. But to begin to better understand news impacts, we need reliable, validated and replicated data gauging readership based on the number of people who actually clicked into news coverage online.

In short, to generate actionable insights, we need to measure both news that was published and news people actually read. Based on our findings, we can now begin to forge a three-dimensional baseline for monitoring and measuring the media.

The first dimension tracks published media that mentions a company, brand or product. Most companies have that capability in place now, and are capturing headlines and news coverage on a daily basis. In economic terms, that media tracking will begin to provide insights about the news narratives.

The second dimension gauges news that actually reached people. Memo’s readership data provides a statistically discrete measurement of click-throughs on news stories. People click into stories after reading the headlines. That’s real-world engagement with the news coverage. So instead of assuming readership, we now have a valid barometer of actual news reaching actual people.

The third dimension at this point is, quite simply, time. News readership needs to be tracked on a daily basis. Why? Because we ultimately will align news outputs to business outcomes, and time-series data is the coin of the realm. News, as we have seen, is an external variable impacting customer acquisition and attrition. And now we have reliable, replicated and validated readership data to factor into the equation.
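One way to picture the three dimensions together is a daily table keyed by date, with one column for published reach and one for actual readership. The Python sketch below is only an illustration under that assumption. The function name, the dictionary layout, and the sample figures are hypothetical and do not reflect any particular monitoring tool or the Memo platform's format.

```python
from datetime import date


def align_daily(published_reach, readership):
    """Join two {date: value} series on the dates they share, in chronological order."""
    shared_days = sorted(set(published_reach) & set(readership))
    return [(day, published_reach[day], readership[day]) for day in shared_days]


# Hypothetical figures, for illustration only.
reach_by_day = {
    date(2022, 9, 28): 120_000_000,
    date(2022, 9, 29): 300_000_000,
    date(2022, 9, 30): 80_000_000,
}
reads_by_day = {
    date(2022, 9, 29): 45_000,
    date(2022, 9, 30): 210_000,
}

for day, reach, reads in align_daily(reach_by_day, reads_by_day):
    print(day.isoformat(), reach, reads)
```

Keeping only the dates present in both series means every row has a published figure and a readership figure, which is the shape needed before lining news up against daily business outcomes.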

Admittedly, this blog has been a lot to process. But this is just the beginning. In order to elevate the tactical and strategic capabilities of business communication, we need to also know how news shapes perceptions and moves the needle on customer acquisition and attrition.

Anyone paying Attention? What’s all the Buzz about? That’s coming next.

If you work in business or corporate communication, communication research or risk at a major company, or are an academic who wants to dive deeper into this data, let’s connect.

There is good work to be done.

Jim.Pierpoint@HeadlineRisk.com

Written by Jim Pierpoint