You gotta read this!

Thoughts about reading and internet media use



Back in May Mike Hudack posted a rant about the state of the news media. The gist of it is: here we are in 2014, the Internet is at scale (the mobile internet is in the pockets of 30%+ of adults worldwide and social networks are at a proportionate scale) and yet the news media seems to be becoming more and more dumb. Put another way: the world of news creation and access has been blown open, and yet most news organizations have hollowed out their news capabilities and are posting trivial listicles about “28 young couples you should know”. The response was interesting, in part because Mike works at Facebook. Alexis Madrigal summed up much of the sentiment in a sentence in the comments: “Hey, Mike, … My perception is that Facebook is *the* major factor in almost every trend you identified.”

A month later, while our extended spring was rolling onward here in New York, the University of Reading in the UK put out a press release on Sunday, June 8th, saying that the Turing test had been passed for the first time ever. The media ran with the story or, more accurately, reproduced the press release. The headlines were excellent, easily shareable, easily clickable. I for one saw it fly by in my stream and thought “wow, milestone passed, I will share that”. The problem was the press release wasn't true, and neither were most of the stories that were published. Fast forward again: right at the end of June the AP announced that they are going to start algorithmically writing stories. Using earnings report data they are going to let machines “write” business stories.

Let’s take a step back and think a bit about what is going on here. We have a dominant social distribution system that favors shareability; case in point: the Hudack discussion. It is biased towards speed, and that bias is short-circuiting fact checking, as the Turing example shows. And in the case of Facebook it’s mediated by algorithms that aren’t transparent. Layer in the economics (the cost of creating this “news”), add in the AP announcement, and you get a good idea of where this is headed: algorithmically created news stories, mediated by algorithms, shared by people who are barely reading these posts. If we can all just get services like Socialflow to do our sharing, we humans can completely quit this loop.

Maybe this isn't the whole story?

Back to Hudack. Mike made the following comment in response to a question on the post: “Is Facebook helping or hurting? I don’t honestly know. You guys are right to point out that Facebook sends a lot of traffic to shitty listicles. But the relationship is tautological, isn’t it? People produce shitty listicles because they’re able to get people to click on them. People click on them so people produce shitty listicles.” On the surface this sounds reasonable, but too often I hear these tautological arguments in media: we do this because of that, and if that were different we would act differently. It’s worth testing this with a bit of data. Specifically, there are two questions I wanted to test using our datasets at betaworks: Are people reading less? And is the logical conclusion of social distribution that every bit of content should be nasty, brutish and short?

You are not going to read this …

Back in February the Verge ran a post titled “You’re not going to read this. But you’ll probably share it anyway”. The post is based on data from Chartbeat and Upworthy. They discuss and show the relationship between the time spent reading and the sharing of those same articles. The question they were asking was: are people reading articles they share? I am going to opt to show a chart from Upworthy annotated by a Verge commenter - all the interpretation is done in the annotations.

Chart showing the relationship between reading time and sharing. Annotated on the Verge post by “Jruhlman09”.

You get it. On the left-hand side, the hill of “OMG Dat”, we are in skimming land. Or barely. Chartbeat looked at “user behavior across 2 billion visits across the web over the course of a month and found that most people who click don’t read. In fact, a stunning 55% spent fewer than 15 seconds actively on a page.” The Turing meme was a good example of this: easy to share, no need to read in detail, makes you look smart (maybe), etc. Ok, so we are in a time of high velocity, real time sharing, almost akin to high velocity trading. And since publishers (news organizations) think that we aren’t reading, many of them focus on simple topline metrics like unique visitors, page views, shares and likes, and don’t track or use engagement metrics to manage their business. Another bad loop.

Take a look at another piece of analysis from Chartbeat. Last summer the team analyzed which words in headlines drove the most vs. the least reads per click, where a read is defined as the user completing the majority of the article. You can see how publishers creating OMG Dat headlines to feed that hill in the chart might be forgoing the right-hand hump.
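To make the reads-per-click metric concrete, here is a minimal sketch of how it could be computed. It is not Chartbeat’s code: the function name, the placeholder words and the sample visits are all hypothetical, and it assumes “the majority of the article” means more than half.

```python
from collections import defaultdict

# Assumption: "majority of the article" is read as more than 50% completed.
READ_THRESHOLD = 0.5


def reads_per_click(visits):
    """Compute reads per click for each headline word.

    visits: iterable of (headline_word, fraction_completed) pairs,
    one pair per click on an article whose headline contains that word.
    """
    clicks = defaultdict(int)
    reads = defaultdict(int)
    for word, fraction_completed in visits:
        clicks[word] += 1
        if fraction_completed > READ_THRESHOLD:
            reads[word] += 1
    return {word: reads[word] / clicks[word] for word in clicks}


# Made-up visits for illustration only, not Chartbeat data.
sample_visits = [
    ("word_a", 0.05), ("word_a", 0.10), ("word_a", 0.80), ("word_a", 0.02),
    ("word_b", 0.90), ("word_b", 0.70), ("word_b", 0.20),
]
rates = reads_per_click(sample_visits)
```

In this toy data both words attract clicks, but only one of them tends to get the article finished, which is the gap the headline analysis is pointing at.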

In summary, Chartbeat looked at 10,000 articles that received a lot of social sharing — and they found “there is no relationship whatsoever between the amount a piece of content is shared and the amount of attention an average reader will give that content.”

But what about the “Wow” hill on the right-hand side of the curve, beyond the valley of “Meh”? The hill of Wow is significant. Take a look at the x-axis scale: most of the uptick on the right end of the curve is above 100%, which means that not only did those people finish the article but they spent more time on it than the average length. They really read it. Ok, so people are reading; they just aren’t reading everything. Some posts they rapidly (rabidly?) share and others they read. So the next question worth asking is: what are they reading, and is the rate at which they are reading (ie: completing posts) increasing or decreasing?

The Wow section of the Curve — some contrary evidence:

We have lots of data about reading at betaworks. It’s something that fascinates me and we have the privilege to have developed some great products that relate to reading and analytics around reading. One is Instapaper. Instapaper is a mobile, read-later tool. You install the extension or bookmarklet in your browser and when you want to read something later you “Instapaper” it, or save it to read later. And when you open the Instapaper app on your device all your articles are there to read. Instapaper has a very loyal and active user base and the signal is very clean with regard to a user’s intent to actually read something, since by saving to Instapaper users aren’t expressing anything socially or publicly. It’s a simple, private intent to save and then read.

We ran our analysis over the past year and looked at millions of Instapaper reads, where “a read” is defined as someone completing more than 75% of the article. We then focussed on the domains that people were spending time on, ie: reading vs. skimming. We then correlated the data with changes in daily active users, to make sure that changes in reading weren't actually changes in the use of Instapaper (due to a new feature or product release). What we saw is interesting. Reads are increasing over time for all domains and for some domains they are increasing a lot.

Instapaper: Percentage increase in reads, defined as a user of Instapaper completing more than 75% of an article. Note the percent increase for Atlantic & Medium is skewed by the fact that they were significantly smaller at the start of the period vs. the NYTimes or Guardian.
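For readers who want to try this kind of analysis on their own data, here is a rough sketch. It is not the betaworks pipeline: the event format, the function names and the sample numbers are hypothetical, and dividing reads by active users is just one simple way to separate “people are reading more” from “more people are using the app”.

```python
from collections import defaultdict

# From the text: a read is more than 75% of the article completed.
READ_THRESHOLD = 0.75


def reads_per_active_user(events, active_users):
    """Count reads per domain per month, normalized by active users.

    events: iterable of (month, domain, fraction_completed) tuples.
    active_users: dict mapping month -> number of active users (assumed input).
    Returns {domain: {month: reads per active user}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for month, domain, fraction_completed in events:
        if fraction_completed > READ_THRESHOLD:
            counts[domain][month] += 1
    return {
        domain: {month: n / active_users[month] for month, n in by_month.items()}
        for domain, by_month in counts.items()
    }


def percent_increase(first, last):
    """Percentage change from the first value to the last."""
    return (last - first) / first * 100.0


# Made-up events for illustration only, not Instapaper data.
sample_events = [
    ("2014-01", "example.com", 0.90),
    ("2014-01", "example.com", 0.30),  # skimmed, does not count as a read
    ("2014-02", "example.com", 0.80),
    ("2014-02", "example.com", 0.95),
    ("2014-02", "example.com", 0.76),
]
sample_active_users = {"2014-01": 10, "2014-02": 12}

normalized = reads_per_active_user(sample_events, sample_active_users)
growth = percent_increase(
    normalized["example.com"]["2014-01"],
    normalized["example.com"]["2014-02"],
)
```

The caveat in the caption applies here too: a domain that starts from a small base will show a large percentage increase from only a few extra reads.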

The most significant increases in reads are for the Atlantic, Medium, the New York Times, the Guardian and Slate. And power users of Instapaper are reading more and more each month: on average the read rate has increased for them by ~26% every month. For regular users it has increased as well, albeit at a slower rate (19.2% over the past 7 months). You could argue that this is because either publishers or the Instapaper product have improved, but there isn’t a clear correlation between major product updates and these increases in reads. This suggests that the Wow hill of the curve is growing, ie: some people are reading more, not less.


Nasty, brutish and short?

What about sharing and the real time nature of sharing? There is an obvious real time component to sharing: we share things and expect the results to be immediate. The gestures we use (retweets, likes, mentions, favorites) all generate little doses of dopamine-driven affirmation. But countering this trend we are seeing long form media of a kind we couldn't have imagined ten years ago, in print, audio and video, and we are seeing slower sharing cycles than I would have expected.

Let’s step up to the media landscape in general and start with TV. While it’s tempting to view the internet in isolation, the attention shifts we see online are evident in other media. Over the past five years we have seen an incredible resurgence of storytelling on television: Game of Thrones, House of Cards, Homeland, Mad Men, Breaking Bad, Downton Abbey … it’s a long list. And it’s amazingly good, long form media, with character development of the kind that you can’t do even in film. Many predicted that with the rise of internet distribution and collapsing windows of distribution, we would see shorter, cheaper programming, not higher-quality, long-form media. Instead, we are seeing both. Technically we can’t wire up Chartbeat to the cable / multichannel TV platform, but if we could I suspect we would see the same hill, valley, hill mapping. The expanded availability of scaled PPV, combined with Internet and subscription models, has unshackled long-form storytelling on television from the chains of a business model that depended on syndication and restricted the availability of archived content in order to sell network advertising in the future.

…“Last Week Tonight” defies nearly all current norms. The show surrounds soundbites with exposition, rather than letting video stand as the sole element of a segment. It trusts the attention span of its audience, believing a viewership constantly distracted by smartphones and mobile alerts will hang in there for the duration of a story, so long as it is compelling and informative. And it believes people will keep watching even if they might walk away feeling uneasy or unsettled by the issues presented each week despite the many jokes and laughs that are also delivered. (Variety, last week on How John Oliver and HBO Shattered TV’s Comedy-News Format)

And while some people argue that longer form is device dependent, the data suggests that while devices matter they aren’t decisive. Take a look at John Oliver’s page of YouTube videos: most of them are 10-20 mins long and the view counts have steadily marched into the millions. This viewing is happening on phones, desktops, laptops and tablets. Turning back to the internet, but sticking with video for a bit, we see a lot of this longer form viewing happening on Facebook.

Mary Meeker in her 2014 report cited “the social referral half-life of a tweet is 6.5 hours, a Facebook post: 9 hours”. For videos, in particular videos shared on Facebook, we see very different cycles. Videos seem to slowly make their way across the Facebook audience, often taking 2.5 to 3 days to move through the network. We have seen a lot of examples of this. Below is a detailed breakdown of one such example.

Digg posted a video in late May titled “This Dance Routine Is Unreal”. The video got a couple of million views — pasted below is the Chartbeat traffic snapshot of the video landing page on Digg. The x-axis here is time, the y-axis is concurrent users. The video was posted on a Friday night — from Friday to Sunday traffic was uneventful. Then, three days after it was posted this video started to pick up momentum via social sharing, pretty much all inside of Facebook — the traffic lasted a whole week.

Example of slow sharing of videos in the Facebook newsfeed. Social sharing is indicated by purple in the chart, yellow is direct referral traffic.
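One way to put a number on the difference between a fast cycle and a slow one is a referral half-life. The sketch below assumes a simple definition (the first period by which at least half of all referral clicks have arrived), which may not match how the figures in Meeker’s report were calculated. The function name and the click counts are made up for illustration.

```python
def referral_half_life(clicks_per_period):
    """Return the 1-based period by which at least half of all clicks arrived.

    clicks_per_period: list of click counts, one per equal-length period
    (hours, days, ...) starting from when the item was posted.
    Returns None if there were no clicks at all.
    """
    total = sum(clicks_per_period)
    if total == 0:
        return None
    running = 0
    for period, clicks in enumerate(clicks_per_period, start=1):
        running += clicks
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return period
    return None


# Made-up curves for illustration only, not Digg or Chartbeat data.
fast_post = [40, 30, 15, 10, 3, 2]               # traffic front-loaded
slow_video = [1, 1, 2, 2, 4, 10, 30, 30, 15, 5]  # traffic builds late

fast_half_life = referral_half_life(fast_post)
slow_half_life = referral_half_life(slow_video)
```

With these toy curves the front-loaded post reaches its halfway point in the second period, while the slow-building video does not get there until the seventh.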

Headlines aside, the data say that some people are reading and reading more — and similarly, some people are watching and watching longer form media. And that longer form media is moving slowly across the social networks of distribution. Why? And what should we do about it?

It seems like we have two opposing trends going on simultaneously. On one end of the curve rabid sharing is driving an attention cycle of seconds but on the other end people are reading and watching more. The social web has flattened web sites and made the home page irrelevant to many sites — simultaneously the shift to the phone/tablet and the mobile app internet is unbundling the web that we knew. The combination of these two trends is changing media and how we use and experience it.

It’s a complex world we are creating. We saw this in the homescreen work we did earlier this year and we see it in the Chartbeat, Upworthy and Instapaper data above. And beneath this somewhat toxic mix of speed sharing and skimming there is an undercurrent of longer form media use. At one point in the comments on his post, Mike Hudack says that he is really focussed on the “general degradation of real reporting.” His concern is valid and it’s exacerbated by a singular narrative that the world is moving in one direction.

It’s not. People and organizations that are focussing on the right-hand side of the curve are finding engagement at a level that couldn't have been imagined a few years ago. From Medium to Upworthy, from the NYT to John Oliver, the Atlantic to Buzzfeed’s long form articles, there is attention and reading and watching going on in ways that most people could not have expected. As Suman puts it “the landscape of media content diffusion (sharing) is a hill-valley-hill of attention, and you’d probably do better sitting on the right hand hill. People sitting on the left hill appear to be more visible, but there are people on the right hill too. And the latter is growing.” And people who are focussed on building around this second hill are going to end up with stronger businesses.

In closing, a note to Mike: I will bet you a dollar that there are “29 young couples you should know”, not 28. If you worked on the news feed you would know this.

With thanks to Suman, Brian and the Instapaper and Chartbeat teams.