Authorship, value and the media’s lost profits

No one writes anything on the internet.

HTML does not have a tag for bylines. The web’s fundamental technology has tags for six different headers, figure captions, navigation and asides. It does not have a tag for authorship. HTML5, designed for “allowing you to describe more precisely what your content is,” cannot describe who wrote this article.

Authorship is a major indicator of quality and transparency. Unless it is built into the structure of pages, good writing may disappear from the web, because we don’t know how to value the people who write.

HTML5 came out in 2014. After 23 years of HTML you can use it inside your writing to show when a piece of content cites another or you’ve abbreviated something; but, as far as the base language of the web is concerned, you can’t claim you’ve written anything.

This isn’t unusual.

A brief history of a world without authors

The common method for designating an author of an article shared on Facebook, Twitter, Google and LinkedIn is undocumented on those platforms. The `name=author` meta tag, which most social sharing sites check to determine authorship, isn’t used on many news sites or content publishing systems, including this one. Though the meta ‘author’ tag was used as an example in the HTML4 specification, it doesn’t appear to have been formalized until some time in 2013. Even now, the format is unclear and, after a failed 2011 attempt by Google, unique identification of authors is impossible. A proposed `rel=author` property remains. It is mostly unused, barely documented and manages to confuse existing recommendations.
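To make the mechanism concrete, here is a minimal Python sketch of how a sharing bot might look for the `name=author` meta tag. The sample page, the class `MetaAuthorParser` and the helper `find_meta_authors` are all hypothetical, invented for this illustration. Since the platforms don’t document their parsers, this is a guess at the general approach, not a description of what any of them actually run.

```python
from html.parser import HTMLParser


class MetaAuthorParser(HTMLParser):
    """Collects the content of every <meta name="author"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.authors = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumption: the name is matched case-insensitively, as a precaution.
        if (attributes.get("name") or "").lower() == "author":
            content = (attributes.get("content") or "").strip()
            if content:
                self.authors.append(content)


def find_meta_authors(html):
    """Return the list of author strings declared in meta tags (may be empty)."""
    parser = MetaAuthorParser()
    parser.feed(html)
    parser.close()
    return parser.authors


# Hypothetical page used only for this example.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Example story</title>
    <meta name="author" content="Jane Example">
  </head>
  <body><p>By Jane Example</p></body>
</html>
"""

print(find_meta_authors(SAMPLE_PAGE))
```

A page that carries a visible byline but no such tag returns an empty list. The name is there for human readers and invisible to the machine.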

So badly understood is the meta author tag that even Reuters didn’t use it properly until sometime after early 2015. Many sites still don’t use it properly. Plenty don’t have an author meta tag at all. If you’re a journalist writing on Medium, or any number of other sites, your work is, on a technical level, uncredited.

Even with the meta author tag, there is no commonly accepted structural way to designate multiple authors, or to show different authorship on different articles on the same page (say in the very common case of a list of news articles).

We’ve gotten close to in-document structural authorship with the ‘hcard’ microformat. The workaround is used on fewer and fewer sites. It is parsed even more rarely by the bots that read, understand and assign value to web pages. Until recently the most common authorship management plugin used by WordPress-based news sites didn’t even fully support the hcard.
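For a sense of what in-document authorship makes possible, here is a rough Python sketch that reads hcard names out of a list of articles, one author per story. It relies on the microformat’s `vcard` root class and `fn` (formatted name) property. The sample markup and the names `HCardNameParser` and `find_hcard_names` are hypothetical, and the sketch assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HCardNameParser(HTMLParser):
    """Collects the text of each class="fn" element nested in a class="vcard"."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._stack = []   # one (in_vcard, in_fn) pair per open element
        self._buffer = []  # text fragments of the name being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        parent_vcard, parent_fn = self._stack[-1] if self._stack else (False, False)
        in_vcard = parent_vcard or "vcard" in classes
        in_fn = parent_fn or (in_vcard and "fn" in classes)
        if in_fn and not parent_fn:
            self._buffer = []  # a new name starts here
        self._stack.append((in_vcard, in_fn))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        _, was_in_fn = self._stack.pop()
        still_in_fn = self._stack[-1][1] if self._stack else False
        if was_in_fn and not still_in_fn:
            # The fn element just closed: normalize whitespace and store it.
            name = " ".join("".join(self._buffer).split())
            if name:
                self.names.append(name)

    def handle_data(self, data):
        if self._stack and self._stack[-1][1]:
            self._buffer.append(data)


def find_hcard_names(html):
    parser = HCardNameParser()
    parser.feed(html)
    parser.close()
    return parser.names


# Hypothetical front-page listing with a different author on each story.
SAMPLE_LISTING = """
<ul>
  <li><a href="/a">Council votes</a>
      <span class="vcard">by <span class="fn">Jane Example</span></span></li>
  <li><a href="/b">Storm warning</a>
      <span class="vcard">by <a class="fn url" href="/staff/john">John Sample</a></span></li>
</ul>
"""

print(find_hcard_names(SAMPLE_LISTING))
```

Because the credit sits next to each story rather than in the page header, a list of articles can carry a different author for every item, which a single meta tag cannot do.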

Different worlds

This isn’t meant as a criticism of engineers involved in any of these projects. The problem is a cultural mismatch. People who write code use mechanisms for tracking and management that bake in ownership of work. Tools like Git make it obvious who wrote what and when in a block of code. Engineers have few concerns about structured tools for designating authorship because in their realm it is crystal clear.

For the creators of digital writing the assignment of credit is far less obvious.

Missed value and lost profit

The media industry has been very willing to hand off responsibility for making money and for building the platforms that host content. These jobs go to external companies or to internal, but separate, divisions. By leaving newsrooms out of decisions about building tools to monetize journalism, we’re letting people who don’t understand the value of the work set a technical and financial agenda. We’re missing out on uncounted opportunities for a better web and a more profitable and independent media industry. This is a big one.

It is hard to understand the value of a piece of writing, or how much its author should earn, without a consistent technical method for attaching the author to the work. The few methods that exist are used inconsistently and irregularly.

Without a formal method to understand contributions of creators, it is inevitable that tools used to read content are valued higher than the organizations who create that content; even though a news reader app couldn't exist without many news writers.

There’s a reason why the first thing you see on a Facebook post is the user’s face.

You don’t have to take my word that the designation of authorship has value. Attempts to find and show value in bylines may be hobbled, but they exist. Google and Facebook have both attempted to own unique authorship IDs, a clear sign that these things have value. In my own experiments, the basic byline on a social media share improves click-throughs. It does even better when that name is accompanied by a face.

When dealing with web design, I’ve found that moving the byline from the bottom to the top increases engagement, with more users going to the author page. The same is true of adding a photo to the byline and putting both photo and byline near an article’s title on the front page. These are badly needed signals to readers of authenticity and transparency.

It isn’t just about making your journalism feel more trustworthy. Units for internally converting users to more content seem to have greater success when making authors a significant factor in their suggestions.

Though writers’ credit is a factor more often valued in the newsroom than the sales floor, this may be more an aspect of our failure to understand authorship’s wider potential in the market. Author information is sometimes fed into ad units on news sites but it is rarely part of the sales pitch; nor is it a break-out metric for much more than pageviews.

With better technology, good writers could bring more value. Journalists with particular followings might find the data of authorship could create a better ad experience, with more respectful, well targeted, advertisements for their audience. We know journalists have value to news organizations, but rarely do we have the tools to convert the things that make exceptional reporters into the type of returns that keep great sites running and great journalists employed.

The runaway success of Patreon, which was distributing $1 million a month to creators after 18 months in operation, shows that readers value individual writers. Crowdfunding focused on particular creators shows value in attaching identity to a project. That value remains mostly untapped in the media industry.

Elsewhere in the news industry, authorship information is often baked into video and photo files’ metadata, making those formats easier to track, receive credit for and profit from.

Without better technology, it is all too easy for bad writers, click-bait and author-less content-scrapers to crowd out the market and bite into profits. It becomes even harder to value writers’ work and pay them without better tools to understand their success.

The barely used specifications of Schema.org have one remedy. Schema.org, a rather complex metadata format sponsored by all the major search engines and implemented by very few news sites (though The New York Times features prominently among those few), has a method for designating authors in the body of the page with a mix of properties on existing and added HTML tags. What happens after that? Not much. Though search engines may index this data, mechanisms to return worth for its use remain lacking. Without them Schema.org seems likely to go into decline, as hcard has.
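To show what that mix of properties looks like in practice, here is a small Python sketch that renders an article with Schema.org microdata for one or more authors. The type and property names (`NewsArticle`, `Person`, `author`, `name`, `headline`, `articleBody`) come from the Schema.org vocabulary. The functions `render_bylines` and `render_article`, the `byline` class and the sample values are hypothetical, and this is only one plausible way to lay out the markup.

```python
from html import escape


def render_bylines(names):
    """Return microdata markup crediting each name as a Schema.org Person."""
    parts = []
    for name in names:
        parts.append(
            '<span itemprop="author" itemscope '
            'itemtype="https://schema.org/Person">'
            f'<span itemprop="name">{escape(name)}</span>'
            '</span>'
        )
    return ", ".join(parts)


def render_article(headline, names, body_html):
    """Wrap a headline, bylines and body in a NewsArticle item.

    Assumption: body_html is already trusted, sanitized HTML.
    """
    return (
        '<article itemscope itemtype="https://schema.org/NewsArticle">\n'
        f'  <h1 itemprop="headline">{escape(headline)}</h1>\n'
        f'  <p class="byline">By {render_bylines(names)}</p>\n'
        f'  <div itemprop="articleBody">{body_html}</div>\n'
        '</article>'
    )


# Hypothetical story with two credited writers.
markup = render_article(
    "Council votes on budget",
    ["Jane Example", "John & Sample"],
    "<p>The council met on Tuesday.</p>",
)
print(markup)
```

Each writer gets a separate `author` property, so shared bylines are expressed in the markup itself rather than squeezed into a single string.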

Even better would be a structural element in HTML documents. Everything on the internet is made by someone. It would be useful to have an element, among all those headers, that would let us indicate who wrote articles, perhaps even all the people who contribute to a work on the web. A structural tag could travel with that content, no matter how it gets distributed or shared.

If the digital journalism industry is to survive, we’ll need to start building better tools. To do so we will all have to work together: engineers, ad ops, sales and writers. Without perspectives from all parties, we have already missed a lot of potential. Let’s not repeat that mistake.


This article is written in anticipation of working with developers and designers in the news industry at SRCCON. I’ll be facilitating a session, along with the amazing Jarrod Dicker, Head of Ad Product and Technology at The Washington Post, to consider ways the newsroom can lead the way in building tools that serve editorial needs alongside sales so that we can all stay in business.

If you’re going to SRCCON, I hope to see you in the session!