The Core Content Audit

Pinpoint problem content by assigning a score.

A content audit produces an inventory of your website’s media, articles, ad copy, posts, pages, status updates — the whole shebang. It is your bird’s-eye view of the message, its voice and tone, an objective representation of the quality of your content and the architecture of your site.

For sites of any scale, and particularly for those with handfuls of content creators, subject specialists, and instructors who sometimes represent totally disparate departments serving niche user needs, having that snapshot of your sprawl can make inroads for everyone’s sanity. Organizations with any level of churn need to be able to identify and assign orphaned content so that its disrepair doesn’t misinform users or otherwise endanger the organization’s credibility. More importantly, content-making without a disciplined editorial process can build up redundant content, like plaque on the gums. The more duplicates there are, the harder it is to keep them all current, and the faster the integrity of your content overall breaks down.

Just the process of building the inventory can be the worst ever, but what most often follows is a frank sense that all that hard work went down the memory hole. “Whew, we’re done. Uh, what’s next?” Having the inventory implies nothing about what to do with it, and it is tempting to look ahead to the so-what anticlimax of an audit and decide that the undertaking isn’t really worth the time.

My thinking was that the completed audit, then, should be a tool, not just a reference. It needs to offer some sort of answer to the inevitable question: now what?

Pinpoint problem content by assigning a score

Because no one wants to be the sucker on lone audit duty, it dilutes the ugggh if we can use a system that doesn’t rely on the work of trained content strategists but can be crowdsourced using “objective-ish” heuristics judging the quality of a piece of content.

Ideally, content that needs attention bubbles to the surface.

Imagine a piece of content, say an event, compared against a rubric that measures whether it is good. It gets a grade: a solid B, 80%. Another event ranks lower. This comparison, however simple, implies forward movement: time is better spent improving the content of the second event because the first is good enough.

Can we determine objectively good content?

My high-school English teacher once told me that only when you know the rules of writing can you break them. The value of writing is largely subjective (it says more about the state of the reader than the quality of the writer), but there are nevertheless base axioms you adhere to when you learn: i-before-e-except-after-c, the five-paragraph essay, thesis in the first sentence, and so on.

In the same way, there are objective-ish markers for quality web content supported by user experience research. I stress the research part, because determining and enforcing a score-based content audit can hurt some feelings. What is important about emphasizing the UX when you begin this process is to communicate that scoring isn’t an attack on a person’s ability to write, but a means for optimizing content for user behavior so that the content meets a business goal.

I keep using “objective-ish” because, well, of course this is subjective. The objective-ness in the end doesn’t really matter except as a gut-check that improves the value of a content audit as an internal tool. We use it to help us make decisions about what content needs work.

Here are the qualities I think we can score: the content is satisficing, answers a question or serves a demonstrable need, is mobile first, and is concise, accurate, clear, and appropriate in voice and tone.


“Satisficing” is really a thing and by grace of English a word: a mashup of “satisfy” and “suffice” that blah blah blah means most folks will skim content for a good-enough answer rather than read and digest the whole thing. On a medium where the average page visit is less than a minute, this kind of hit-it-and-quit-it behavior makes sense. In most cases, the speed of the customer journey is as important as its quality, if not more so: speed is the quality ( #perfmatters ). Our users are primed to get what they need and move on.

Unless the metric that matters for our content is time engaged, such as for a video, it behooves creators to craft content that is optimized for this behavior. In most cases, we are judging scannability. The heuristic for a satisficing user experience looks for markers such as the inverted pyramid, simple descriptive headlines, and regular use of relevant links in the text, which people use as indicators of particularly relevant areas.

What Question Does it Answer?

An exercise I think is particularly useful is to try to determine what question the content answers. Where that is difficult to determine, there is a problem with the content. Your page about business hours answers the question “when are you open?” The most satisficing answer, of course, is “right now,” right at the top.


The questioning of relevancy can be a little tender and awkward, but it can be insightful to determine whether there is demonstrable need for that piece of content, particularly in terms of business goals.

Having to produce clickbait to get eyeballs on ads is totally legit. Relevance is probably unique to the organization and its success metrics, but not all content is created equal: some content has larger audiences or greater demonstrable need, and it’s useful to identify these, if only to know what’s not working.

Mobile first

Mobile-first content is tailored for an increasingly mobile audience. Its value is manifold: it future-proofs the content and shapes how that content is delivered, the point of mobile-firstness being that content is available at the point of need.

Brief, true, clear, appropriate

Or “everything else.” The content should be as brief as necessary. Where facts are presented, they need to be accurate. The message should be clear. The voice and tone need to be appropriate.

A simple rubric

The idea is that by grading these qualities on a scale from 0 to 5 (low to high), averaging the grades, and then dividing by the top of our chosen scale (5), we assign a score to that piece of content. In Excel, the formula is something like =AVERAGE(E2:K2)/5, which calculates a percentage. An “About Us” page with a score of 75% needs less attention than a “Contact” page scoring in the low 40s.
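
For anyone who would rather script it than use a spreadsheet, here is a minimal Python sketch of the same arithmetic. The function name, the criterion keys, and the sample ratings are all made up for illustration; only the average-then-divide-by-5 step comes from the formula above.

```python
from statistics import mean

SCALE_MAX = 5  # top of the 0-5 rubric scale

def content_score(ratings):
    """Average the rubric ratings and divide by the scale, giving 0.0-1.0."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for criterion, value in ratings.items():
        if not 0 <= value <= SCALE_MAX:
            raise ValueError(f"{criterion}: {value} is outside 0-{SCALE_MAX}")
    return mean(ratings.values()) / SCALE_MAX

# Hypothetical ratings for one event, one value per rubric column.
event = {
    "audience": 5, "accurate": 4, "voice_tone": 4, "clear": 4,
    "concise": 3, "satisficing": 4, "mobile_first": 4,
}
print(f"{content_score(event):.0%}")  # prints 80%
```

Rejecting out-of-range ratings is a design choice of this sketch, meant to catch data-entry slips when several people are scoring.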

That’s all there is to it. Two pieces of content compared against one another to give stakeholders an idea where there are opportunities for improvement.


For my needs, I score content based on my gut feeling, using the simple markers I mentioned above. This of course can be as informal or strict as needed. And even though I set up the inventory, I don’t actually want to be the one performing the audit (yuck, amirite?), so I drummed up the following scale so that multiple scorers can be in the same ballpark.

This scale is for a large academic library website, so the language reflects that. Note that I use “audience” rather than “relevance” to illustrate or communicate demonstrable need without creating insult.

Audience: who is the content meant for, and for which audiences is there demonstrable need?

Accurate: is the content on this page up to date and accurate?

Voice / Tone: is the voice / tone appropriate for this kind of content?

Clear: there’s nothing particularly confusing about this content, right?

Concise: is the content as brief as possible while being comprehensive?

Satisficing: can the user find or do what they want without a lot of effort? Is it skimmable? Or, for example, can they get the information they want within the average time on page?

Here are some markers for “satisficing” content: descriptive page and section titles; most important content first, followed by more in-depth supporting information; the path to completion is clear.

Mobile First: are the page layout and content mobile first? That is, on a small screen, are things legible without eyestrain, operable by touch, and aligned in the correct order?

The qualitative “core” in “core content”

Ida Aalen captured my imagination with “The Core Model: Designing Inside Out for Better Results,” published January 2015 in A List Apart. I jumped on her instructions for performing a content modelling workshop which forces you to align user goals with business goals by identifying the essential parts of a piece of content, where it exists in a user flow (inward and forward paths), and how it should be crafted for small screens.

A Core Model Handout

It’s useful to represent these and similar attributes in the core content audit to add context to the rubric.

  • What question does it answer?
  • Inward paths — how users discover or navigate to the content
  • Forward paths — what opportunities are we explicitly presenting for forward action (e.g., sign-up for a mailing list, navigating to extra content, etc.)
  • Content type — page, event, listicle, video, ….
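
If the audit lives somewhere other than a spreadsheet, these attributes could be kept as a small record next to the rubric scores. The sketch below is only one way to model it: the class name, the field names, the `url` identifier, and the `gaps` helper are assumptions for illustration, not part of the Core Model itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CoreContext:
    """Core-model context kept alongside the rubric scores for one item."""
    url: str                      # assumed identifier for the audit row
    content_type: str             # page, event, listicle, video, ...
    question: str = ""            # what question does it answer?
    inward_paths: List[str] = field(default_factory=list)   # how users arrive
    forward_paths: List[str] = field(default_factory=list)  # next actions offered

    def gaps(self) -> List[str]:
        """Name the context fields that are still empty."""
        missing = []
        if not self.question.strip():
            missing.append("question")
        if not self.inward_paths:
            missing.append("inward_paths")
        if not self.forward_paths:
            missing.append("forward_paths")
        return missing

hours = CoreContext(
    url="https://example.org/hours",
    content_type="page",
    question="When are you open?",
    inward_paths=["search", "homepage link"],
)
print(hours.gaps())  # prints ['forward_paths']
```

An empty `question` is worth flagging in particular, since content whose question is hard to pin down probably has a problem.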

Integrating a quantitative component, like Google Analytics

The rubric presents the illusion of objectivity, but you can keep the conversations that result from the audit constructive by bolstering it with select page-level data from an analytics tool that you think helps support an argument. I chose the number of sessions per month, average time on page in seconds, and bounce rate.

This isn’t intended to be a traffic report but additional context describing the content in question. Sessions specifically add weight to the score given to the “audience” rating, whereas time on page can indicate either success or failure depending on the type of content: a 54-second average may be just the time required to find a number or spin up directions, but on a video it may suggest that most users don’t make it to the pitch or watch the last 80% of the video. Bounce rate, too, depends on the content itself: a high bounce rate on a page people visit for hours of operation is dandy, unless this page needs to lead to a product or contact form.

I’ve been thinking about adding Speed Index and, at a certain index (say, 3000), starting to pare off percentage points from the overall score.
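
A Speed Index penalty like that might look something like the Python sketch below. Only the 3000 threshold comes from the text; the rate of one percentage point per full 500 over the threshold is purely an assumption picked to make the example concrete, and the function name is hypothetical.

```python
def apply_speed_penalty(score, speed_index, threshold=3000,
                        step=500, points_per_step=1):
    """Subtract percentage points from a 0.0-1.0 score for a slow Speed Index.

    threshold: Speed Index at which the penalty starts (3000 in the text).
    step, points_per_step: ASSUMED rate, one point per full 500 over threshold.
    """
    if speed_index <= threshold:
        return score
    steps_over = (speed_index - threshold) // step
    penalty = steps_over * points_per_step / 100
    return max(0.0, score - penalty)  # never drop below zero

print(f"{apply_speed_penalty(0.80, 2400):.0%}")  # under the threshold: prints 80%
print(f"{apply_speed_penalty(0.80, 4200):.0%}")  # two full steps over: prints 78%
```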


Integrating numbers from Google Analytics during a page audit adds a lot of time to that audit, though. For me, the important part is that there is an inventory, and with lots of people working on this project in just their spare time I’ve opted to make these qualitative and quantitative contextual fields optional. I can’t — or don’t know how to — make these automatic.

What I can do, however, is use services like Zapier or IFTTT to automatically create an entry for a new content item in the Google Spreadsheet once it is published. This relies on our content management system broadcasting a feed or other API on-publish, but this is common enough that I choose to delude myself by thinking it’s a sure thing.

My feeling is that the point where the rate of content creation overpowers our ability to maintain an inventory is the point where using a CMS becomes necessary, but that’s not always the case for smaller or less frequently updated web presences.

Nevertheless, I work with a central WordPress Network as the hub of a COPE — “create once, publish everywhere” — system, which utilizes feeds and APIs to syndicate content across platforms. These feeds can be hooks to automatically populate rows into the audit.
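
To make the feed-as-hook idea concrete, here is a rough Python sketch that reads an RSS 2.0 feed and builds unscored audit rows for any items not already in the inventory. It is not how Zapier or IFTTT work internally, and it stops short of writing to a spreadsheet. The sample feed, the column names, and the `rows_from_feed` function are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed names for the seven rubric columns.
RUBRIC_COLUMNS = ["audience", "accurate", "voice_tone", "clear",
                  "concise", "satisficing", "mobile_first"]

def rows_from_feed(feed_xml, known_urls):
    """Return one unscored audit row per feed item not yet in the inventory."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link or link in known_urls:
            continue  # skip malformed items and content already inventoried
        row = {
            "url": link,
            "title": (item.findtext("title") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        }
        row.update({column: None for column in RUBRIC_COLUMNS})  # scored later
        rows.append(row)
    return rows

# Made-up feed standing in for what a CMS might publish.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Library</title>
  <item><title>Hours</title><link>https://example.org/hours</link>
    <pubDate>Fri, 04 Dec 2015 09:00:00 GMT</pubDate></item>
  <item><title>New exhibit</title><link>https://example.org/new-exhibit</link>
    <pubDate>Sat, 05 Dec 2015 09:00:00 GMT</pubDate></item>
</channel></rss>"""

new_rows = rows_from_feed(SAMPLE_FEED, known_urls={"https://example.org/hours"})
print([row["title"] for row in new_rows])  # prints ['New exhibit']
```

Checking against the URLs already in the inventory keeps a repeated poll of the same feed from adding duplicate rows.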

This is in progress

I am so far pretty happy with how the core content audit works, which I now use for several sites. However, just in the course of this writing I’ve tweaked things as they occur and something like this is most definitely a work in progress. I am anxious to hear your thoughts and ideas.

In the meantime, you can fork this Google Spreadsheet.

For my needs, this has been super successful. I can basically get away with using otherwise untrained student workers to perform the nitty-gritty data entry and trust that, by their following my rubric, we are getting a solid visualization of the quality of our content. Problem items bubble to the top. Every so often, it’s a false alarm because of the nature of the content, but it nevertheless inspires someone (me, the content author, or another onlooker) to at least fact-check content that might have otherwise been published and forgotten.

New posts, regardless of their content type, are automatically inserted without any scoring when they are published. While this automation only extends to a fraction of the content we produce, it at least ensures that the audit keeps growing and stays useful as a reference.
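
Outside a spreadsheet, the “bubbling” is just a sort. The Python sketch below orders rows so the lowest scores come first. Putting not-yet-scored rows at the end is an assumption of this example, since they need a scorer more than an editor. The row layout and the `triage` name are hypothetical.

```python
def triage(rows):
    """Order audit rows so the lowest-scoring content comes first."""
    scored = [row for row in rows if row["score"] is not None]
    unscored = [row for row in rows if row["score"] is None]
    return sorted(scored, key=lambda row: row["score"]) + unscored

audit = [
    {"title": "About Us", "score": 0.75},
    {"title": "New exhibit", "score": None},  # auto-inserted, not yet scored
    {"title": "Contact", "score": 0.42},
]
print([row["title"] for row in triage(audit)])
# prints ['Contact', 'About Us', 'New exhibit']
```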

Hopefully, you can use it too.

Hi. I wrote this originally at on December 4, 2015. There, I write and podcast about design and user experience for libraries and the higher-ed web. If you liked the core content audit, you might like the other stuff.