King Features Weekly
Local and Thriving
Published in
10 min read · Apr 23, 2015

During a visit to The New York Times in 1959, Marilyn Monroe inspected news clippings in the morgue, accompanied by Lester Markel, center, the Sunday editor. (The New York Times)

The Newspaper Archivist’s Good News

Newspapers write the “first draft of history” — let’s be sure that moment is preserved

By David Cohea

Newspaper archival has, in the past, mostly been bundles stacked to the ceiling, bound editions on a shelf at the local library and reels of murky microfilm.

The old newspaper morgue has developed an inaccessible smell.

That’s all changing fast. As I explored last week in “The Archivist’s Blues,” we are currently at a moment of immense transformation, a veritable Big Bang of digital content that represents the birth of the so-called information universe.

But getting from dusty archives to that digital future seems like a daunting, expensive and time-consuming challenge, because, well, it is. The good news is that professionals throughout the industry are looking intensely at the challenge and finding new and better solutions.

* * *

The change we’re in the midst of is so big that it’s easy to feel that attempting anything is folly.

An analogy is found in news about global warming, an event so colossal that individual efforts seem hopelessly moot. In a recent New Yorker piece, Jonathan Franzen says that analysis paralysis over global warming is preventing action in the present. His account of the problem echoes the archivist’s blues:

In Annie Hall, when the young Alvy Singer stopped doing his homework, his mother took him to a psychiatrist. It turned out that Alvy had read that the universe is expanding, which would surely lead to its breaking apart some day, and to him this was an argument for not doing his homework: “What’s the point?” Under the shadow of vast global problems and vast global remedies, smaller-scale actions on behalf of nature can seem similarly meaningless. But Alvy’s mother was having none of it. “You’re here in Brooklyn!” she said. “Brooklyn is not expanding!” It all depends on what we mean by meaning.

The good news for newspapers is that archival, just like global warming, can be tackled much more successfully in increments.

But the time to act is now. Without steps to build a foundation for digital newspaper preservation, much of the present (and, perhaps more terrifying, community news going back decades, even centuries) can be lost.

“In any call to action we have to ACT, do something, even if it is wrong and has to be re-done later,” says Brad Buchanan, CEO of NewzGroup, a Missouri-based newspaper archival and digital clipping service. “In the face of this bewildering array of technologies and the explosion of information to be captured, we still have to draw a line in the sand somewhere and say, ‘OK, we are going to start here,’ and then continue adapting to changed conditions. Step one is to avoid analysis paralysis.”

So let’s see if we can get through the basic protocols of newspaper archival without Brooklyn expanding on us.

* * *

There are really two main issues in newspaper archival: First, how to convert analog (print, microfilm) to digital? And second, what to do with born-digital content, the stuff that appears on the newspaper website and never sees print?

For print-to-digital conversion, companies like Archive in a Box and Newsbank can scan and digitize print and microfilm archives. But the process isn’t cheap. Digitizing newspapers from print or microfilm currently costs about $0.60 per page. If you have a hundred years of 8-page weekly papers to digitize, that is roughly 41,600 pages, and the price tag could easily reach $25,000. Some government funding is available to help defray the cost, but applying is time-consuming and the money is scarce.
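The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and 52 issues a year is an assumption, since the example only says the paper is a weekly.

```python
def digitization_estimate(years, issues_per_year, pages_per_issue, cents_per_page):
    """Return (total pages, total cost in dollars) for a scanning project."""
    pages = years * issues_per_year * pages_per_issue
    total_cents = pages * cents_per_page  # work in cents to avoid rounding noise
    return pages, total_cents / 100


# A century of 8-page weeklies at 60 cents a page.
# Assumption: 52 issues per year, with no special editions or missing issues.
pages, dollars = digitization_estimate(
    years=100, issues_per_year=52, pages_per_issue=8, cents_per_page=60
)
print(f"{pages:,} pages, about ${dollars:,.0f}")  # 41,600 pages, about $24,960
```

Plugging in your own run length and page counts gives a quick first budget figure before asking a vendor for a quote.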

An easier way to get started is to archive editions that have already been created in an electronic page composition program like QuarkXPress or InDesign, software in use at most newspapers since the 1990s.

Today, most printers require print PDFs for newspaper runs. NewzGroup offers a great archival opportunity: Publishers send them a copy of their print PDFs, and the company provides the storage.

Up in the Midwest, Buchanan says, that’s a strong selling point. It’s only been four years since a mile-wide tornado ripped through Joplin, Missouri, destroying 2,000 buildings and wreaking $2.8 billion in property damage. “The Joplin storm made everyone sit up and pay attention to what could have happened to their newspaper had the storm taken a slightly different track.”

If the newspaper is a member of one of 11 state press associations, NewzGroup has worked out statewide access to the archive for all state association members. The company also runs a “digital clipping service” and pays back newspapers for clippings they send to second-source customers.

I asked Buchanan how simple it is for newspapers to get started:

“We have to make it as easy as possible for newspapers to archive. NewzGroup accepts any newspaper content in a PDF format. This is the file type publishers normally send to the printing presses, so all we are asking is that publishers send it our way as well. If content producers have to go through difficult processes to archive material, it just isn’t going to happen. This is not how they make their money, not their primary focus. We have made it so easy, even the smallest publishers are uploading to us, and that is critical.”

In their operation, NewzGroup is evolving their archival solutions. “We are adapting it to accept all sorts of file types, text, jpegs, etc. We need to have a system that accepts any file type content creators might employ, for ease of use.”

But with such growth come new challenges. “We need to create the Department of Redundancy Department,” he laughs. “All power systems, storage back-ups, processing portals, and so forth, should be fully redundant, and preferably be housed on distributed data bases, so that any losses can be fully and quickly recovered.”

For content that was originally created in digital format (“born digital”) there are additional challenges. Computer hardware and software have gone through many changes over the past forty years, rendering many files unreadable. File storage can be as haphazard as files left on the drives of some old computers in a back room. Unlike analog media, there is no hard-copy backup to start over from and no fallback if something gets lost or destroyed. Companies go out of business or are sold, newspapers shut down, and then who knows where the files end up.

Bit rot, the decay and loss of data through media failures, natural disasters and the hurry to get out the next edition, is the bane of digital preservation.
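One common defense against silent data loss is a fixity check: record a checksum for each file when it enters the archive, then recompute it later and compare. The Python sketch below shows the idea. The function names and the manifest layout (a plain dictionary of path to digest) are illustrative assumptions, not a description of any vendor's system.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged(manifest):
    """Given {path: digest recorded at archive time}, list missing or altered files."""
    damaged = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != recorded:
            damaged.append(path)
    return damaged
```

Run on a schedule against each stored copy, a check like this can flag a damaged file while an intact duplicate still exists to restore from.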

Aside from aged production files, most community papers now have some Web version of their paper available, and store digital backups of their articles, images and videos. They may have begun to integrate with social media or interactive news apps and data-driven journalism projects.

But who’s archiving all that accumulated data? The consensus is that no one is, at least not systematically enough or with a wide enough vision for the changes coming quickly over the horizon. Most news organizations don’t have the expertise or resources to properly archive or preserve their digital work for future generations, or even to preserve their work a year from now.

To that end, Ed McCain was named the Digital Curator of Journalism at the Donald W. Reynolds Journalism Institute at the Missouri School of Journalism — the first of its kind. He leads the Journalism Digital News Archive, a strategic change agenda addressing issues surrounding access and preservation of digital news collections. He works with faculty and staff at the Missouri School of Journalism, building a framework of linked programs and functions designed to enhance digital news archives.

The good news is that digital preservation guidelines are slowly extending out into the industry. You can find a very good overview in the report “Guidelines for Digital Newspaper Preservation Readiness” by Katherine Skinner and Matt Schultz (2014). Digital preservation is defined as the “series of managed activities necessary to ensure continued access to digital materials for as long as necessary,” a task that requires planning, care, and coordination over time. Because of the size of the task (one that’s growing exponentially as the digital news keeps flooding in every day), they recommend a pragmatic and incremental approach.

To get started, they suggest the following approach (though in much greater detail):

  1. Inventory: What is the amount (number of files and sizes) and location of your digital newspaper collections? A basic inventory would describe characteristics of the files, including title, content type, format, size, associations (title/issue and page/article information), and file location.
  2. Readiness Spectrum: Usually, the level of detail recorded in an inventory reflects the level of preservation your paper currently supports. Consider it your starting point: version 1.0. The readiness spectrum accounts for the next steps to take by ensuring that the first ones have created a firm foundation from which to work forward.
  3. Essential readiness: This is the planning phase for reaching a more mature level of digital preservation. Instruments need to be considered that will allow digital newspaper assets to accommodate a fuller range of information elements.
  4. Optimal readiness: With enough time, technical staffing and support resources, this final stage of preservation readiness can be achieved. A reliable workstation should be created that can provide a main channel to all of the existing digital newspaper data.
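The first step, the inventory, is easy to automate. The Python sketch below walks a folder of archived files and records the basic characteristics listed above (title, content type, format, size and location) in a CSV file. The field names and functions are assumptions made for illustration; association data such as issue date and page number is left out because it depends on each paper's own file-naming scheme.

```python
import csv
import mimetypes
from pathlib import Path

FIELDS = ["title", "content_type", "format", "size_bytes", "location"]


def build_inventory(root):
    """Return one inventory row (a dict) per file found under root."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Guess the content type from the file name; None means unrecognized.
        content_type, _ = mimetypes.guess_type(path.name)
        rows.append({
            "title": path.stem,
            "content_type": content_type or "unknown",
            "format": path.suffix.lower().lstrip(".") or "none",
            "size_bytes": path.stat().st_size,
            "location": str(path),
        })
    return rows


def write_inventory(rows, csv_path):
    """Save the inventory rows as a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Even a spreadsheet produced this way answers the opening questions of how many files there are, how large they are and where they live.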

There are numerous other considerations (such as creation and acquisition, preservation partners and permissions, distribution vs. backups, change management, preservation monitoring and recovery from preservation), but these guidelines at least envision the foundation that can be built.

Work is underway to link newspaper preservation with the broader effort of memory institutions (public libraries, museums, universities and private institutions) to preserve our burgeoning digital cultural moment. The MetaArchive Cooperative works with the Library of Congress to provide preservation infrastructure and expertise to memory organizations rather than outsourcing the service to external vendors.

Both commercial and non-profit parties will gather in Charlotte, NC this May for “Dodging the Memory Hole II,” billed as “an action assembly.” News publishers and press associations, technologists and researchers, libraries and archives, corporations and funding agencies alike plan to continue hammering at a blueprint for sustainable digital news preservation and its integration into the information economy.

* * *

Of course, the elephant in the room is one question that too frequently ends the conversation: Who’s going to pay for all this? Archival preservation is down the list a good way from getting out the news, and every newspaper is looking to cut legacy costs.

Again, costs can be addressed incrementally. Skinner and Schultz write in their report:

All institutions can do something to prepare their collections for long-term use, and that there can be no one-size-fits-all approach to preserving digital newspapers. Institutions need to be able to tackle the challenges involved in preserving digital newspapers in modular increments. Though they need to be able to understand the entire series of “managed activities” as inter-related stepping stones, they also need to be empowered to produce staged implementations based on their current and future capacities. (vi)

Steps taken on “current capacities” may indeed be small. Creating the initial inventory is itself a work in progress; it can be simply a list in a text file. Your print PDFs can be archived off-site.

Currently, there aren’t many ways for newspapers to make money off their archives. If you have a searchable archive of stories, you can charge for access, or make the archive a subscription premium.

McCain is researching how to better monetize newspaper archives. In January, the Knight News Challenge awarded McCain and RJI a $35,000 grant to develop a long-term model for protecting born-digital news content that provides small newspapers with a healthy means to monetize it.

One fresh idea is coming up with a consistent way to license the newspaper archive to researchers.

“It’s pretty hard to determine the value of the content on its own,” he says. “The factors that might affect the market value of news content include the total amount of it, the nature of the content, the time period covered (five, 10 or 30 years) and other factors. On top of that I think there could be a premium on content gathered from a geographic area such as a state or region. That’s something we want to explore in terms of market analysis.”

NewzGroup recently published a white paper on monetizing newspaper archives in four sections on their blog. (Sections two and three are the most pertinent.) Potential customers they suggest include individuals seeking genealogical data, researchers and academics looking to aggregate data from numerous publications, and state and local education or government agencies. Clearly there’s strength in numbers, and the more networked newspaper digital assets become available, the greater their sum value.

How newspapers get there will depend on how much they can convert to digital platforms and how effectively they network their information resources. The newspaper of the near future may of necessity have a different look and feel. A compelling vision was recently laid out in an Editor & Publisher piece by Bill Densmore, “Shoptalk: Imagining the 21st-Century Personal News Experience,” which envisions how a great local 21st-century news service would operate.

Here is the community newspaper with all of its information resources — present, future, and past — fully employed. Densmore, a Reynolds Fellow, has made available a draft version of a larger report detailing the transformation of newspapers into the exploding market for digital information — requiring “infrastructure collaboration around new services that can sustain the values, principles and purposes of journalism for participatory democracy.”

Growing a digital infrastructure linking to the future starts with this birthing-digital moment. This is the crucial starting place for the new news archivist. Commercial needs are vital, but newspapers may have an even greater responsibility to their communities as solid information providers. (As Clay Shirky put it all the way back in 2009, “society doesn’t need newspapers. What we need is journalism.”)

“A newspaper is really a community’s first rough draft of history,” says Ryan Thornburg, who teaches online journalism at the University of North Carolina’s School of Journalism and Mass Communication. “This creates loyalty among the community. Maybe newspapers don’t just archive their stuff, they also become the digital library for the community. The bottom-line has to be there, but their value to the community is incalculable.”
