“Digital Footprint Word Cloud,” Penny Bentley. (Creative Commons License)

The Archivist’s Shoes

Can you manage your personal digital footprint?

Published in King Features Weekly, Apr 28, 2015

David Cohea

This week — April 26-May 2 — is Preservation Week, a celebration of cultural preservation sponsored by the American Library Association. Libraries all over the country are hosting events and activities to preserve personal and shared collections.

Since I’ve been spending time these past few weeks addressing the archival challenges and solutions for newspapers, I thought it worth stopping to consider issues of personal archival in the exploding digital universe.

I mean, this thing is creeping fast into everyone’s back yard …

* * *

When my younger brother died suddenly in 2008, he left scattered remnants of a life his family knew little about. From his apartment in Salem, Oregon, I retrieved his PowerBook laptop and an external drive. He was a journeyman photographer, and there were thousands of .raw uploads. Many more analog slides filled boxes and ring-bound sleeves. There were assorted stories he had written for several weekly newspapers. Many resume cover letters, too. (Finding full-time work as a photographer was hard even back then.) I was able to get into a few (but not all) of his email and other online accounts and correspond with business acquaintances. With that access (and copies of his death certificate) we finished his financial life.

But what of the person? He had been gone a long time. My parents and siblings grieved to know the son and brother they had lost. I took it upon myself to gather what I could of his scattered digital and analog traces and shape some of that into a narrative on a memorial blog over the next several years.

I gathered what I could, told the story as best I could, and then packed everything into two large plastic bins I keep in an attic closet. I don’t think that much about what’s up there anymore. The dead pass quickly into oblivion; few people I know have any connection left. Some of his photography is wonderful; a coffee table book might be nice, but what a flooded market that’s become. His own smiling face in photographs is more dear to us.

Who knows what will happen to that archive, now half-digitized and shoddily piled together, awaiting, perhaps, a relative with more time or inclination to do something more with it? An incomplete job — enough to put his life’s remains in one place, but short of the truth and not very durable.

* * *

At the other end of personal archival are digital collections that could raise the very dead. AI pioneer Ray Kurzweil has digitized hundreds of boxes of memorabilia about his father, a concert pianist and conductor who died young. Recordings, filmed performances and interviews, transcripts, compositions, photos, letters, papers — all of this is being scanned and then fed into a database, with the hope that at some not-too-distant day, digital technology will be capable of awakening an avatar of his father perhaps more real than the man himself.

Could a born-digital individual in his complete uploading become an eternal presence in our world, a sort of mental hologram whose life is forever fixed in phosphorescent amber?

Would they thank us? Or would their ghostly lament haunt the wires of the Web?

* * *

Is every person who carries on an online life responsible for its archival? My brother’s online life was a lot smaller than his analog life — he died before social media really took off. Seven years later, the digital scale is quite different. My digital footprint is surely small compared to the digital effluvia of true digerati, but what should I do with what I’ve created that might serve the information universe?

Because I’m in my late 50s — having started work at a newspaper back when manual typewriters could still be found on the desks of grizzled old newshounds — I have two archival personas, one an analog collection of rather fixed size, and the second the ever-burgeoning digital domain of my lifework’s second act.

I have a closet and several file cabinets of journals, notebooks, school papers, essays, book reviews, articles and poems. There are several thick portfolios of corporate communications — newsletters, ads, brochures, programs, annual reports and speeches. There are boxes of photo prints of girlfriends and kids and bands and vacations, back from when you spent 20 bucks to develop 36 prints from a roll of film at the local drugstore. There’s a reel of film I shot in high school, some audio tapes of garage bands I played in, one videocassette of my last band performing at some punk hootenanny (we were the only rock ’n’ rollers). I also have hundreds of books in my library, glommed along avenues of interest that someone might possibly be interested in having first crack at (Shakespeare? Plato? Archetypal psychology? Mythology? Pulp thrillers?).

I’ve tried scanning some of this paper record, but so far only one tenth of the pile also exists in digital form.

Then there is my digital self. In the seven years since my brother died, the task of personal digital archival has become so much more complex. Even if you aren’t a writer like myself, the digital footprint one can amass is immense. Consider these broadest of categories one now includes in one’s personal digital domain:

  • Contents on my work and home computer (3 million files?), plus what’s on the various devices (iPhone, iPad, iPod) I ferry.
  • Snapshots taken with a cell phone and stored in iPhoto, er, I mean Photos.
  • Email accounts, some of which I’ve stopped checking because so much spam is sent to them. (How many do you have, and how many messages are on them?)
  • Social media accounts — Facebook and Twitter posts going back years.
  • Curated material (Pinterest, Pocket and Trove).
  • Blogs, some of them deleted for various purposes. Some under pen names. (What about our anonymous and contrived personas?)
  • All the stuff on drives that I don’t own — images and articles and songs I’ve downloaded.

All of this routinely keeps piling up behind me as I multitask madly and blindly forward. And since there’s never enough time for what I have to get done today, of course there’s no time left to account for what I did yesterday.

What else? My click trail is gregarious, my search history ginormous.

Chunks of me are out there in an asteroid belt of digital limbo. The first blog I wrote — massive — was deleted from visible cyberspace, but it’s knocking around somewhere in Google’s servers. So are my MySpace writings. (I wish I could say Good Riddance to them, but the deleted is like dark space — invisibly stretching the regions beyond.)

I’m backed up on devices that fail. There’s a drawer full of old SyQuest, Jaz and Zip disks, none of which can be read by anything anymore.

Files are in gibberish formats like WordPerfect and WordStar and QuarkXPress.

I exist everywhere I’ve bought something — my password, credit card info, street address. Banks have my credit history, the government my tax returns, insurance companies my medical records. Who knows what the NSA knows of me. Or ISIS. Or hackers. They may know more about me than I do.

And then there’s the Cloud …

* * *

Perhaps it is because the digital footprint is growing so fast and large that no one talks much about digital survival. Increasingly, exponentially, awesomely, awfully, digital workers and consumers exist on the rising tsunami wall of Big Info.

How do you index a galaxy? Should we tag every shooting star? How much could my pea-shooter of infinity dent the edifice? Whatever I don’t account for, is history that much more a lie? In these days of self-apotheosizing, such questions actually sound legitimate.

Do we at least have the responsibility of naming what of our digital selves we would like to preserve, if only to spare a loved one the grief of figuring us out after the fact? Wouldn’t it be far easier (and merciful) just to delete and forget?

There is a danger that only those with large material resources will be digitally preserved for the future. Will our world disappear when only theirs is remembered?

Here more than anywhere I see the cultural calamity we could be creating. What if 99 percent of our properly archived information is representative of our population’s one percent? It can easily happen. Fate is blind and indifferent. Companies should have clear archival policies and procedures for the information they generate, and individuals need similar guidance deciding how to retain their histories.

* * *

Clifford Lynch is the director of the Coalition for Networked Information, which is working on comprehensive standards for developing and managing networked information. If there’s a ground zero for the new information universe, it could be there.

In his essay, “The Future of Personal Digital Archiving: Defining the Research Agenda,” Lynch surveys the scale of issues relating to personal archival:

Today, for the vast majority of the general public, simply determining the scope of an estate that includes digital materials (private, shared, and acquired) scattered across a wide range of services and storage is a formidable task. Resolving issues around ownership, inheritance, and meaningful transfer of access, possession or control (the clumsiness of the language here itself suggests the complexity of the problem) is a tangle of legal, contractual, technical challenges further complicated by a lack of overall social consensus in many areas. (5)

If he’s right, the issues of personal archival are as knotty as those of bridging all information networks — we can barely conceive the issue, and laws regarding copyright and access are woefully out of date.

What I’d like to see is the kind of software app used by tax-preparation companies. You complete a questionnaire that walks through all the required aspects of digital archival, entering pertinent information as you go. The result would be an actionable task list for creating a comprehensive, searchable and durable digital archive.

The same sort of app could easily be adapted to suit the specific archival challenges of newspapers as well, or any private or public company engaged in information collection and exchange.

Anyone hear of such a thing?

Photo by my brother Timm, ca. 2007.

Often I wonder if I captured my brother’s archival self at all. His traces only say so much, his intentions were blurry at best, and mine are colored by kinship ties he may not have shared. (He and I look like twins separated by eight years of births, but we only emailed each other infrequently.)

Clifford Lynch put it this way in the same paper:

Note that at the time of an individual’s death his or her digital life is always a mixture of the deliberate (intentionally saved and retained) and the accidental (once saved and never subsequently weeded, or just there by happenstance and never cleaned up); this mix will vary. Determining the intent of the collector/creator may be difficult or even impossible in many cases, which will greatly complicate the interpretation of these materials. Further, identifying things that were being kept for personal sentimental value is very hard, and the longer-term significance and importance of such materials may be very difficult to evaluate.

Are hybrid (analog and digital) archives especially difficult to read, lost in a shatter of media like a satellite that’s been blown apart?

What if my brother were a fully created digital person as conceived by Ray Kurzweil? Would more data (and kickass algorithms) allow me to converse with him to finally get the record straight?

Recently I read about how the information big bang may result in a knowledge universe in which everything that is knowable is linked and retrievable. (At least, that’s one possible universe, utopian grade.) I wonder if a digitally archived past on such a scale might bring an entirely new sort of history into being, living, breathing according to its tweets and cat videos and downloads and searches, a brilliant Now eternally just flying massively past us out of sight.

We’re going to need crunching power to the googolplex; some bodacious algorithms, too. And a medium of storage vellum in which all things are readable, no matter when or how they were digitally created.

Will the picture become sharper and more coherent the more data we add? Or will we conclude in 2050 or 2225 that we had a much better sense of history back in the newsprint 1950s, back when we relied so much more on our cultural occipital cortexes to fill in the blanks with what we believed we saw?

* * *

My father just celebrated his 88th birthday. His will declares that his archives — all paper — will become my responsibility when he passes on. Now, my dad had a life on this earth; he was around a long time; he knew theologians and captains of industry and a Secretary of State. His story in this world is sizable, and it will be my job to process it into history — scan it if possible, catalog it, store it, maybe try to find a research library interested in its little take on latter 20th-century America.

Has all that already been dwarfed by what’s coming in 2016?

Who’s paying attention to anything with all the notifications streaming in?

Perhaps the archivist is the shoemaker, the only one who knows enough to prepare us for our next step.

Cobblers, stick to thy last.

* * *

I did find some guidelines for creating a personal archive here. The following summarizes the main points.

  • Organize and name your files appropriately. Have concise names and avoid complex paths. Avoid capitals and spaces, since these can cause problems moving files between operating environments. Use a standard date format (yyyymmdd).
  • Make data self-documenting. Include information that tells about the file, who wrote it, what history it may have been through.
  • Delete what’s not important. Avoid historical clutter!
  • Manage your emails. Delete what has no long-term value. Organize saved mail into subject folders.
  • Select suitable file formats and software. The simplest, standard format endures the longest. For word-processing documents, try using the OpenDocument Format (.odt). PDF is a standard for preserving copies that cannot be edited when viewed. For raster images, TIFF (.tif) is most durable. Operating systems like Apple OS, Linux and MS Windows are mature and stable.
  • Back up your files. Hard disks will fail. Regularly back up to portable media, preferably an external solid-state drive. For key files, make additional copies and store them elsewhere. Consider online backup services. Backups should include files that are difficult or impossible to recreate, such as photos, email, data relating to personal finance, professional or business data, license keys for software, digital music, your diary, website, and anything else important to your family.
  • Take care of your hardware and media. Replace computers, servers and storage media cyclically. Keep components clean and free of dust to prevent overheating. Invest in an Uninterruptible Power Supply (UPS).
  • Administer your system. Before making major updates, ensure your old data is compatible with new software. Keep your data secure with good security habits. Consider using passwords and encryption devices.
  • Be aware of intellectual property rights and privacy. Copyright laws extend 70 years after your death; if you want others to use your digital archive, you may want to apply less restrictive licenses like those developed by Creative Commons. Just because you downloaded a file doesn’t mean you own it. Just because you posted something on a social media site doesn’t mean you own it, either. Your digital archive may contain personal information of many other individuals. They have a right to their privacy.
  • Keep current. Technology speeds forward and systems can quickly become inoperable or obsolete. Think about your digital environment; it’s always changing.
  • Consider your legacy files. You may have hundreds or thousands of files that can no longer be read by current software. They might be readable in the future — archival practices should still be in place.
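
For anyone who wants to try the first guideline on a real folder, here is a rough Python sketch of the naming advice (no capitals, no spaces, a yyyymmdd date). The function name `archive_name`, the choice to put the date at the front, the underscore separator and the "untitled" fallback are my own assumptions, not part of the guidelines, and the sample file name is made up.

```python
import re
from datetime import date
from pathlib import PurePath


def archive_name(original: str, when: date) -> str:
    """Return a lowercase, space-free file name prefixed with a yyyymmdd date."""
    path = PurePath(original)
    stem = path.stem.lower()
    # Collapse every run of characters other than a-z and 0-9 into one underscore.
    stem = re.sub(r"[^a-z0-9]+", "_", stem).strip("_")
    # Assumption: fall back to a placeholder when nothing usable is left.
    stem = stem or "untitled"
    # Date first (an assumption) so that files sort chronologically by name.
    return f"{when:%Y%m%d}_{stem}{path.suffix.lower()}"


# Hypothetical example file, not one from my own archive.
print(archive_name("Cover Letter FINAL v2.DOC", date(2015, 4, 28)))
# prints: 20150428_cover_letter_final_v2.doc
```

The function only computes a new name; it renames nothing on disk, so it is safe to run against a list of file names before deciding whether to apply the result. Accented letters are treated like any other non-ASCII character and become underscores, which may or may not be what you want.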

Photo: Timm O’Cobhthaigh
