Art conservators all over the world increasingly find themselves having to up-skill and re-train to provide basic preventive conservation for digital media. The requirements for Time-Based Media Art are even more significant, and concerns about safe storage for these artworks have created an even clearer sense of urgency in trying to define and establish best practices. Framed in the context of Time-Based Media Art, this article aims to serve as a guide for conservators interested in beginning to assess how to move forward with establishing storage for valuable digital assets, whether works of art or critically important conservation documentation in digital form.
The design of good art storage requires the collaboration of architects, engineers, and collections care professionals, especially registrars and conservators. Additional experts are required for the daily operation of a physical art storage facility to provide security, building maintenance, and environmental oversight. Most who use art storage will never build their own facility; they will outsource. For galleries, collectors, small museums, medium-sized museums, and even some large institutions, it is often more cost-effective, practical, and scalable to outsource storage to a service provider. Outsourcing is also very common for the storage of digital collections, whether to a cloud provider or to an on-premises managed service. Just as conservators learn to work with physical storage facilities, they must draw on a different vocabulary and knowledge base to collaborate effectively with their IT departments when assessing whether to outsource digital storage. They will need to understand either how to assess vendors or how to plan for and execute the establishment of an in-house digital art storage system.
What Do You Have?
The very first step in a journey towards establishing digital collections storage is to understand the full scope and nature of your collection — for collections of Time-Based Media Art, this is often not fully known. How many artworks you have and their various media formats, duration, and other characteristics will define how much capacity your digital collections storage will require, and thus to a great extent will dictate what options are feasible for your institution from the standpoint of financial and logistical sustainability.
Questions about your collection that will affect storage needs:
1. Do you have analog materials that have yet to be digitized?
2. Does your collection include optical media such as DVDs and CDs, and/or hard drives sitting in physical collections storage?
An initial survey of your collection to establish these facts is simply a must.
Once you understand your current digital collections storage capacity needs, it is also critical to understand roughly how much you expect your collection to grow over the next five years. For Time-Based Media Artworks, you can count how many items have been acquired annually for the past ten years, and look for patterns in how the collection has grown. Asking whether the number of acquisitions is steadily increasing, or whether there was a single large acquisition of a collection of artworks years ago, will help guide growth predictions. After establishing a basic understanding of the number of Time-Based Media Artworks acquired per year, take a look at what these works tend to look like materially. Is there a curatorial focus on video art? Are the works very recently created, or do they tend to be at least ten years old? Older video will tend to be Standard Definition, and thus requires less storage space than more contemporary High Definition artworks. Are there multi-channel works or do they tend to be single channel? Are there installations with dedicated computers? For example, an installation-based artwork with an artist-provided dedicated computer should be disk-imaged, thus creating a file roughly the size of the computer’s hard drive. Factor these kinds of characteristics into your growth projections for a more accurate estimate of your five-year storage capacity requirements. Basic knowledge of digitization best practices and of the kinds of formats you are receiving from artists (or collaboration with an expert in these areas) is a prerequisite to understanding one’s long-term digital storage capacity and growth requirements.
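As a rough illustration of the arithmetic behind such a projection, the sketch below adds expected yearly acquisitions to current holdings. Every figure in it is a made-up placeholder (the work categories, average sizes, and acquisition rates are assumptions, not survey data), and the `copies` parameter is likewise an assumption added to account for keeping more than one copy of the collection.

```python
from dataclasses import dataclass


@dataclass
class WorkProfile:
    """Hypothetical category of acquired artwork with an assumed average size."""
    name: str
    acquisitions_per_year: float
    avg_size_gb: float


def project_capacity_tb(current_tb, profiles, years=5, copies=1):
    """Linear estimate: current holdings plus steady yearly growth, times copies kept."""
    yearly_growth_gb = sum(p.acquisitions_per_year * p.avg_size_gb for p in profiles)
    total_gb = current_tb * 1000 + yearly_growth_gb * years
    return total_gb * copies / 1000


# Placeholder numbers only; replace them with figures from your own survey.
profiles = [
    WorkProfile("single-channel SD video", acquisitions_per_year=4, avg_size_gb=30),
    WorkProfile("multi-channel HD video", acquisitions_per_year=2, avg_size_gb=400),
    WorkProfile("disk image of a dedicated computer", acquisitions_per_year=1, avg_size_gb=500),
]

estimate = project_capacity_tb(current_tb=10, profiles=profiles, years=5, copies=2)
print(f"Estimated five-year requirement: {estimate:.1f} TB")
```

A linear model like this assumes steady acquisitions; a single large acquisition, or a shift toward higher-resolution formats, would need to be added by hand.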
IT is Your Friend
Just as a conservator would not endeavor to design or remodel a building, HVAC, or security system, so too with digital collections storage; one must work collaboratively with Information Technology (IT) professionals (internal to your institution, or outside vendors) for the design, build, and maintenance of any digital storage environment. Being able to collaborate with IT requires some preparation.
Collections professionals will benefit immensely from learning a few key pieces of IT terminology. This will help you communicate more effectively with your colleagues as well as with vendors, and there are resources available online for familiarizing yourself with the language used by IT professionals. Consider learning these terms and functions to be a part of your job, in the same way that we learn about the properties of the products we use in our bench work treatments.
Some Important IT Terminology
Archive: commonly used to refer to data that is not used frequently (if at all) and is therefore stored on more affordable but less performant forms of storage. It has nothing to do with preservation.
Backup: a copy of a given pool of data made to another storage device and retained as a snapshot of a specific point in time. It is not uncommon in business settings to see office data backed up on a nightly basis and retained for a period of thirty days. This kind of backup is almost always stored in the same physical location as the original data.
DR: stands for Disaster Recovery. This simply means that a second complete copy of the given pool of data is stored off site at another location. In other words, if a disaster were to occur that obliterated the primary collections storage, there would be a good copy elsewhere.
High availability: can be thought of as a disaster recovery copy that is online, networked, and could be relied upon as primary collections storage in the event of a disaster or temporary outage.
Storage appliance: a term used to refer to the actual physical device that provides the digital storage. The term “appliance” is appropriate because contemporary enterprise storage devices often combine many technologies (storage, computation, network) to form a more plug-and-play product.
Gigabit: shorthand for one gigabit per second, a measure of network speed. Here it describes the speed of the connection from a conservator’s workstation to your institution’s data center. Gigabit is the speed of network infrastructure most commonly found in buildings built at least three years ago.
Ten gig: the faster successor to Gigabit. If you were laying new network infrastructure in a building today, and speed of access to collections storage was a concern, this is likely what you would implement.
Fiber: fiber optic cable is a successor to copper wire infrastructure, permitting data to travel only about 31% slower than the speed of light. Fiber is often utilized (rented as a service) to connect two geographically distributed sites.
Dark fiber: refers to fiber optic infrastructure that is essentially private, or not used by the general public. This means that greater speeds can be achieved due to not having to compete against network traffic. Dark fiber networks are more expensive, and are sometimes consortially operated.
Compute: simply refers to the part of the infrastructure that does the heavy lifting when it comes to any kind of computation. This is where your digital repository software runs, and is what defines how fast activities like video transcoding can happen.
Utilization: refers to how much one is actually making use of a given resource. This is an important factor to understand; it doesn’t matter how fast your network infrastructure is in theory if, in practice, its full ten gigabits of bandwidth aren’t being used. When fine-tuning a system and searching for bottlenecks, understanding how much or how little the different pieces of the puzzle are being utilized can help track down the problem.
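To see how network speed and utilization interact in practice, the following sketch estimates how long it takes to move a large file at Gigabit and ten gig speeds. It is a back-of-the-envelope calculation under stated assumptions: the 500 GB disk image is a hypothetical example, `utilization` is treated simply as the fraction of the link's bandwidth actually achieved, and protocol overhead and storage read/write speeds are ignored.

```python
def transfer_hours(size_gb, link_gbps, utilization=1.0):
    """Hours needed to move size_gb gigabytes over a link rated at link_gbps
    gigabits per second, when only a fraction (utilization) of it is achieved."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    size_gigabits = size_gb * 8  # one byte is eight bits
    seconds = size_gigabits / (link_gbps * utilization)
    return seconds / 3600


disk_image_gb = 500  # hypothetical disk image of an artist-provided computer

for label, gbps in [("Gigabit", 1), ("Ten gig", 10)]:
    for utilization in (1.0, 0.5):
        hours = transfer_hours(disk_image_gb, gbps, utilization)
        print(f"{label} at {utilization:.0%} utilization: {hours:.2f} hours")
```

The same function can be used to compare a theoretical link speed against what is observed during a real transfer, which is one way to start looking for a bottleneck.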
Connecting with your IT department in a meaningful way is critical to creating a well-functioning and successful digital storage system. You will need to do more than simply learn their vocabulary: write up some specifications, and start asking questions. If your institution has a Time-Based Media working group, invite your IT colleagues to its meetings and to your conferences! Ask them to participate in any meetings that involve digital technology or acquisitions that include digital media. If you want to feel heard and valued by your IT department, you’ll need to start by including them and making them feel valued.
While learning IT vocabulary is a critical tool in forging an effective relationship with your IT peers, these colleagues are likely unfamiliar with digital preservation, and with the rigorous international standards and vocabulary this expert field has developed around the long-term preservation of digital cultural heritage. Digital preservation is not only rich with standards and guidelines; it has also developed tools for assessing adherence to them, ranging from complete and rigorous audits to more practical and incremental steps. Reviewing some of the most important standards and criteria alongside IT staff will help conservators become aware of the factors to consider in understanding digital collections storage.
Open Archival Information System (OAIS)
It may come as no surprise that one of the first digital preservation standards grew out of professional communities related to the exploration of outer space. By the 1980s, this field was facing the challenge of preserving highly valuable legacy data. In 1982, the Consultative Committee for Space Data Systems (CCSDS) was established to develop standards around the general handling and exchange of data in space research. Eight years later, in 1990, CCSDS began collaboration with the International Organization for Standardization (ISO) to put CCSDS recommendations under ISO review for consideration as international standards. The first result of this collaboration was seen in the 1999 emergence of a high-level, almost conceptual, reference model that described the terminology and frameworks for long-term storage and the preservation of data, which is called the OAIS. It was eventually defined in ISO 14721:2012 (https://www.iso.org/standard/57284.html).
Key Vocabulary and Concepts Established by OAIS
SIP (Submission Information Package): the data as provided by the “depositor” to the digital repository; depositor could be defined as outside parties, such as an artist studio or gallery, or internal parties, such as conservators or registrars.
Ingest: the process by which digital objects enter the digital repository, during which preservation actions are carried out according to policy and procedure.
AIP (Archival Information Package): a container created around the supplied digital objects that provides critical information to facilitate their long-term preservation and use.
DIP (Dissemination Information Package): a package of the digital objects optimized for dissemination, or access.
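Purely to make this vocabulary concrete, here is a minimal sketch of how the three package types relate to one another. OAIS itself prescribes no implementation, so everything here is an assumption made for illustration: the class fields, the `ingest` and `disseminate` functions, and the choice of what counts as preservation information. A real DIP would often contain access derivatives rather than a simple selection of the original files.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SIP:
    """Submission Information Package: the data as provided by the depositor."""
    depositor: str
    files: dict  # filename -> raw bytes


@dataclass
class AIP:
    """Archival Information Package: the objects plus information kept for preservation."""
    files: dict
    preservation_info: dict


@dataclass
class DIP:
    """Dissemination Information Package: objects selected for access."""
    files: dict


def ingest(sip):
    """Hypothetical ingest step: wrap the submitted files with basic provenance."""
    info = {
        "depositor": sip.depositor,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "sizes_in_bytes": {name: len(data) for name, data in sip.files.items()},
    }
    return AIP(files=dict(sip.files), preservation_info=info)


def disseminate(aip, wanted):
    """Hypothetical access step: hand out only the requested files."""
    return DIP(files={name: aip.files[name] for name in wanted})


sip = SIP(depositor="artist studio", files={"work.mov": b"video", "notes.txt": b"install notes"})
aip = ingest(sip)
dip = disseminate(aip, ["notes.txt"])
print(aip.preservation_info["depositor"], sorted(dip.files))
```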
OAIS was and is a standard developed by a very particular community (Space Data) with its own inherent biases and interests, and is not intended to be any more than a conceptual model or framework. It is not a specification describing how to build an effective system, or a set of criteria for assessing products or services. Furthermore, it is not a roadmap to digital preservation, as digital preservation extends well beyond any particular technology or system and into the everyday functions of an institution’s policy, procedure, and budgets. Despite this, after OAIS’ subsequent adoption by the digital preservation community, many institutions and service providers could be seen claiming to be “OAIS compliant” or as having “OAIS-based” digital repositories or products and services. There was no way of assessing what this actually meant, or what the given institution, service provider, or product accomplished from a digital preservation standpoint. This led to a series of major collaborative efforts involving the Research Libraries Group (RLG), Online Computer Library Center (OCLC), National Archives and Records Administration (NARA), and CCSDS. Ultimately this process culminated in the establishment of a new international standard, “ISO 16363: Space data and information transfer systems — Audit and certification of trustworthy digital repositories.”
International Standards — Criteria for the Certification of Trusted Digital Repositories
Pursuing actual ISO 16363 certification as a Trustworthy Digital Repository is overkill for most institutions. As with most certifications, this one serves as certified proof to constituents who may expect such proof (for example, digital preservation experts using a service). Nonetheless, the criteria provided by the ISO standard serve as incredibly useful tools for self-assessment. ISO 16363’s criteria fall into three fundamental categories.
1. Organizational Infrastructure: This covers not technical infrastructure, but rather the inner workings of the institution, including governance, leadership, staffing, organizational structure, accountability, policy, financial sustainability, and legal liabilities such as contracts and licenses. This is a vast departure from the OAIS reference model’s focus on the repository as the center of all activity.
2. Digital Object Management: This section digs more into the sorts of topics originally laid out by OAIS, but takes a more pragmatic and specific approach to topics such as acquisition of content, creation of archival packages, preservation planning, archival storage, maintenance of archival packages, information management, and access management.
3. Technologies, Technical Infrastructure, and Security: This final section ties the standard together by addressing overall system infrastructure, criteria for assessing if appropriate technologies have been employed to accomplish the requirements set forth in the Digital Object Management section, and whether these technologies are appropriately hardened from a security standpoint.
Reading the ISO
Currently, the free “magenta book” version of ISO 16363 (the name for the final format of a CCSDS recommendation before ISO consideration) is identical to the full standard. It is a useful resource for institutions and an alternative to paying $180 for the ISO PDF at www.iso.org. (https://public.ccsds.org/Pubs/652x0m1.pdf)
Levels of Preservation
ISO 16363 can be overwhelming, despite serving as an incredibly useful tool for institutions to think thoroughly and holistically about digital preservation. Not only is it impossible for most institutions to accomplish every bit of this rigorous standard, but its verbosity can also make it difficult to find a strategic and iterative way forward in achieving some level of digital preservation. Addressing this problem inspired the consortial National Digital Stewardship Alliance (NDSA) to collaboratively create the “Levels of Digital Preservation,” a simple tool that provides a one-page method of understanding the criteria for digital preservation accomplishments, and also defines incremental steps towards best practice. The “Levels of Digital Preservation” accomplish this by establishing five topical areas: storage and geographic location, file fixity and data integrity, information security, metadata, and file formats. Each of these areas is described by achievable levels, ranging from basic “level 1” implementation through advanced “level 4,” which illustrates something close to best practice.
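Of the five areas, file fixity is perhaps the easiest to picture in practice: a checksum is recorded for each file, then recomputed later to detect unintended change. The sketch below is one minimal way to do this with Python's standard library. It is not taken from the NDSA document; the function names and the use of SHA-256 are assumptions chosen for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum, reading in chunks so large video files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Record a checksum for every file under the directory."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the names of files that are missing or whose checksum has changed."""
    root = Path(directory)
    problems = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            problems.append(name)
    return problems


# Demonstration with a throwaway directory and placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "artwork_master.mov"
    master.write_bytes(b"placeholder video data")
    manifest = build_manifest(tmp)
    print("Problems right after recording:", verify_manifest(tmp, manifest))
    master.write_bytes(b"silently altered data")
    print("Problems after the file changed:", verify_manifest(tmp, manifest))
```

Running a check like `verify_manifest` on a schedule, and against each stored copy, is the kind of incremental step that an institution and its IT department can agree on early.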
Preservation Storage Criteria
OAIS, ISO 16363, and the NDSA “levels” are general digital preservation standards and tools; they do not necessarily provide what is needed for implementation-level technical specifications or storage requirements. Recent efforts on behalf of the NDSA, with funding from IMLS, have focused on developing community-shared best practice criteria for the granular aspects of preservation storage. The resulting document covers areas such as content integrity, cost, flexibility & resilience, information security, scalability & performance, physical location, support, and transparency.
Comments Open on the Draft
The IMLS-funded NDSA document is currently in draft form, and is open for comment via the following Google Group: https://groups.google.com/forum/#!forum/dpstorage.
Time to Act
At this point, you may be feeling overwhelmed by countless standards, tools, and criteria for digital preservation storage. However, moving forward simply requires you to understand that these standards and tools exist, and to pick and choose which criteria your institution finds useful in order to form a set of functional requirements.
- Sit down with your IT department, and consider what you can accomplish using these criteria as your framework.
- For each criterion you select, make sure you have a full rationale for why your institution needs it.
From here, there are multiple options for realizing your goals. One could use these functional requirements as a tool for building a relatively DIY solution using internal technologies and expertise; as requirements to inform a request for proposals (RFP); as a means of vetting vendor solutions; or as baseline knowledge to better inform your work with a consultant who may guide you through RFP, implementation, and management. No matter the size of the collection or the size of your budget, there are ways of accomplishing baseline digital preservation in many different permutations.
In essence, the basic steps to work toward establishing digital collections storage are as follows:
- Figure out what you have; make a collections inventory that will aid you in producing storage capacity requirement estimates, as well as educated growth projections.
- Make IT your ally, and learn how to speak their language.
- Understand a few key digital preservation standards and evaluation tools.
- Pick and choose from best practice documents, such as the NDSA “Levels of Digital Preservation,” and establish your own set of achievable and sustainable functional requirements.
- Find the most sensible way of moving forward for your context — whether working internally, shopping around for vendors, or relying on consultants.
Before moving beyond this final step and actually implementing a solution, it is critical to take a moment and return to the non-technical aspects of ISO 16363. Ask yourself and discuss with your colleagues if the solution you are considering is one you will be able to afford, operate, and maintain in the long term — without this, it doesn’t matter how many technical requirements are accomplished.
Hopefully this article serves as a useful guide for conservators presently facing the question of how to deal with digital storage and preservation — not only for collections of Time-Based Media Art, but also for valuable and mission-critical digital conservation documentation and research data.
If you are considering your digital collections storage needs, no matter how big or small, I would love to hear from you! Small Data Industries is currently conducting research about these practices and will be presenting findings at this year’s AIC annual meeting. Your input is valuable. The survey takes about 5–15 minutes, and you can take it by visiting…
This post is a version of an article originally published in AIC News 43(1), January 2018. It appears courtesy the American Institute for Conservation of Historic & Artistic Works.