The Ideal Digital Archives System
This study analyzes six leading digital preservation and collections management systems used at major cultural heritage institutions worldwide: Archivematica, Preservica, LibNova, AtoM, ArchivesSpace, and Omeka.
Through examination of 43 distinct features across eight functional categories, an integrated model is proposed for an ideal core system that combines the strengths of each platform.
The analysis demonstrates how combining preservation-focused features with robust discovery and collection management capabilities could create a more comprehensive solution for organizations and persons responsible for managing digital archives collections.
1. Introduction
Digital preservation and collections management present complex challenges for cultural heritage institutions. Questions remain about how to mainstream professional systems and standards. Current solutions typically focus on either preservation or access, requiring institutions to implement multiple systems to meet their needs.
What is known is that some person or organization has made the effort to render information content preservable and presentable to future users, and would therefore expect that someone in the future will be able to access this information if it answers part or all of a user query. Using standards is therefore key to matching these past and future query needs.
This paper examines how features from six leading systems could be integrated to create a comprehensive solution to this common user story.
Underneath almost any information system, across verticals and sectors, lies a digital archives system. Perhaps a core set of system requirements can impose some order?
The systems analyzed include:
- Archivematica: Open-source digital preservation system
- Preservica: Commercial digital preservation platform
- LibNova: Commercial digital preservation platform
- AtoM (Access to Memory): Open-source archival information software
- ArchivesSpace: Open-source archival collections software
- Omeka: Open-source web publishing platform for digital collections
2. Methodology
This study extracted 43 distinct features from the six systems and categorized them into eight functional areas:
A. Content Discovery
B. Human User Interfaces
C. Discovery Metadata
D. Preservation Standards
E. Digital Object Management
F. Collections Management
G. Security Management
H. System Interoperability
3. Findings
The initial findings reveal that an ideal integrated system would combine both digital preservation and user access needs. This analysis demonstrates the potential for a unified system that combines the strengths of current solutions. The current working version of the system requirements list includes the following features.
4. Systems requirements
Specifications for an ideal digital archives system, based on existing product functionality.
A. CONTENT DISCOVERY
1. Full-text search
Full-text search enables searching across all textual content in the system. The system indexes content and metadata while supporting various search options and relevance ranking.
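The indexing step described above can be sketched with a small inverted index. This is a minimal illustration only, not how any of the six systems implements search: the function names and sample records are hypothetical, matching is plain AND over whole words, and relevance ranking is left out.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical sample records: content and metadata merged into one string.
documents = {
    "item-1": "Letter from the harbour master, 1921",
    "item-2": "Photograph of the harbour at dawn",
    "item-3": "Minutes of the town council, 1921",
}
index = build_index(documents)
```

Under these assumptions, `search(index, "harbour 1921")` returns only `item-1`, the one record containing both terms.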
2. Faceted browsing/searching
Faceted browsing enables filtered exploration of collections using various metadata elements. The system generates facets dynamically based on available metadata while supporting refinement of search results.
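Dynamic facet generation can be illustrated by counting the values that actually occur in the metadata and then filtering on a selected value. The sketch below assumes records are simple dictionaries; the field names and sample data are hypothetical.

```python
from collections import Counter


def facet_counts(records, fields):
    """Count how often each value occurs for each requested metadata field."""
    counts = {field: Counter() for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts


def refine(records, selected):
    """Keep only the records matching every selected facet value."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


# Hypothetical sample records.
records = [
    {"title": "Harbour at dawn", "type": "photograph", "decade": "1920s"},
    {"title": "Council minutes", "type": "text", "decade": "1920s"},
    {"title": "Market square", "type": "photograph", "decade": "1930s"},
]
```

Because the counts are derived from the records themselves, a facet value appears only when at least one item carries it, and recomputing `facet_counts` on the output of `refine` gives the narrowed facets for the next step.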
3. Advanced search
Advanced search options provide complex query capabilities across multiple fields and criteria. The system supports Boolean operators, field-specific searching, and search result manipulation.
4. Keyword highlighting
Keyword highlighting emphasizes search terms in display of search results. The system highlights matching terms in context while supporting navigation between occurrences.
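A simple form of keyword highlighting can be sketched with a regular expression. The bracket markers below are an assumption made for readability in plain text; a web interface would more likely wrap matches in markup.

```python
import re


def highlight(text, terms, start="[", end="]"):
    """Wrap each whole-word, case-insensitive match of a term in markers."""
    terms = [term for term in terms if term]
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    # Reuse the matched text so the original capitalization is preserved.
    return pattern.sub(lambda match: f"{start}{match.group(0)}{end}", text)
```

For example, `highlight("Harbour master at the harbour", ["harbour"])` marks both occurrences while leaving their capitalization unchanged.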
5. Search result visualizations
Search result visualization presents search results in various visual formats beyond traditional lists. The system supports different visualization options for exploring and understanding search results.
6. Exportable search results
B. HUMAN USER INTERFACES
1. Customizable themes
Customizable themes allow institutions to modify the appearance and layout of public interfaces. The system provides theme frameworks and customization options. This feature enables institutions to create branded and content-specific user interfaces using custom page and block layouts.
2. Custom page templates
Custom page templates allow creation of specialized page layouts for standardized and custom content types. The system provides tools for creating and managing page templates.
3. Multiple language interfaces
Multiple language interface support enables presentation of content and interfaces in any language that can be encoded in UTF-8. The system supports translation of interface elements as well as collections content.
4. Mobile-responsive design
Mobile-responsive design ensures effective display on various device types and sizes. The system provides layouts that adapt to different screen sizes and orientations.
5. Exhibition builder
Exhibition builder provides tools for creating online exhibitions from digital collections. The system supports various exhibition layouts, interpretive content, and navigation options. This feature enables rich presentation of digital content in curated contexts.
C. DISCOVERY METADATA
1. Descriptive metadata support
Using a standardized approach to metadata and consistent archival descriptions across collections ensures discoverability and interoperability. Dublin Core support should be the core minimum for describing items and collections. Where needed, the system should also implement the professional archival description standards that are in widespread international use: (a) Dublin Core, (b) EAD, (c) ISAD(G), (d) ISAAR (CPF), (e) EAC-CPF, (f) ISDF, (g) DACS, (h) MARCXML.
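As an illustration of the Dublin Core minimum, the sketch below serializes an item description using the Dublin Core element set namespace. The `record` wrapper element and the sample values are hypothetical; a real system would follow whatever container schema it exchanges records in.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a wrapper element with one dc:* child per (element, value) pair."""
    record = ET.Element("record")  # hypothetical wrapper element
    for element, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


# Hypothetical item description.
record = dublin_core_record([
    ("title", "Harbour at dawn"),
    ("creator", "Unknown photographer"),
    ("date", "1921"),
    ("type", "StillImage"),
])
xml_text = ET.tostring(record, encoding="unicode")
```

Passing a list of pairs rather than a dictionary allows repeated elements, such as several `creator` values for one item.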
2. Custom metadata schemas
Custom metadata schemas allow institutions to define their own metadata elements and requirements beyond standard schemas. The system provides tools for creating and managing custom metadata fields and validation rules. This flexibility enables institutions to meet specific descriptive needs while maintaining standardized approaches.
3. Collection organization/hierarchy
Collection organization features support arrangement of materials in hierarchical structures reflecting archival organization. The system maintains relationships between collection components while supporting description at various levels. This hierarchical approach reflects the archival principles of original order and context.
4. Tagging
Tagging, or “item sets management”, enables ad hoc grouping of related items for organization and presentation, for example through the use of labels. The system maintains relationships between items while supporting various organizational schemes.
5. Controlled vocabularies
Controlled vocabularies provide managed lists of accepted terms for consistent description and access. The system supports creation, management, and application of controlled terms across collections. This feature ensures consistency in description while improving search and browse functionality.
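Applying a controlled vocabulary at data entry can be sketched as a lookup that maps variant spellings to a preferred term and rejects anything else. The class name, terms, and variants below are hypothetical.

```python
def normalize(term):
    """Collapse whitespace and ignore case when comparing terms."""
    return " ".join(term.split()).casefold()


class ControlledVocabulary:
    """A managed list of preferred terms with optional variant spellings."""

    def __init__(self, preferred_terms, variants=None):
        self._preferred = {normalize(term): term for term in preferred_terms}
        self._variants = {
            normalize(variant): preferred
            for variant, preferred in (variants or {}).items()
        }

    def resolve(self, term):
        """Return the preferred form of a term, or raise if it is not accepted."""
        key = normalize(term)
        if key in self._preferred:
            return self._preferred[key]
        if key in self._variants:
            return self._variants[key]
        raise ValueError(f"Term not in vocabulary: {term!r}")


# Hypothetical vocabulary of material types.
formats = ControlledVocabulary(
    ["Photograph", "Letter", "Map"],
    variants={"photo": "Photograph", "correspondence": "Letter"},
)
```

Resolving every entered value to a single preferred form is what keeps facets and browse lists from splitting one concept across several spellings.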
6. RDF data model
The RDF (Resource Description Framework) data model implements semantic web standards for describing resources and their relationships. The system stores and manages metadata as linked data, enabling rich connections between resources. This approach supports enhanced discovery and interoperability with other systems.
D. PRESERVATION STANDARDS
1. ISO 14721 OAIS reference model
The Open Archival Information System (OAIS) reference model provides standardized processes for ingesting, preserving, and providing access to digital content. The system implements preservation workflows and metadata management according to ISO specifications, which demonstrates conformance to internationally recognized preservation standards. The implementation includes creation and management of Submission Information Packages (SIP), Archival Information Packages (AIP), and Dissemination Information Packages (DIP).
2. PREMIS metadata
PREMIS metadata support enables the capture and management of preservation metadata according to the PREMIS (PREservation Metadata: Implementation Strategies) standard. The system automatically generates and maintains technical, administrative, and preservation metadata about digital objects and preservation events. This metadata is crucial for documenting preservation actions and maintaining the long-term viability of preserved content.
3. BagIt packaging/files
E. DIGITAL OBJECT MANAGEMENT
1. Browser content upload
Browser-based content upload enables adding content to the system through web browser interfaces. The system supports drag-and-drop functionality and handles various file types and sizes. This feature simplifies content ingest while maintaining appropriate controls.
2. Bulk ingest capabilities
Bulk ingest capabilities enable adding multiple items or collections simultaneously. The system provides tools for mapping metadata and is able to manage large-scale uploads efficiently.
3. Assign identifiers
Assigning unique identifiers to content facilitates collections processing, search and discovery, and version control, and helps prove authenticity.
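One common way to mint unique identifiers is a UUID. The sketch below shows a random identifier for new objects and a deterministic one derived from an existing reference code. The function names and the namespace URL are hypothetical, and institutions may instead prefer a persistent identifier scheme of their own choosing.

```python
import uuid


def mint_identifier():
    """Return a new random (version 4) UUID as a string."""
    return str(uuid.uuid4())


def derived_identifier(namespace_url, local_name):
    """Return a repeatable (version 5) UUID for a name within a namespace."""
    namespace = uuid.uuid5(uuid.NAMESPACE_URL, namespace_url)
    return str(uuid.uuid5(namespace, local_name))
```

The derived form gives the same identifier every time the same reference code is processed, which helps when legacy records are re-imported, for example `derived_identifier("https://archive.example.org/", "MS-1921-004")`.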
4. Checksum/fixity verification
Fixity checking creates and verifies digital fingerprints (checksums) of files to ensure they haven’t been altered or corrupted over time or in transit. The system regularly compares current checksums against original values to detect any changes or deterioration in the files. This ongoing verification process is essential for maintaining the authenticity and integrity of preserved digital objects.
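The compare-against-original step can be sketched as follows. SHA-256 is chosen here as an assumption; the manifest structure (a mapping from file path to expected checksum) and the status labels are likewise illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(manifest):
    """Compare current checksums with the stored values.

    `manifest` maps each file path to the checksum recorded at ingest.
    Returns a mapping from path to "ok", "changed", or "missing".
    """
    report = {}
    for path, expected in manifest.items():
        try:
            actual = sha256_of(path)
        except FileNotFoundError:
            report[path] = "missing"
            continue
        report[path] = "ok" if actual == expected else "changed"
    return report
```

Reading in chunks keeps memory use flat for large audiovisual files, and a scheduled job could run `verify_fixity` over the stored manifest and write any result other than "ok" to the audit trail.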
5. File format identification
This feature automatically identifies file formats within ingested content and validates them against format specifications. The validation process checks if files meet their format’s technical specifications and flags any inconsistencies or corruption. This capability is fundamental to digital preservation as it ensures files are what they claim to be and helps identify preservation risks early in the workflow.
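Identification is usually based on file signatures rather than file names. The sketch below checks a handful of well-known leading byte sequences and flags files whose extension disagrees with the result. It is deliberately simplified: it covers only four formats, the function names are hypothetical, and it performs identification only, not full validation against a format specification.

```python
import os

# Leading byte signatures ("magic numbers") for a few common formats.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

# Media type implied by each file extension.
EXTENSION_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
}


def identify_format(data):
    """Return the media type matching the leading bytes, or None if unknown."""
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type
    return None


def check_declared_format(filename, data):
    """Return (identified_type, consistent) for a file name and its first bytes."""
    identified = identify_format(data)
    extension = os.path.splitext(filename)[1].lower()
    declared = EXTENSION_TYPES.get(extension)
    consistent = identified is not None and identified == declared
    return identified, consistent
```

A file named `report.pdf` that begins with the JPEG signature would be reported as `("image/jpeg", False)`, which is the kind of inconsistency worth flagging early in the workflow.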
6. File format migration
File format migration automatically transcodes files from obsolete file formats to current, preservation-friendly versions when needed. The system monitors format obsolescence risks and initiates migrations based on preservation planning policies. The process retains original files, ensuring both long-term preservation and authentic representation. These migrations help to reduce the number of formats that need to be maintained over time while ensuring the content remains accessible to future users.
7. Digital object rendering
This feature provides interfaces for viewing and interacting with digital objects in the system, including images, documents, audio, and video files. It includes thumbnail generation, preview capabilities, and options for organizing and describing digital content.
8. Version control and audit trails
Version control tracks changes to digital objects and their metadata over time, maintaining a complete history of modifications. The system records who made changes, when they were made, and what was changed, creating an audit trail of all preservation actions.
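The who, when, and what of each change can be captured in an append-only history. The sketch below is a minimal in-memory illustration with hypothetical class and field names; a real system would persist the history and protect it from modification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One audit trail entry: who changed what, and when."""
    version: int
    user: str
    timestamp: datetime
    changes: dict  # field name -> (old value, new value)


class VersionedRecord:
    """Metadata record that logs every modification it receives."""

    def __init__(self, metadata):
        self._metadata = dict(metadata)
        self._history = []

    def update(self, user, new_values):
        """Apply new values and record the difference; no-ops are not logged."""
        diff = {
            key: (self._metadata.get(key), value)
            for key, value in new_values.items()
            if self._metadata.get(key) != value
        }
        if not diff:
            return None
        self._metadata.update(new_values)
        change = Change(
            version=len(self._history) + 1,
            user=user,
            timestamp=datetime.now(timezone.utc),
            changes=diff,
        )
        self._history.append(change)
        return change

    @property
    def metadata(self):
        return dict(self._metadata)

    @property
    def history(self):
        return tuple(self._history)
```

Storing the old value next to the new one means any earlier state of the record can be reconstructed by walking the history backwards.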
9. Multiple copies management
The system automates the creation and verification of backup copies while maintaining appropriate geographic distribution for disaster recovery. The system tracks multiple copies of content and ensures synchronization of their metadata to maintain authentic audit trails. Maintaining multiple copies of unique digital files is a critical feature of digital archives systems that manage content for future discovery and use.
10. Digital object storage configuration
Digital object storage configuration supports integration with various storage solutions while maintaining secure access to preserved content. It includes options for configuring storage locations, access methods, and backup strategies to support multiple copies management.
F. COLLECTIONS MANAGEMENT
1. Accession management
Accession management tracks new acquisitions from receipt through processing. The system maintains information about donor agreements, restrictions, and processing requirements, including location tracking. Besides facilitating efficient collections management, this feature provides critical provenance and chain-of-custody information.
2. Physical container management
Physical container management tracks real-world storage locations and containers holding archival materials. The system maintains information about storage units, locations, and space management.
3. Deaccession tracking
Deaccession tracking documents removal of materials from collections according to institutional policies. The system maintains records of deaccession decisions, authorities, and disposition of materials. This feature ensures accountability and documentation of collection management decisions.
4. Physical conservation assessment
Conservation assessment tools support evaluation of conservation needs and priorities for real-world physical content. The system provides frameworks for documenting condition issues and treatment recommendations.
5. Digital preservation assessment
Digital preservation assessment supports the evaluation of digital preservation needs and priorities. The system provides tools for documenting preservation risks and requirements.
6. Research and re-use value assessment
Research and re-use value assessment enables evaluation of the potential for the content to be used in research or repurposed for new content creation, whether for commercial, business, legal, or personal reasons. This feature helps prioritize processing and access decisions.
G. SECURITY MANAGEMENT
1. Role and group based security controls
Role-based security controls manage user access through defined roles with specific permissions and restrictions. The system allows administrators to create and manage roles and groups that determine what users can view, edit, and manage. The system provides tools for creating and managing public and private groups, assigning users to groups, and managing group permissions.
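The relationship between users, groups, roles, and permissions can be sketched as two lookup tables. The role names, group names, and actions below are hypothetical examples, not a recommended permission scheme.

```python
# Hypothetical roles and the actions each one permits.
ROLE_PERMISSIONS = {
    "public": {"view"},
    "archivist": {"view", "edit"},
    "administrator": {"view", "edit", "manage"},
}

# Hypothetical groups and the roles their members inherit.
GROUP_ROLES = {
    "reading-room": {"public"},
    "processing-team": {"archivist"},
}


def effective_permissions(roles=(), groups=()):
    """Union of permissions from direct roles and roles inherited via groups."""
    all_roles = set(roles)
    for group in groups:
        all_roles |= GROUP_ROLES.get(group, set())
    permissions = set()
    for role in all_roles:
        permissions |= ROLE_PERMISSIONS.get(role, set())
    return permissions


def is_allowed(action, roles=(), groups=()):
    """Check whether the given roles and groups permit an action."""
    return action in effective_permissions(roles, groups)
```

Unknown roles and groups contribute no permissions, so access is denied by default unless a rule explicitly grants it.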
2. User authentication
User authentication verifies user identities and manages login credentials for system access. The system supports various authentication methods and can integrate with commercial, institutional, and open standard authentication systems.
3. Audit logging
Audit logging tracks user actions and system events for security and accountability purposes. The system maintains detailed logs of who did what and when, supporting both security monitoring and system troubleshooting. This feature ensures transparency and protects digital object authenticity.
H. SYSTEM INTEROPERABILITY
1. APIs for custom integration
APIs provide programmatic access to system functions and data for integration with external systems. The system includes documented APIs supporting various operations and data exchange patterns. This feature enables custom development and integration with other systems that create, store, or access digital collections. This includes making collection data available for metadata harvesting, research data management, and search indexing, for example through the OAI-PMH protocol or tools such as Zotero.
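For metadata harvesting, an OAI-PMH client requests records with the `ListRecords` verb and a metadata prefix such as `oai_dc`. The sketch below only builds the request URL; the base URL is hypothetical and no network call is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict harvesting to one set
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint.
url = list_records_url("https://archive.example.org/oai", from_date="2024-01-01")
```

The optional `from` argument supports incremental harvesting, where an external index asks only for records changed since its last visit.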
2. Workflow integration
3. Plugin architecture
Plugin architecture enables extension of system functionality through a well-documented module interface.
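One way to picture such a module interface is a registry of named hooks that plugins attach functions to. The hook name, registry class, and example plugins below are hypothetical and are not taken from any of the systems analyzed.

```python
class PluginRegistry:
    """Holds plugin functions registered against named extension points."""

    def __init__(self):
        self._hooks = {}

    def register(self, hook_name):
        """Decorator that attaches a function to a hook."""
        def decorator(func):
            self._hooks.setdefault(hook_name, []).append(func)
            return func
        return decorator

    def run(self, hook_name, payload):
        """Pass the payload through each registered function, in order."""
        for func in self._hooks.get(hook_name, []):
            payload = func(payload)
        return payload


registry = PluginRegistry()


@registry.register("after_ingest")
def add_default_rights(record):
    """Example plugin: fill in a rights value when none was supplied."""
    return {**record, "rights": record.get("rights", "Undetermined")}


@registry.register("after_ingest")
def uppercase_identifier(record):
    """Example plugin: normalize identifiers to upper case."""
    return {**record, "identifier": record["identifier"].upper()}
```

The core system only needs to call `registry.run("after_ingest", record)` at the right point in the workflow, so new behavior can be added without modifying core code.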
4. Quality assurance
Quality assurance steps provide checkpoints and validation processes at key points in content processing workflows.
