Archivematica

From COPTR
Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system.
Homepage: https://www.archivematica.org
License: AGPL version 3
Appears in COW: Digital archiving workflow (high-level); Extraction from the Pure Research Information System and transformation for loading by Archivematica; Workflow for preserving research data using Archivematica, Fedora, Hydra and PURE

Description

Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system. It ingests them into archival storage, provides access to the archived material, and uploads access copies to an access system. The process is monitored and controlled through a Web-based dashboard that co-ordinates a suite of micro-services. Its primary preservation technique is normalisation with preservation of the original object, together with comprehensive PREMIS metadata recorded in a METS.xml file.

Provider

This project is managed by Artefactual Systems. It began in collaboration with the UNESCO Memory of the World's Subcommittee on Technology and the City of Vancouver Archives, and it continues to be actively developed with partners at the University of British Columbia Library, the Rockefeller Archive Center, Simon Fraser University Archives and Records Management, Bentley Historical Library and a number of other collaborators.

Platform and interoperability

Installation

Archivematica may be installed directly on a Linux system. The following operating systems are supported:

  • Ubuntu 14.04 64-bit Server Edition
  • Ubuntu 16.04 64-bit Server Edition (beta)
  • CentOS 7 64-bit

Other Linux distributions should work, but will require customization of these installation instructions.

Archivematica depends on a long list of other software, which is installed along with Archivematica itself.

To improve performance, some of the components can be installed on separate machines, such as:

  • MySQL
  • Gearman
  • Elasticsearch (optional as of Archivematica 1.7)

Storage

Types of storage systems used with Archivematica include local file-based systems; cloud-based storage such as Amazon S3, Microsoft Azure and OpenStack Swift; and specialized storage tools and services such as LOCKSS, DuraCloud and Arkivum. Artefactual, in partnership with DuraSpace, offers ArchivesDirect as an Archivematica and DuraCloud hosting option.

Users can ingest content manually via the transfer tab or can use automation tools to automatically ingest content from designated source locations such as a folder on a network drive or from systems such as DSpace, Islandora and Dataverse.

Access

Dissemination Information Packages can be uploaded to access systems such as AtoM or ArchivesSpace, where rich metadata enhancement can be undertaken for discoverability and access purposes.

Functional notes

Archivematica uses a micro-services approach: it acts as a wrapper for many task-specific applications such as the BagIt library, Clam Anti-Virus, DigiKam, FFmpeg, FITS (File Information Tool Set), ImageMagick, Inkscape, OpenOffice.org and 7-Zip.

In the typical workflow, the curator first assembles a transfer package in the filesystem. A script is provided for setting up the right folder structure, or for some workflows the structure can be assembled manually. Digital objects are added to one folder and contextual information (submission documentation such as transfer forms and donation agreements) to another. The package is then moved to an input folder 'watched' by the main Archivematica Web tool.

Through the Web interface, the curator can accept or reject the transfer. If the transfer is accepted, the tool performs an initial analysis (calculating checksums, assigning UUIDs, scanning for viruses, identifying formats, extracting metadata) and then offers to create a Submission Information Package (SIP); it is also possible to create one or more SIPs manually. Metadata (simple Dublin Core and PREMIS 2.2 rights/restrictions) can then be added to the SIP before it is ingested.

At ingest, the curator can choose among several routes: Preservation, where the digital objects are normalised to archival formats and transformed into an Archival Information Package (AIP); Access, where the digital objects are normalised to dissemination formats and transformed into a Dissemination Information Package (DIP); repackaging without normalisation; or a combination of these. Further functions are provided for moving AIPs into archival storage and uploading DIPs to AtoM or another access portal. Workflows and decision points are configurable via preconfiguration settings in the administration tab of the Web-based dashboard.
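The transfer assembly and the first two analysis steps (calculating checksums and assigning UUIDs) can be illustrated with a short Python sketch. This is not Archivematica's own code: the folder names `objects` and `metadata/submissionDocumentation`, the function names, and the choice of SHA-256 are all assumptions made for illustration, and the real tool also scans for viruses, identifies formats and extracts metadata.

```python
import hashlib
import uuid
from pathlib import Path


def build_transfer(root: Path) -> Path:
    """Create a hypothetical transfer layout and return the objects folder.

    Folder names are assumptions: one folder for digital objects,
    one for submission documentation.
    """
    objects = root / "objects"
    docs = root / "metadata" / "submissionDocumentation"
    objects.mkdir(parents=True, exist_ok=True)
    docs.mkdir(parents=True, exist_ok=True)
    return objects


def sha256_of(path: Path) -> str:
    """Compute a SHA-256 checksum, reading the file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def analyse(objects: Path) -> list[dict]:
    """Assign a random UUID and a checksum to every file under objects."""
    records = []
    for path in sorted(p for p in objects.rglob("*") if p.is_file()):
        records.append({
            "uuid": str(uuid.uuid4()),
            "path": path.relative_to(objects).as_posix(),
            "sha256": sha256_of(path),
        })
    return records
```

Calling `analyse(build_transfer(Path("my-transfer")))` after copying files into the objects folder returns one record per file. Keeping the checksum next to a stable identifier is what allows later fixity checks to be tied back to a specific object.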

Documentation, community, and support

The online documentation for Archivematica includes a User and an Administrative Manual.

The project wiki provides screencasts, requirements specifications (including use cases, activity diagrams, recognised significant properties of various media and media preservation plans) and a description of the technical architecture.

Community support is available through the Archivematica Discussion Group. Several region-based user groups exist.

Archivematica Camps are intended to provide a space for anyone interested in or currently using Archivematica to come together, learn about the platform and share their experiences.

Sample data for testing Archivematica is available online and is also included when Archivematica is installed.

Artefactual Systems, Inc., the primary developer of Archivematica, also offers support options.

Digital POWRR offers a How-to Tech Tutorial (PDF).

Archivist Ethan Gates offers a presentation entitled A Place Where You Process: An Introduction to Archivematica Workflows (slides, PDF).

Usability

The majority of operations are accomplished through a Web-based graphical user interface.

Reports on the ease of installation and the robustness of the system are mixed but improving. Example experiences of installing Archivematica (note that current version is 1.7):

Expertise required

The system is easy to use, though, as it draws heavily on the OAIS Reference Model, some familiarity with that model is needed to understand the workflows Archivematica supports. Installing it directly on a Linux desktop or server, even inside a virtual machine, requires some technical expertise (e.g. for setting up ports correctly).

Influence and take-up

Archivematica is used by at least 30 organisations.

Standards compliance

The functionality of Archivematica is clearly based on that defined by the OAIS Reference Model. The Archival Information Packages generated by the system use the BagIt packaging format, in conjunction with a METS packaging manifest incorporating PREMIS metadata.
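As a rough illustration of the BagIt layout mentioned above, the following Python sketch writes and verifies a minimal bag using only the standard library. It is not how Archivematica builds its AIPs: the function names are hypothetical, the BagIt version string and the SHA-256 manifest are assumptions, and the METS file with its PREMIS metadata is not generated here.

```python
import hashlib
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> None:
    """Write payload files under data/ plus the bag declaration and manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        checksum = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{checksum}  data/{name}")
    # Bag declaration; the version number here is an assumption.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "".join(line + "\n" for line in manifest_lines), encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Return True if every payload file listed in the manifest still matches."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        target = bag_dir / rel_path
        if not target.is_file():
            return False
        if hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            return False
    return True
```

The manifest is what makes a bag self-verifying: any later change to a payload file causes `verify_bag` to return `False`, which is the property that makes the format useful for archival storage.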

Development Activity

All development activity is visible on GitHub: http://github.com/artefactual/archivematica/commits

Documentation and information on the latest Archivematica release is available here: https://www.archivematica.org/en/docs/latest