Workflow:Workflow for preserving research data using Archivematica, Fedora, Hydra and PURE

	Workflow for preserving research data using Archivematica, Fedora, Hydra and PURE
Status:	Experimental
Tools:	Archivematica; Fedora Commons; Samvera; PURE
Input:	Research Data
Output:	Research Data stored in Archivematica and disseminated via Fedora
Organisation:	University of York

Workflow description

This workflow uses Archivematica, Fedora, Hydra and PURE to preserve and provide access to academic research data. The workflow includes a high level of automation.

PURE is the Current Research Information System at the University of York. This is where researchers enter metadata about the dataset they are depositing.
Once metadata is entered into PURE, library staff contact the researcher to request that the data be uploaded.
Upload is carried out via an online form. This upload form is part of the Research Data York application (a bespoke Hydra-based application).
Uploaded data goes into a directory that is watched by Archivematica, where it is arranged into a SIP structure.
Archivematica picks up the SIP and processes it. The creation of an AIP is fully automated.
The AIP is stored. A DIP is not created by default.
Metadata about the dataset is available in the data catalogue.
If the dataset is requested, there is a manual approval step (within the Research Data York application), and then Archivematica automatically creates a DIP.
This DIP is passed to Fedora.
The user is notified when the data is ready.
The data is available for download.
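
The sequence of steps above can be sketched as a simple ordered list of stages. This is only an illustrative model, not code from the Research Data York application: the names Stage, ORDER, next_stage and has_dip are hypothetical, and nothing here calls Archivematica, Fedora or PURE. It is meant to show that a dataset rests at the stored-AIP stage, with no DIP, until someone requests it.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Hypothetical stage names mirroring the steps listed above."""
    METADATA_ENTERED = "metadata entered in PURE"
    UPLOAD_REQUESTED = "library staff request upload"
    DATA_UPLOADED = "data uploaded via the online form"
    SIP_ARRANGED = "data arranged into a SIP in the watched directory"
    AIP_STORED = "AIP created and stored by Archivematica"
    ACCESS_REQUESTED = "dataset requested"
    ACCESS_APPROVED = "request approved (manual step)"
    DIP_CREATED = "DIP created by Archivematica"
    DIP_IN_FEDORA = "DIP passed to Fedora"
    USER_NOTIFIED = "user notified, data available for download"


# Enum members iterate in definition order, which is the workflow order.
ORDER = list(Stage)


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None at the end."""
    position = ORDER.index(current)
    if position + 1 < len(ORDER):
        return ORDER[position + 1]
    return None


def has_dip(stage: Stage) -> bool:
    """A DIP only exists once the workflow has reached DIP_CREATED."""
    return ORDER.index(stage) >= ORDER.index(Stage.DIP_CREATED)


if __name__ == "__main__":
    stage: Optional[Stage] = Stage.METADATA_ENTERED
    while stage is not None:
        print(f"{stage.name}: {stage.value} (DIP exists: {has_dip(stage)})")
        stage = next_stage(stage)
```

Running the script prints each stage in order, which makes it easy to see that the DIP appears only after the approval step.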

The workflow was created as part of the Filling the Digital Preservation Gap project and is heavily based on an implementation plan included in the Phase 2 project report, which can be found on Figshare. A description of how the workflow was implemented as a proof of concept is included in the Phase 3 project report, which is also on Figshare.

Purpose, context and content

The purpose of this workflow is to preserve and disseminate research data in an automated fashion. Research data is a valuable asset produced by academic institutions and should be retained so that findings can be validated. Some of this data may have longer-term re-use potential, particularly where it cannot be replicated. At the University of York, our Research Data Management policy states that research data should be retained for ten years from the date of last access. Because each access restarts the ten-year period, even datasets that are only occasionally accessed may be retained for much longer than ten years.
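
The retention rule can be illustrated with a small calculation. This is a minimal sketch, assuming the ten-year period is counted in calendar years from the most recent access; the function name retention_end and the handling of 29 February are assumptions made for illustration and are not taken from the policy itself.

```python
from datetime import date

RETENTION_YEARS = 10  # "ten years from the date of last access"


def retention_end(last_access: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which the retention period would end."""
    try:
        return last_access.replace(year=last_access.year + years)
    except ValueError:
        # last_access is 29 February and the target year is not a leap year.
        # Falling back to 28 February is an assumption of this sketch.
        return last_access.replace(year=last_access.year + years, day=28)


if __name__ == "__main__":
    # A dataset deposited in 2017 and accessed again in 2020 and 2025:
    for accessed in (date(2017, 3, 3), date(2020, 6, 1), date(2025, 6, 1)):
        print(accessed, "->", retention_end(accessed))
```

In this example the dataset deposited in 2017 would be retained until 2035, eighteen years after deposit, because each access moves the end date forward.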

Evaluation/Review

This workflow has been created as a proof of concept at the University of York. It is due to move into production in May 2017.

Jenny mitcham (talk) 16:29, 3 March 2017 (UTC)
