Libsafe

From COPTR
Revision as of 20:27, 24 November 2014 by Aguillermo (talk | contribs)


libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited to master image files from digitization processes.
Homepage: http://www.digitalpreservationsoftware.com


Description

libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited to master image files from digitization processes. It is used by large institutions such as the National Library of Spain and the Royal Spanish Language Academy, but also by smaller organizations such as universities and hospitals.

It makes high-volume data archives easy to manage; existing deployments reach 2 PB of capacity.

More information and screenshots are available at http://www.preservaciondigital.es (in Spanish) and http://www.digitalpreservationsoftware.com (in English).

Key features and benefits:

  • Easy to use web interface
  • Integrated catalog
  • Passive preservation (content dissemination, periodic storage hash auditing, etc.)
  • Active preservation that allows evolving file formats but also metadata schemas
  • Day one full export
  • Object versioning
  • Powerful metadata engine (metadata versioning, unique metadata and mandatory metadata) that works over any schema. It ships with DC, MARC21, EAD and ISAD(G), allows creating new schemas and migrating from one to another, and can sync with other tools. Any metadata schema can be defined and used.
  • REST API for ingestion, catalog search and object retrieval.
  • Full WORM: the software never deletes anything, and users cannot delete an object once it has been ingested.
  • Works with any storage that can be mounted with CIFS.
  • Plug-in architecture: various functions can be extended easily with plug-ins.


Detailed description

The OAIS submission agreement concept maps to a "Preservation area" in libsafe, which defines the whole preservation policy to apply:

  • Sanitizing policy (deleting temporary files, etc.)
  • Preprocessing: generating new formats for files (doc > pdf, ARC > WARC, etc.) or preprocessing metadata.
  • Checking: fixity checking (antivirus, file and folder names, JHOVE validation, file size, etc.). Each check can be enabled, disabled, or applied selectively by letting the system decide with a regular expression.
  • Exploring: how metadata should be extracted.
  • Dissemination: how many copies should be stored and in which "storage pools". libsafe does not alter the folder/file structure or the filenames.
  • Auditing policy: how many days should elapse between background audits of the object.
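
As an illustration only, a Preservation area can be pictured as a small policy record. The sketch below is not libsafe's actual data model: the class names, field names and default values (PreservationArea, CheckRule, audit_interval_days and so on) are hypothetical, chosen to mirror the policy items listed above, in particular the three ways a check can be configured.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CheckMode(Enum):
    ENABLED = "enabled"    # always run the check
    DISABLED = "disabled"  # never run the check
    REGEX = "regex"        # run only when the file name matches a pattern


@dataclass
class CheckRule:
    """One check in the policy (hypothetical structure)."""
    name: str
    mode: CheckMode
    pattern: str = ""      # only used when mode is REGEX

    def applies_to(self, filename: str) -> bool:
        if self.mode is CheckMode.ENABLED:
            return True
        if self.mode is CheckMode.DISABLED:
            return False
        return re.search(self.pattern, filename, re.IGNORECASE) is not None


@dataclass
class PreservationArea:
    """Policy record mirroring the items listed above (hypothetical)."""
    name: str
    checks: List[CheckRule] = field(default_factory=list)
    copies: int = 2                       # assumed default
    storage_pools: List[str] = field(default_factory=list)
    audit_interval_days: int = 90         # assumed default

    def checks_for(self, filename: str) -> List[str]:
        """Names of the checks that would run against this file."""
        return [rule.name for rule in self.checks if rule.applies_to(filename)]


area = PreservationArea(
    name="master-images",
    checks=[
        CheckRule("antivirus", CheckMode.ENABLED),
        CheckRule("jhove-validation", CheckMode.REGEX, r"\.tiff?$"),
        CheckRule("file-size", CheckMode.DISABLED),
    ],
    storage_pools=["pool-a", "pool-b"],
)
print(area.checks_for("page_0001.TIF"))  # ['antivirus', 'jhove-validation']
```

Keeping the whole policy in one record is only a convenient way to show that every ingestion job in an area follows the same rules; how libsafe stores these settings internally is not described here.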

The whole system is "job" based. Every process is a job (ingestion job, auditing job, migration job, etc).

It implements the full OAIS framework, including:

Ingestion

  • DirectSIP: create a SIP in the browser from a set of folders (one folder > one object), or ingest an existing SIP.
  • Preprocessing: launches the selected plug-ins over the content. A preprocessor is a plug-in (essentially a .exe file) that makes changes to an object. The most common preprocessors, such as ImageMagick, FFmpeg, Ghostscript, Inkscape and others, can be preloaded, and it is easy to extend the functionality.
  • Checking: a check verifies something about the object, its files or its metadata. A check is a plug-in (essentially a .exe file) that is launched against any element (the object, any file, etc.) that matches a regex or extension (for example, *.TIF), verifies something about it, and returns "OK" or "ERROR". libsafe has about 40 checks (JHOVE validation, file size, number of files, etc.), and adding more is very easy. If a check fails, the ingestion job stops and the system returns control to the user to correct the problems and relaunch the job.
  • Exploring: extracts metadata (from XML, CSV, a text file or by querying another system), generates metadata (using DROID, for instance) and loads it into the catalog. libsafe can load METS files, and reading metadata from a custom XML file is easy using an XSLT file. Conversion or generation plug-ins can also be used here (for instance, querying an OAI-PMH server using the folder name of the object).
  • Dissemination: Storage Groups contain Storage Volumes (mount points). Based on the Preservation Area storage policy, libsafe selects the Storage Volumes in which to store the object and starts copying the files. libsafe stores the path and the hash of each file. At this stage, security XML files are also created, containing all the metadata, events, other locations of each file, etc.
  • Auditing: immediately after the object copy finishes, libsafe scans all the object files again, rehashing them and comparing the hash stored in the database, the hash of the ingestion job's temporary files, the hash in the security XML and the hash of the real files on every copy. If the audit job finishes with an OK result, all the objects are considered preserved.
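
The auditing step can be sketched roughly as follows. This is a simplified illustration, not libsafe's implementation: it compares only the hashes recorded at ingestion against the files found on each copy, whereas the description above also involves the temporary-file and security XML hashes. The function names, the choice of SHA-256 and the dictionary layout are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_hash(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in chunks so large master files do not fill memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_object(stored_hashes: Dict[str, str], copies: List[Path]) -> List[str]:
    """Compare every copy of an object against the recorded hashes.

    stored_hashes maps each file's relative path to the hash recorded
    at ingestion; copies lists the root folder of each stored copy.
    Returns a list of problems; an empty list means the audit is OK.
    """
    problems = []
    for copy_root in copies:
        for relative_path, expected in stored_hashes.items():
            candidate = copy_root / relative_path
            if not candidate.is_file():
                problems.append(f"missing: {candidate}")
            elif file_hash(candidate) != expected:
                problems.append(f"hash mismatch: {candidate}")
    return problems
```

A caller would treat an empty result as the "OK" outcome and anything else as a failed audit to be reported to the user.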

Data Management

  • Bulk transforms: allow massive metadata schema changes (DC > MARC21, for instance); it is possible to evolve to any schema by defining a crosswalk.
  • API for metadata versioning: object metadata can be edited through an API. Previous versions of the metadata sets are always preserved, so this is versioning rather than true editing.
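
The "versioning rather than editing" idea can be illustrated with an append-only history, where each edit adds a new metadata set and older sets stay readable. This is a minimal sketch under assumptions: the MetadataHistory class and its method names are hypothetical and say nothing about how libsafe's API or storage actually works.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataHistory:
    """Append-only list of metadata sets for one object (hypothetical)."""
    versions: List[Dict[str, str]] = field(default_factory=list)

    def edit(self, changes: Dict[str, str]) -> int:
        """Apply changes as a new version and return its version number."""
        snapshot = dict(self.versions[-1]) if self.versions else {}
        snapshot.update(changes)
        self.versions.append(snapshot)  # earlier versions are never touched
        return len(self.versions)

    def current(self) -> Dict[str, str]:
        """Latest metadata set, or an empty set if nothing was recorded."""
        return dict(self.versions[-1]) if self.versions else {}

    def version(self, number: int) -> Dict[str, str]:
        """Metadata set as it was at the given 1-based version number."""
        return dict(self.versions[number - 1])


history = MetadataHistory()
history.edit({"title": "Untitled scan", "creator": "Unknown"})
history.edit({"title": "Parish register, vol. 3"})
print(history.version(1)["title"])  # Untitled scan
print(history.current()["title"])   # Parish register, vol. 3
```

Because nothing is overwritten, this model fits the WORM principle stated in the feature list: a correction produces a new version instead of replacing the old one.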

Access

  • Catalog: it is possible to search on any metadata of an object, locate the object and view its details, including:
    • Object creation details (when, who, etc)
    • Metadata details
    • Previous versions of the object
    • Object file and folder hierarchy, with the option to download individual files or the full object as an AIP.
    • Active preservation
      • Risks
      • File formats used and risks (using PRONOM among others)
      • Metadata formats
    • Passive preservation
    • Auditing results
    • Storage details (where the object is stored)
    • Object events (creation, access, transformation, etc)

It is also possible to query the system using the API and to retrieve files or complete objects with a RESTful API call.

Administration

  • Reports: various reports, including collection-wide reports.
  • Users: create users, set roles, etc.
  • Settings: change system settings, submission agreements, etc.


Active preservation

libsafe includes options to evolve not only file formats but also metadata schemas.


User Experiences

Please go to http://www.preservaciondigital.es (in Spanish) or http://www.digitalpreservationsoftware.com (in English) for customer stories.

Development Activity

Please go to http://www.preservaciondigital.es/acerca-de-libnova/blog-de-preservacion-digital/ (in Spanish) or http://www.digitalpreservationsoftware.com/about-libnova/digital-preservation-blog/ (in English).