Difference between revisions of "Libsafe"
Revision as of 17:02, 2 July 2020
LIBSAFE Advanced allows organizations to create a fully OAIS-aligned archive, including active and passive digital preservation workflows, and is particularly suited to master image files produced by digitization processes. It is used by large institutions such as the British Library, the National Library of Spain and the United States Holocaust Memorial Museum, and also by smaller organizations such as universities and hospitals.
It makes it easy to manage high-volume data archives, with existing deployments of 6 PB in size.
More information: https://www.libnova.com
Key features and benefits:
- Easy to use web interface
- Integrated catalog
- Full preservation lifecycle management
- Object versioning
- Powerful metadata engine (metadata versioning, unique metadata and mandatory metadata) over any schema. Full METS and PREMIS alignment. Allows creating new schemas and migrating from one to another. Allows syncing to other tools. Fully configurable metadata schemas.
- REST API for ingestion, catalog search and object retrieval.
- Works with any SMB, NFS or object-based storage.
- Microservices: It's possible to expand various functions easily with plug-ins.
- Linear scalability: It is possible to scale the platform to achieve any needed performance.
The OAIS submission agreement concept maps to a "Preservation area" in libsafe, which defines the full preservation policy to apply:
- Sanitizing policy (deleting temp files, etc.)
- Preprocessing: generating new formats for files (doc > pdf, ARC > WARC, etc.) or pre-processing metadata.
- Checking: fixity checking (antivirus, file and folder names, JHOVE validation, file size, etc.). Each check can be enabled, disabled, or left to the system to apply based on a regular expression.
- Exploring: how metadata should be extracted.
- Dissemination: how many copies should be stored and in which "storage pools". libsafe does not destroy the folder/file structure or filenames.
- Auditing policy: how often the object should be audited by the system in the background.
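As a rough illustration of the policy described above, a Preservation area could be modelled as a small data structure in which each check is enabled, disabled, or selected by a regular expression. This is only a sketch: the class names, field names and default values below are assumptions made for the example and do not reflect libsafe's actual configuration format.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CheckRule:
    """One check in the policy (hypothetical model, not libsafe's schema)."""
    name: str
    mode: str = "enabled"          # "enabled", "disabled" or "regex"
    pattern: Optional[str] = None  # only used when mode == "regex"

    def applies_to(self, filename: str) -> bool:
        if self.mode == "enabled":
            return True
        if self.mode == "disabled":
            return False
        if self.mode == "regex":
            return bool(self.pattern) and re.search(self.pattern, filename) is not None
        raise ValueError(f"unknown mode: {self.mode}")


@dataclass
class PreservationArea:
    """Illustrative container for the policy items listed above."""
    name: str
    sanitize_patterns: List[str] = field(default_factory=list)
    checks: List[CheckRule] = field(default_factory=list)
    copies: int = 2                                    # assumed default
    storage_pools: List[str] = field(default_factory=list)
    audit_interval_days: int = 90                      # assumed default

    def checks_for(self, filename: str) -> List[str]:
        """Names of the checks that would run against this file."""
        return [c.name for c in self.checks if c.applies_to(filename)]


area = PreservationArea(
    name="Master images",
    sanitize_patterns=["Thumbs.db", "*.tmp"],
    checks=[
        CheckRule("antivirus"),
        CheckRule("jhove_validation", mode="regex", pattern=r"(?i)\.tiff?$"),
        CheckRule("file_size", mode="disabled"),
    ],
    copies=3,
    storage_pools=["pool-a", "pool-b"],
)

print(area.checks_for("page_0001.TIF"))
print(area.checks_for("notes.txt"))
```

In this sketch the antivirus check runs on every file, the JHOVE validation runs only on files whose names end in .tif or .tiff, and the file size check never runs.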
The whole system is "job" based. Every process is a job (ingestion job, auditing job, migration job, etc).
It implements the full OAIS framework, including:
- DirectSIP: create a SIP in the browser from a set of folders (one folder > one object), or ingest a SIP.
- Preprocessing: launches the selected plugins over the content. A preprocessor is a plugin (essentially a .exe file) that makes changes to an object. The most common preprocessors can be preloaded, such as ImageMagick, FFmpeg, Ghostscript, Inkscape and others, but it is easy to extend the functionality.
- Checking: a check verifies something about the object, its files or its metadata. A check is a plugin (essentially a .exe file) that is launched against any element (the object, any file, etc.) that matches a regex or extension (for example, *.TIF) and verifies something about it, returning "OK" or "ERROR". libsafe has about 40 checks (JHOVE validation, file size, number of files, etc.), but adding more is very easy. If a check fails, the ingestion job stops and the system returns control to the user to correct the problems and relaunch the job.
- Exploring: extracts metadata (from XML, CSV, a text file or by querying another system), generates metadata (using DROID, for instance) and loads it into the catalog. libsafe can load METS files, and reading metadata from a custom XML file is easy using an XSLT file. It is also possible to use conversion or generation plug-ins here (for instance, querying an OAI-PMH server using the folder name of the object).
- Dissemination: Storage Groups contain Storage Volumes (mountpoints). Based on the Preservation Area storage policy, libsafe selects the Storage Volumes in which to store the object and starts copying the files. libsafe stores the path and the hash of each file. At this stage, security XML files are also created, containing all the metadata, events, other locations of each file, etc.
- Auditing: immediately after the object copy finishes, libsafe scans all the object's files again, rehashing them and comparing the hash stored in the database, the hash of the ingestion job's temp files, the hash in the security XML and the hash of the real files in every copy. If the audit job finishes with an OK result, all objects are considered preserved.
- Bulk transforms: allows bulk metadata schema changes (DC > MARC21, for instance); it is possible to evolve to any schema by defining a crosswalk.
- API for metadata versioning: it is possible to edit an object's metadata using an API. Previous versions of the metadata sets are always preserved, so this is versioning rather than editing in place.
- Catalog: it is possible to search on any metadata of the object, locate it and view its details, including:
  - Object creation details (when, who, etc.)
  - Metadata details
  - Previous versions of the object
  - Object file and folder hierarchy, with the ability to download individual files or the full object as an AIP.
  - Active preservation
  - File formats used and risks (using PRONOM, among others)
  - Metadata formats
  - Passive preservation
  - Auditing results
  - Storage details (where this object is stored)
  - Object events (creation, access, transformation, etc.)
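The auditing step in the list above comes down to rehashing every stored copy and comparing the result with the recorded hash. The following is a minimal sketch of that idea only. The hash algorithm (SHA-256), the function names and the in-memory `stored_hashes` mapping are assumptions for illustration, and the comparison against the ingestion temp files and the security XML is left out.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large master files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_object(stored_hashes: Dict[str, str],
                 copy_roots: Iterable[Path]) -> List[str]:
    """Compare every copy of every file against its recorded hash.

    stored_hashes maps a path relative to the object root to the hash
    recorded at ingestion (hypothetical stand-in for the database).
    Returns a list of problems; an empty list means the audit is OK.
    """
    problems: List[str] = []
    for root in copy_roots:
        for rel_path, expected in stored_hashes.items():
            candidate = Path(root) / rel_path
            if not candidate.is_file():
                problems.append(f"missing: {candidate}")
            elif sha256_of(candidate) != expected:
                problems.append(f"hash mismatch: {candidate}")
    return problems
```

A caller would pass one root path per storage copy and treat an empty result as an OK audit; any entry in the returned list identifies the copy and file that needs attention.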
It is also possible to query the system using the API and to retrieve files or complete objects with a RESTful API call.
- Reports: various reports, including collection-wide ones.
- Users: create users, set roles, etc.
- Settings: change system settings, submission agreements, etc.
libsafe includes options to evolve not only file formats but also metadata schemas.
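Evolving a metadata schema through a crosswalk, while keeping every earlier version of the metadata, can be pictured with the sketch below. It is not libsafe's API: the `VersionedMetadata` class, the `apply_crosswalk` helper and the three-field DC to MARC21 mapping are illustrative assumptions, and a real crosswalk would cover many more fields.

```python
from typing import Dict, List, Tuple

Record = Dict[str, List[str]]

# Illustrative subset of a Dublin Core to MARC21 mapping (assumed for the example).
DC_TO_MARC21: Dict[str, str] = {
    "title": "245$a",
    "creator": "720$a",
    "subject": "653$a",
}


def apply_crosswalk(record: Record, crosswalk: Dict[str, str]) -> Record:
    """Return a new record with fields renamed according to the crosswalk."""
    converted: Record = {}
    for source_field, values in record.items():
        target = crosswalk.get(source_field)
        if target is None:
            continue  # unmapped fields are simply skipped in this sketch
        converted.setdefault(target, []).extend(values)
    return converted


class VersionedMetadata:
    """Append-only history: a migration adds a version, nothing is overwritten."""

    def __init__(self, schema: str, record: Record) -> None:
        self.versions: List[Tuple[str, Record]] = [(schema, record)]

    @property
    def current(self) -> Tuple[str, Record]:
        return self.versions[-1]

    def migrate(self, new_schema: str, crosswalk: Dict[str, str]) -> None:
        _, record = self.current
        self.versions.append((new_schema, apply_crosswalk(record, crosswalk)))


metadata = VersionedMetadata("DC", {"title": ["Map of Madrid"], "creator": ["Unknown"]})
metadata.migrate("MARC21", DC_TO_MARC21)
print(metadata.current)
print(len(metadata.versions))
```

After the migration the current version uses the MARC21 field names, and the original DC record is still available as the first entry in `versions`.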