https://coptr.digipres.org/api.php?action=feedcontributions&user=Chlara&feedformat=atomCOPTR - User contributions [en-gb]2024-03-28T22:00:58ZUser contributionsMediaWiki 1.35.14https://coptr.digipres.org/index.php?title=KOST-Val&diff=3633KOST-Val2020-05-12T12:44:22Z<p>Chlara: eCH link</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and for Submission Information Packages (SIPs).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
<br />
= Description =<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured.<br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or [[PdfaPilot|pdfaPilot]] by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val ships only a limited version of the 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads a SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first.<br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible, validation continues after an error has been detected, so that a single run reports as many problems as possible and fewer correction cycles are needed.<br />
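<br />
The sequential, continue-after-error behaviour described above can be sketched as follows. This is a minimal illustration in Python and not KOST-Val's actual (Java) implementation; the function <code>run_steps</code>, the two example checks, and the log format are all hypothetical.<br />

```python
from typing import Callable, List, Tuple

# A validation step takes a file path and returns a list of error messages
# (an empty list means the step passed). All names here are hypothetical.
Step = Tuple[str, Callable[[str], List[str]]]


def run_steps(path: str, steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run every step in order and collect a log; do not stop at the first error."""
    log: List[str] = []
    valid = True
    for name, check in steps:
        try:
            errors = check(path)
        except Exception as exc:
            # A step that crashes is logged as an error; later steps still run.
            errors = [f"step aborted: {exc}"]
        if errors:
            valid = False
            log.extend(f"{name}: ERROR: {msg}" for msg in errors)
        else:
            log.append(f"{name}: OK")
    return valid, log


# Two toy checks, for illustration only.
def check_extension(path: str) -> List[str]:
    return [] if path.lower().endswith((".tif", ".tiff")) else ["unexpected file extension"]


def check_not_empty_name(path: str) -> List[str]:
    return [] if path.strip() else ["empty file name"]
```

Calling <code>run_steps("scan.jpg", [("A", check_extension), ("B", check_not_empty_name)])</code> reports the failure of step A and still runs step B, so the log covers every step in one pass.<br />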
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components of other manufacturers by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components.<br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* For the PDF/A validation module, [[PdfaPilot|pdfaPilot]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] is used.<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* For file format identification, [[DROID_(Digital_Record_Object_Identification)|DROID]] is used. For performance and granularity reasons, a custom signature file is used instead of the official one from the PRONOM registry.<br />
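<br />
Signature-based identification, as performed by DROID, looks at a file's content rather than its extension. The following Python sketch illustrates the idea with a handful of well-known leading byte sequences. It is a simplified assumption for illustration only: the <code>SIGNATURES</code> table and the <code>identify</code> functions are hypothetical and reflect neither DROID's signature file format nor the custom signature file used by KOST-Val.<br />

```python
from typing import Optional

# Well-known leading byte sequences ("magic numbers") for the formats named above.
# Illustrative only; a real DROID signature file is far more detailed.
SIGNATURES = [
    ("JP2", b"\x00\x00\x00\x0cjP  \r\n\x87\n"),  # JPEG 2000 signature box
    ("JPEG", b"\xff\xd8\xff"),
    ("TIFF", b"II*\x00"),    # little-endian TIFF
    ("TIFF", b"MM\x00*"),    # big-endian TIFF
    ("PDF", b"%PDF-"),
    ("ZIP", b"PK\x03\x04"),  # ZIP container (SIARD archives are ZIP-based)
]


def identify(header: bytes) -> Optional[str]:
    """Return a format label if the header starts with a known signature, else None."""
    for label, magic in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def identify_file(path: str) -> Optional[str]:
    """Read the first bytes of a file and identify it by content, ignoring the extension."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))
```

A file named <code>photo.jpg</code> whose first bytes are <code>%PDF-</code> would be labelled PDF by this sketch, which is the kind of mismatch between extension and content that identification is meant to catch.<br />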
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2 and JPEG files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2017 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions, see the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases.<br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification, https://www.ech.ch/ech/ech-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. Their aim is to unify, and thus facilitate, electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The XML output (which renders as a table when opened in a browser) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises both fake JPEGs (a .jpeg extension but no JPEG content) and JPEGs with defects, and distinguishes clearly between the two cases.<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The 3 most recent releases:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The 5 most recent commits:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=3062KOST-Val2017-11-28T08:55:28Z<p>Chlara: Update Third-party applications</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and for Submission Information Packages (SIPs).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
<br />
= Description =<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured.<br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or [[PdfaPilot|pdfaPilot]] by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val ships only a limited version of the 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads a SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first.<br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible, validation continues after an error has been detected, so that a single run reports as many problems as possible and fewer correction cycles are needed.<br />
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components of other manufacturers by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components.<br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* For the PDF/A validation module, [[PdfaPilot|pdfaPilot]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] is used.<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* For file format identification, [[DROID_(Digital_Record_Object_Identification)|DROID]] is used. For performance and granularity reasons, a custom signature file is used instead of the official one from the PRONOM registry.<br />
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2 and JPEG files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2017 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions, see the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases.<br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification, http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. Their aim is to unify, and thus facilitate, electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The XML output (which renders as a table when opened in a browser) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises both fake JPEGs (a .jpeg extension but no JPEG content) and JPEGs with defects, and distinguishes clearly between the two cases.<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The 3 most recent releases:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The 5 most recent commits:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=DROID_(Digital_Record_Object_Identification)&diff=3053DROID (Digital Record Object Identification)2017-10-30T10:37:46Z<p>Chlara: Development Activity</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=DROID (Digital Record Object Identification) is a software tool developed to perform automated batch identification of file formats.<br />
|image=<br />
|homepage=http://digital-preservation.github.io/droid/<br />
|license=BSD License<br />
|platforms=Java 6 Standard Edition<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:File Format Identification]]<br />
[[Category:Metadata Extraction]]<br />
<br />
<br />
= Description =<br />
DROID (Digital Record Object Identification) is a software tool developed to perform automated batch identification of file formats. DROID is designed to meet the fundamental requirement of any digital repository to be able to identify the precise format of all stored digital objects, and to link that identification to a central registry of technical information about that format and its dependencies. <br />
<br />
DROID uses the PRONOM [http://www.nationalarchives.gov.uk/aboutapps/pronom/droid-signature-files.htm signature files] to perform format identification. Like PRONOM, it was [http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/ developed by the National Archives of the UK]. DROID is written in Java and XML.<br />
<br />
<br />
=== PRONOM ===<br />
The format information held in PRONOM is what powers [[DROID (Digital Record Object Identification)]]. Both are maintained by the [http://www.nationalarchives.gov.uk/ UK's National Archives].<br />
<br />
DROID downloads the latest [http://www.nationalarchives.gov.uk/aboutapps/pronom/droid-signature-files.htm signature files] from PRONOM, and those are used to drive the identification process. See the [http://www.nationalarchives.gov.uk/aboutapps/pronom/release-notes.xml PRONOM release notes].<br />
<br />
A number of other tools and registries have been based around the PRONOM data. These include:<br />
<br />
* [[Nanite]] and [[Fido]] identification tools<br />
* [[Siegfried]] identification tool<br />
<br />
Although the information and website are made freely available under the Open Government License, the underlying software engine that powers PRONOM is proprietary.<br />
<br />
<br />
===== The PRONOM Web API =====<br />
The website is oriented towards manual browsing, but note that each PRONOM registry entry has a permalink, like this:<br />
<br />
http://apps.nationalarchives.gov.uk/pronom/fmt/579<br />
<br />
Furthermore, by appending '.xml' to the URL of any entry, the data can be retrieved as XML:<br />
<br />
http://apps.nationalarchives.gov.uk/pronom/fmt/579.xml<br />
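<br />
The URL pattern above can be scripted. The following Python sketch builds the permalink and its XML variant for a given identifier and, optionally, downloads the record. The helper names <code>pronom_url</code> and <code>fetch_pronom_xml</code> are hypothetical, and whether the service still answers at this address is not verified here.<br />

```python
import urllib.request

# Base address taken from the permalink example above.
PRONOM_BASE = "http://apps.nationalarchives.gov.uk/pronom"


def pronom_url(puid: str, as_xml: bool = False) -> str:
    """Build the permalink for a PRONOM identifier such as 'fmt/579'."""
    url = f"{PRONOM_BASE}/{puid.strip('/')}"
    return url + ".xml" if as_xml else url


def fetch_pronom_xml(puid: str, timeout: float = 10.0) -> bytes:
    """Download the XML record for one entry (requires network access)."""
    with urllib.request.urlopen(pronom_url(puid, as_xml=True), timeout=timeout) as resp:
        return resp.read()
```

For example, <code>pronom_url("fmt/579", as_xml=True)</code> yields the second URL shown above.<br />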
= User Experiences =<br />
* [http://www.jisc.ac.uk/media/documents/programmes/preservation/daat_file_format_tools_report.pdf Digital Asset Assessment Tool - Assessment of file format testing tools].<br />
* Comparing how [[Apache Tika]] and DROID perform HTML identification: [http://britishlibrary.typepad.co.uk/webarchive/2014/07/how-much-of-the-uk-html-is-valid.html How much of the UK's HTML is valid?]<br />
* [http://openplanetsfoundation.org/blogs/2014-06-03-analysis-engine-droid-csv-export An Analysis Engine for the DROID CSV Export]<br />
* '''KOST-CECO:''' Used in [[KOST-Val]] for file format identification. For performance and granularity reasons, a custom signature file is used instead of the official one from the PRONOM registry.<br />
* '''FITS (File Information Tool Set):''' Used in [[FITS (File Information Tool Set)|FITS]]<br />
<br />
<br />
= Development Activity =<br />
All development activity is visible on GitHub: http://github.com/digital-preservation/droid/commits<br />
<br />
<br />
=== Release Feed ===<br />
The 3 most recent releases:<br />
<rss max=3>https://github.com/digital-preservation/droid/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The 5 most recent commits:<br />
<rss max=5>https://github.com/digital-preservation/droid/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=droid<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=3052KOST-Val2017-10-30T10:03:53Z<p>Chlara: Funtional Principle: Link to pdfaPilot</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and for Submission Information Packages (SIPs).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
<br />
= Description =<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured.<br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or [[PdfaPilot|pdfaPilot]] by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val ships only a limited version of the 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads a SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first.<br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible, validation continues after an error has been detected, so that a single run reports as many problems as possible and fewer correction cycles are needed.<br />
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components of other manufacturers by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components.<br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)|JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* For the PDF/A validation module, [[PDFTron PDF-A Manager|PDF-A Manager]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] is used.<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* For file format identification, [[DROID_(Digital_Record_Object_Identification)|DROID]] is used. For performance and granularity reasons, a custom signature file is used instead of the official one from the PRONOM registry.<br />
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2 and JPEG files and Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2017 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions, see the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases.<br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification, http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. Their aim is to unify, and thus facilitate, electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The XML output (which renders as a table when opened in a browser) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises both fake JPEGs (a .jpeg extension but no JPEG content) and JPEGs with defects, and distinguishes clearly between the two cases.<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The 3 most recent releases:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The 5 most recent commits:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PdfaPilot&diff=3051PdfaPilot2017-10-30T10:01:31Z<p>Chlara: Description & Development Activity</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=pdfaPilot: Conversion of documents and emails into robust, searchable PDF or PDF/A files<br />
|image=pdfaPilot.jpg<br />
|homepage=https://www.callassoftware.com/en/products/pdfapilot<br />
|license=Commercially licensed product<br />
|platforms=Versions available for Windows and Mac<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Validation]]<br />
[[Category:Document]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:File_Format_Migration]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Convert documents and emails into PDF or PDF/A files to make them last a lifetime.<br />
<br />
Quoted from the [https://www.callassoftware.com/en/products/pdfapilot callas homepage]: <br />
<br />
" <br />
<br />
=== Make your documents and emails last a lifetime === <br />
In many environments, there are regulatory requirements that all communication regarding specific topics has to be archived for a certain period of time. How long that is can differ widely – from a few years to multiple tens of years.<br />
<br />
The format of choice for archiving is defined in the ISO standard PDF/A (PDF for Archiving). This ISO standard defines three different PDF/A versions and callas pdfaPilot can automatically convert all of your documents and emails into the PDF/A flavor required for your purposes. Additional job information (metadata) is fully supported and the solution is configurable enough to let it adapt to your workflow instead of the other way around.<br />
<br />
Archiving documents is a critical operation of modern workflows; for that reason callas pdfaPilot has been extensively tested and complies with the ISO standards as verified with for example the Isartor test suite. The fact that the very same technology is used in the de facto standard, Adobe Acrobat, is testament to its quality.<br />
<br />
<br />
=== Archiving typical office documents ===<br />
The best way to have good PDF/A files is to properly create them to begin with. For this, callas pdfaPilot can automatically convert Microsoft Word, Excel, PowerPoint, Project, Publisher and Visio files into quality PDF files for you. Simply drag them on the document window of callas pdfaPilot Desktop and the conversion takes place in the best possible way.<br />
<br />
OpenOffice documents are fully supported too and on Mac OS X Pages and Keynote of course work as well. Using the integrated Adobe engine, callas pdfaPilot even does quality conversion of EPS and PostScript files for you, without requiring Adobe Acrobat Distiller to be installed.<br />
<br />
<br />
=== Email archiving ===<br />
These days, emails are an essential part of business communication in most organizations. Many countries have laws and regulations around the archival of such communication, either to be used in potential future litigation or as part of sound accounting practices.<br />
<br />
Because emails are usually handled by email servers with limited storage capacities, there are no guarantees that an email you receive today will still be on the server ten years from now. When you download the email from your server it is possibly converted into the proprietary format of the email client, and there is no guarantee that you will have such a client in the future. In addition, there are attachments to the emails that require a proper viewer for all file formats that may occur here. Luckily callas pdfaPilot lets you store these emails including all attachments in a PDF/A format that will be available and readable many years from now.<br />
<br />
But even in environments where such regulatory guidelines don’t apply, there are good reasons to archive emails. In our modern society an amazing amount of business intelligence is captured in emails. Being able to maintain that information in a structured way and in the same format that is used for other documents and being able to efficiently search all of those documents including emails and their attachments efficiently will only become more important.<br />
<br />
<br />
=== Handle EPUB and PDF/UA ===<br />
Thanks to its unique checking feature for tagging structure in a PDF file callas pdfaPilot also offers optimized exporting to PDF/UA, the standard for accessibility. More and more legislation requires documents to be universally accessible for everyone, including people with physical disabilities. As a result, checking against the PDF/UA standard has gained importance for service providers, governments and enterprise customers alike.<br />
<br />
But pdfaPilot lets you take advantage of PDF tagging in a very different area as well:<br />
The tagging structure can be used in order to automatically create EPUB files from PDF. This PDF to EPUB feature converts PDFs into eBook files that can immediately be used on mobile devices such as smartphones or tablets.<br />
<br />
<br />
=== System requirements ===<br />
* Mac (Intel): macOS, version 10.7 or newer, 64-bit-compliant<br />
* Windows:<br />
** Windows 7 or newer<br />
** Windows Server 2008 R2 or newer<br />
<br />
<br />
=== Key features ===<br />
* Checks PDFs for compliance to the PDF/A-1, PDF/A-2 and PDF/A-3 standard (ISO 19005-1, 19005-2 and 19005-3)<br />
* Converts PDF files to PDF/A-1, PDF/A-2 or PDF/A-3 and implements all necessary corrections<br />
* Converts emails into PDF/A<br />
* Converts PDF files to EPUB and HTML<br />
* Supports all PDF/A conformance levels (from PDF/A-1b to PDF/A-3u)<br />
* Support of PDF/A-3 standard<br />
* Improves PDFs to facilitate the creation of PDF/UA files<br />
* Embeds and/or substitutes missing fonts and handles missing glyphs<br />
* Optimizes all color data for compliance with the PDF/A-1 standard<br />
* Adjusts comments and form fields to the defaults required by the PDF/A-1 standard<br />
* Removes unwanted attributes such as layers or interactive content like movies<br />
* Makes image compression compliant with the PDF/A-1... standard<br />
* Creates PDF/A documents that are web optimized for easier access and viewing<br />
* Brings document metadata in line with PDF/A-1... requirements<br />
* Saves newer PDF versions as PDF 1.4 as specified by PDF/A-1...<br />
* Implements adjustments and corrections without loss of information<br />
* Delivers clear reports to document all test and correction procedures<br />
* Improves overall accessibility of PDF/A<br />
* Is available in English, German, French, Italian, Spanish and Japanese language versions<br />
* PDF/A-3 collections may contain not just only other PDF/A files but arbitrary file formats like Word- or Excel-files or XML structures.<br />
* Guaranteed PDF/A conversion; this automatically tries different conversion methods to create the best PDF/A file possible.<br />
* Improved conversion of form fields and annotations, object level metadata in PDF/A-2 and the possibility to remove incompatible signatures when converting to PDF/A-2.<br />
* Creating, checking or processing ZUGFeRD invoices<br />
<br />
<br />
=== Get the most out of metadata ===<br />
* Provides convenient browsing of document metadata as well as object-level metadata for images embedded inside the document<br />
* Support for all relevant industry metadata standards including Dublin Core, IPTC, PRISM, GWG AdTicket/AdsML, PLUS, EXIF and Camera Raw<br />
* Facilitates metadata based researching in Yahoo, Wikipedia, Amazon, Google and AskMetaFilter<br />
* Supports GPS data for use with Google Maps, OpenStreetMap and Google Earth<br />
* XML export of all document and object level metadata for tracking image use and licenses<br />
* Define extension schemas for your custom metadata fields as required by PDF/A-1... standard with the new Metadata Extension Editor<br />
* Automatically embeds your company specific metadata schemas for reliable preservation of metadata inside your PDF/A files <br />
" <br />
<br />
=== Product variants ===<br />
Desktop, Server, CLI or SDK<br />
<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* Used by [[KOST-Val]] as a validation tool for PDF/A files.<br />
<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible in the [https://www.callassoftware.com/en/products/pdfapilot/?type=product&product=pdfapilotdesktop&tab=release-notes Release notes].<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PdfaPilot&diff=3050PdfaPilot2017-10-30T09:52:57Z<p>Chlara: User Experiences</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=pdfaPilot: Conversion of documents and emails into robust, searchable PDF or PDF/A files<br />
|image=pdfaPilot.jpg<br />
|homepage=https://www.callassoftware.com/en/products/pdfapilot<br />
|license=Commercially licensed product<br />
|platforms=Versions available for Windows and Mac<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Validation]]<br />
[[Category:Document]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:File_Format_Migration]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Convert documents and emails into PDF or PDF/A files to make them last a lifetime.<br />
<br />
Quoted from the [https://www.callassoftware.com/en/products/pdfapilot callas homepage]: <br />
<br />
" <br />
<br />
=== Make your documents and emails last a lifetime === <br />
In many environments, there are regulatory requirements that all communication regarding specific topics has to be archived for a certain period of time. How long that is can differ widely – from a few years to multiple tens of years.<br />
<br />
The format of choice for archiving is defined in the ISO standard PDF/A (PDF for Archiving). This ISO standard defines three different PDF/A versions and callas pdfaPilot can automatically convert all of your documents and emails into the PDF/A flavor required for your purposes. Additional job information (metadata) is fully supported and the solution is configurable enough to let it adapt to your workflow instead of the other way around.<br />
<br />
Archiving documents is a critical operation of modern workflows; for that reason callas pdfaPilot has been extensively tested and complies with the ISO standards as verified with for example the Isartor test suite. The fact that the very same technology is used in the de facto standard, Adobe Acrobat, is testament to its quality.<br />
<br />
<br />
=== Archiving typical office documents ===<br />
The best way to have good PDF/A files is to properly create them to begin with. For this, callas pdfaPilot can automatically convert Microsoft Word, Excel, PowerPoint, Project, Publisher and Visio files into quality PDF files for you. Simply drag them on the document window of callas pdfaPilot Desktop and the conversion takes place in the best possible way.<br />
<br />
OpenOffice documents are fully supported too and on Mac OS X Pages and Keynote of course work as well. Using the integrated Adobe engine, callas pdfaPilot even does quality conversion of EPS and PostScript files for you, without requiring Adobe Acrobat Distiller to be installed.<br />
<br />
<br />
=== Email archiving ===<br />
These days, emails are an essential part of business communication in most organizations. Many countries have laws and regulations around the archival of such communication, either to be used in potential future litigation or as part of sound accounting practices.<br />
<br />
Because emails are usually handled by email servers with limited storage capacities, there are no guarantees that an email you receive today will still be on the server ten years from now. When you download the email from your server it is possibly converted into the proprietary format of the email client, and there is no guarantee that you will have such a client in the future. In addition, there are attachments to the emails that require a proper viewer for all file formats that may occur here. Luckily callas pdfaPilot lets you store these emails including all attachments in a PDF/A format that will be available and readable many years from now.<br />
<br />
But even in environments where such regulatory guidelines don’t apply, there are good reasons to archive emails. In our modern society an amazing amount of business intelligence is captured in emails. Being able to maintain that information in a structured way and in the same format that is used for other documents and being able to efficiently search all of those documents including emails and their attachments efficiently will only become more important.<br />
<br />
<br />
=== Handle EPUB and PDF/UA ===<br />
Thanks to its unique checking feature for tagging structure in a PDF file callas pdfaPilot also offers optimized exporting to PDF/UA, the standard for accessibility. More and more legislation requires documents to be universally accessible for everyone, including people with physical disabilities. As a result, checking against the PDF/UA standard has gained importance for service providers, governments and enterprise customers alike.<br />
<br />
But pdfaPilot lets you take advantage of PDF tagging in a very different area as well:<br />
The tagging structure can be used in order to automatically create EPUB files from PDF. This PDF to EPUB feature converts PDFs into eBook files that can immediately be used on mobile devices such as smartphones or tablets.<br />
<br />
<br />
=== System requirements ===<br />
* Mac (Intel): macOS, version 10.7 or newer, 64-bit-compliant<br />
* Windows:<br />
** Windows 7 or newer<br />
** Windows Server 2008 R2 or newer<br />
" <br />
<br />
=== Product variants ===<br />
Desktop, Server, CLI or SDK<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* Used by [[KOST-Val]] as a validation tool for PDF/A files.<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PdfaPilot&diff=3048PdfaPilot2017-10-30T08:58:02Z<p>Chlara: Creation: pdfaPilot</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=pdfaPilot: Conversion of documents and emails into robust, searchable PDF or PDF/A files<br />
|image=pdfaPilot.jpg<br />
|homepage=https://www.callassoftware.com/en/products/pdfapilot<br />
|license=Commercially licensed product<br />
|platforms=Versions available for Windows and Mac<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Validation]]<br />
[[Category:Document]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:File_Format_Migration]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=File:PdfaPilot.jpg&diff=3047File:PdfaPilot.jpg2017-10-30T08:57:23Z<p>Chlara: https://www.callassoftware.com/en/products/pdfapilot</p>
<hr />
<div>https://www.callassoftware.com/en/products/pdfapilot</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=3046KOST-Val2017-10-25T09:00:15Z<p>Chlara: Formatting</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
= Description =<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured. <br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or pdfaPilot by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val supplies only a limited version of 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads an SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first. <br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible the validation shall continue after an error has been detected in order to reduce the number of correction cycles. <br />
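<br />
The sequential behaviour described above can be sketched as follows. This is a minimal illustration in Python, not KOST-Val's actual (Java) implementation: the names <code>ValidationLog</code>, <code>run_steps</code> and the three example checks are hypothetical and chosen only to show how a validator can keep going after an error and record every result in one log.<br />

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A step returns None on success or an error message on failure.
# (Hypothetical interface, not taken from KOST-Val.)
Step = Callable[[bytes], Optional[str]]


@dataclass
class ValidationLog:
    entries: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        # The file is valid only if no step reported an error.
        return not any(entry.startswith("ERROR") for entry in self.entries)


def run_steps(data: bytes, steps: List[Tuple[str, Step]]) -> ValidationLog:
    """Run all steps in order; a failing step does not stop later steps."""
    log = ValidationLog()
    for name, step in steps:
        try:
            message = step(data)
        except Exception as exc:  # a crashing step is logged like any other error
            message = f"step crashed: {exc}"
        if message is None:
            log.entries.append(f"OK    {name}")
        else:
            log.entries.append(f"ERROR {name}: {message}")
    return log


def check_not_empty(data: bytes) -> Optional[str]:
    return None if data else "file is empty"


def check_jpeg_start(data: bytes) -> Optional[str]:
    return None if data.startswith(b"\xff\xd8\xff") else "missing JPEG start marker"


def check_jpeg_end(data: bytes) -> Optional[str]:
    return None if data.endswith(b"\xff\xd9") else "missing JPEG end marker"


STEPS = [
    ("A not empty", check_not_empty),
    ("B start marker", check_jpeg_start),
    ("C end marker", check_jpeg_end),
]

if __name__ == "__main__":
    for line in run_steps(b"plain text", STEPS).entries:
        print(line)
```

Because every step is attempted, a single run reports both the missing start marker and the missing end marker, which is the point of continuing after an error: fewer correction cycles.<br />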
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components of other manufacturers by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components. <br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* The PDF/A validation module uses [[PDFTron PDF-A Manager|PDF-A Manager]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]].<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* For the file format identification [[DROID_(Digital_Record_Object_Identification)|DROID]] is used. For performance and granularity reasons, a custom signature file is used instead of the official PRONOM registry.<br />
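<br />
Signature-based identification, and the detection of files whose extension does not match their content, can be illustrated with a short Python sketch. It is an assumption-laden illustration only: the <code>SIGNATURES</code> and <code>EXTENSIONS</code> tables and both function names are hypothetical, and the table holds just a few well-known leading byte sequences. It does not reproduce DROID, PRONOM or the signature file shipped with KOST-Val.<br />

```python
from pathlib import PurePath
from typing import Optional

# Well-known leading byte sequences ("magic numbers") for a few formats.
# Illustrative table only; a real signature file is far more detailed.
SIGNATURES = [
    ("JPEG", b"\xff\xd8\xff"),
    ("TIFF", b"II*\x00"),                        # little-endian TIFF
    ("TIFF", b"MM\x00*"),                        # big-endian TIFF
    ("PDF", b"%PDF-"),
    ("JP2", b"\x00\x00\x00\x0cjP  \r\n\x87\n"),  # JPEG 2000 signature box
]

# Hypothetical mapping from file extension to the format it claims to be.
EXTENSIONS = {
    ".jpg": "JPEG", ".jpeg": "JPEG",
    ".tif": "TIFF", ".tiff": "TIFF",
    ".pdf": "PDF",
    ".jp2": "JP2",
}


def identify(header: bytes) -> Optional[str]:
    """Return the format whose signature matches the start of the data."""
    for name, magic in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def extension_matches_content(filename: str, header: bytes) -> bool:
    """True if the extension is known and agrees with the detected format."""
    claimed = EXTENSIONS.get(PurePath(filename).suffix.lower())
    return claimed is not None and claimed == identify(header)


if __name__ == "__main__":
    # A file named *.jpeg that actually starts like a PDF is flagged.
    print(extension_matches_content("fake.jpeg", b"%PDF-1.7"))
```

A mismatch found this way only shows that the file is not what its extension claims; whether a genuine JPEG is internally damaged is a separate question that needs a full validator.<br />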
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2, JPEG files and Submission Information Package (SIP).<br />
<br />
KOST-Val, Copyright (C) 2012-2017 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions please check the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The Swiss Federal Archives (SFA) originally developed the SIARD format as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. The aim of these standards is to unify and thus facilitate electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
.<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The output in XML format (open it in a browser to view it as a table) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises fake-JPEGs (jpeg-extension, but no jpeg inside) and issues with jpegs and can differentiate easily between these two cases.<br />
<br />
.<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The 3 most recent releases:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The 5 most recent commits:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=3045KOST-Val2017-10-25T08:57:36Z<p>Chlara: Update Read Me & Download</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
= Description =<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured. <br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or pdfaPilot by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val supplies only a limited version of 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads an SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first. <br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible the validation shall continue after an error has been detected in order to reduce the number of correction cycles. <br />
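<br />
The checksum validation mentioned for SIPs can be pictured with the following Python sketch. It assumes, purely for illustration, that the expected digests have already been read into a dictionary mapping relative file paths to hexadecimal digests; how an eCH-0160 package actually records its checksums is not shown here, and <code>file_digest</code> and <code>verify_checksums</code> are hypothetical names.<br />

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, algorithm: str = "md5") -> str:
    """Compute the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(root: Path, expected: Dict[str, str],
                     algorithm: str = "md5") -> List[str]:
    """Compare every listed file against its expected digest.

    Returns a list of problems; an empty list means all files matched.
    All entries are checked, so one bad file does not hide another.
    """
    problems = []
    for relative_path, expected_digest in expected.items():
        path = root / relative_path
        if not path.is_file():
            problems.append(f"{relative_path}: file missing")
        elif file_digest(path, algorithm) != expected_digest.lower():
            problems.append(f"{relative_path}: checksum mismatch")
    return problems
```

Collecting all problems in a list, instead of raising on the first mismatch, mirrors the principle of continuing after an error so that one run reports as many issues as possible.<br />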
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components of other manufacturers by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components. <br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* The PDF/A validation module uses [[PDFTron PDF-A Manager|PDF-A Manager]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]].<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* [[DROID_(Digital_Record_Object_Identification)|DROID]] is used for file format identification. For performance and granularity reasons, KOST-Val uses its own SignatureFile instead of the official PRONOM registry.<br />
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2, and JPEG files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2017 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions, please check the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The SIARD format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. These standards aim to unify and thus facilitate electronic collaboration between authorities, as well as between authorities and organizations, educational and research institutions, firms, and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
.<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The XML output (open it in a browser to view it as a table) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises fake JPEGs (JPEG extension, but no JPEG inside) as well as JPEGs with issues, and it distinguishes clearly between these two cases.<br />
<br />
.<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=3044KOST-Val2017-10-25T08:52:25Z<p>Chlara: Update Funtional Principle (callas and sip)</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
= Description =<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, and {{Format|JPEG}} files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured. <br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1 and v2-2017) file and validates the structure and the content. <br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or pdfaPilot by callas to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val ships only limited versions of the 3-Heights™ PDF/A Validator by PDF-Tools and of pdfaPilot by callas. Depending on the configuration, Module J extracts the JPEG and JP2 images contained in the PDF file (using [[IText|iText]]) and validates them. Whether JBIG2 compression is accepted can also be configured.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads a SIP (eCH-0160 v1 and v1.1) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. A file format validation is performed first. <br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible, validation continues after an error has been detected in order to reduce the number of correction cycles. <br />
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified components from other vendors by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components. <br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* The PDF/A validation module uses [[PDFTron PDF-A Manager|PDF-A Manager]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]].<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* [[DROID_(Digital_Record_Object_Identification)|DROID]] is used for file format identification. For performance and granularity reasons, KOST-Val uses its own SignatureFile instead of the official PRONOM registry.<br />
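<br />
Signature-based identification, as performed by tools such as DROID, can be sketched roughly as follows. This is only an illustration of the general idea: it matches a handful of well-known leading byte sequences ("magic numbers") and reproduces neither DROID's matching engine nor the contents of the KOST-Val SignatureFile. The names <code>SIGNATURES</code> and <code>identify</code> are hypothetical.<br />

```python
from typing import List, Optional, Tuple

# Well-known leading byte sequences for some of the formats named above.
# Illustrative subset only; real signature files are far more detailed.
SIGNATURES: List[Tuple[str, bytes]] = [
    ("JPEG", b"\xff\xd8\xff"),
    ("JP2", b"\x00\x00\x00\x0cjP  \r\n\x87\n"),
    ("PDF", b"%PDF-"),
    ("TIFF", b"II*\x00"),   # little-endian TIFF
    ("TIFF", b"MM\x00*"),   # big-endian TIFF
]


def identify(data: bytes) -> Optional[str]:
    """Return the format whose signature matches the start of data, else None."""
    for name, magic in SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```

A file with a <code>.jpg</code> extension whose first bytes do not match the JPEG signature would come back as <code>None</code> (or as another format) in this sketch, which is the kind of mismatch a validator can report before running the format-specific checks.<br />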
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2, and JPEG files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2015 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions, please check the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The SIARD format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, such as format definitions or procedural standards. These standards aim to unify and thus facilitate electronic collaboration between authorities, as well as between authorities and organizations, educational and research institutions, firms, and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
.<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The XML output (open it in a browser to view it as a table) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises fake JPEGs (JPEG extension, but no JPEG inside) as well as JPEGs with issues, and it distinguishes clearly between these two cases.<br />
<br />
.<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PDFTron_PDF-A_Manager&diff=3043PDFTron PDF-A Manager2017-10-25T08:46:10Z<p>Chlara: Not longer used by KOST-Val as a validation tool for PDF/A files.</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=PDF/A Manager is a PDF/A (ISO 19005) validation and conversion software. <br />
|image=<br />
|homepage=http://www.pdftron.com/pdfamanager/index.html<br />
|license=Commercially licensed product, developers licenses are free.<br />
|platforms=Versions available for Windows, Mac, and Linux.<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Validation]]<br />
[[Category:File Format Migration]]<br />
[[Category:Document]]<br />
<br />
<br />
= Description =<br />
A tool that converts PDF documents to PDF/A documents (versions 1, 2, and 3; levels A, B, and U) or validates a PDF file against the PDF/A specification. Versions are available for Windows, Mac, and Linux.<br />
<br />
=== Product variants ===<br />
Command-line and software development kit (SDK).<br />
<br />
= User Experiences =<br />
<br />
= Development Activity =<br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=PDFTron PDF-A Manager<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=File:KOST-Val_FuntionalPrincipleFormatValidation.JPG&diff=3042File:KOST-Val FuntionalPrincipleFormatValidation.JPG2017-10-25T08:17:31Z<p>Chlara: Chlara uploaded a new version of &quot;File:KOST-Val FuntionalPrincipleFormatValidation.JPG&quot;: pdfaPilot by callas instead of PDF/A Manager by PDFTron</p>
<hr />
<div>KOST-Val Funtional Principle of Format Validation</div>Chlarahttps://coptr.digipres.org/index.php?title=Aaru_Data_Preservation_Suite&diff=3039Aaru Data Preservation Suite2017-09-14T10:03:46Z<p>Chlara: Development Activity</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Media dump software and disc image manager<br />
|homepage=http://github.com/claunia/DiscImageChef<br />
|license=GPL and LGPL<br />
|platforms=Windows, Linux, macOS, FreeBSD<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
[[Category:Metadata Extraction]]<br />
[[Category:Preservation System]]<br />
[[Category:Backup]]<br />
[[Category:Disk Image]]<br />
[[Category:Disk Imaging]]<br />
[[Category:Software]]<br />
[[Category:Tools]]<br />
<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
DiscImageChef is a tool designed to manage disc images created from any kind of media (optical, block, etc).<br />
<br />
It can dump media from CD, DVD and Blu-ray drives (aka optical media), as well as almost any storage device that connects via SCSI, ATA, SATA, USB or FireWire.<br />
<br />
It can also read and analyze several disk image formats from different manufacturers for checksumming, for comparison (even between different formats), or for creating XML metadata for database consumption.<br />
<br />
Finally, it can recognize and identify almost all known partition table formats and file systems and show information about them. Support for accessing the files contained in those file systems is being added steadily, with priority given to archaic and uncommon file systems; it was the first software able to read the Lisa file system after its original operating system was deprecated.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
*https://www.betaarchive.com/forum/viewtopic.php?f=72&t=36078<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
The tool is licensed under the LGPL for the data portions and under the GPL for the user interface.<br />
All development activity is visible on GitHub: http://github.com/claunia/DiscImageChef/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/claunia/DiscImageChef/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/claunia/DiscImageChef/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Webrecorder&diff=3038Webrecorder2017-09-14T09:54:29Z<p>Chlara: Development Activity- Activity Feed</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Webrecorder is a hosted web archiving tool with which users can capture what they see as they browse websites and save that information (locally or to a free account) <br />
|homepage=https://webrecorder.io/<br />
|license=<br />
<br />
Webrecorder is an open-source software product (under the Apache License) and shared via [https://github.com/webrecorder/webrecorder GitHub] <br />
<br />
|platforms=Platform agnostic, operates via web browser<br />
}}<br />
<br />
[[Category:Web Archive]] [[Category:Web]] <br />
<br />
== Description ==<br />
<br />
Webrecorder is a hosted web archiving tool with which users can capture what they see as they browse websites and save that information. Via a web browser Webrecorder collects content and data from web pages including: HTML, images, scripts, stylesheets, Flash, Java applets as well as video, audio and other elements used to make web pages and web apps. Webrecorder can capture dynamic web content that cannot be captured by most crawler-based web archiving tools. Webrecorder can record what you see when you are logged in to a social media profile (though it does not record site login credentials).<br />
<br />
One does not need to log in to use Webrecorder to capture web content if the intent is to download the captures right away (as a WARC file) and save them locally. Desktop software that can open a WARC file, such as Webrecorder Player [https://github.com/webrecorder/webrecorderplayer-electron], is needed to view web archives downloaded from Webrecorder. Webrecorder Player is available at no charge, and with this software you can view all the content contained in a WARC file without being connected to the internet. For continued online access to archived content, and to be able to add to a collection, it is necessary to log in to a free account, which comes with 5 GB of storage (at least as of fall 2017).<br />
<br />
Webrecorder is a project of Rhizome [https://rhizome.org/] under its digital preservation program.<br />
<br />
== User Experiences ==<br />
<br />
*Happy accidents: adventures in web preservation [https://anoldhanddigital.wordpress.com/2017/08/09/happy-accidents-adventures-in-web-preservation/]<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/webrecorder/webrecorder/commits<br />
<br />
<!-- <br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/webrecorder/webrecorder/releases.atom</rss><br />
--><br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/webrecorder/webrecorder/commits/master.atom</rss><br />
<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Talk:Main_Page&diff=3037Talk:Main Page2017-09-14T09:48:13Z<p>Chlara: https</p>
<hr />
<div>==Use of Level 1 section headings ==<br />
I just had a first stab at some editing an existing entry, and creating a new one. One small thing I noticed: all of the pages I've seen use Level 1 headings for the main sections, but this is discouraged by the MediaWiki documentation (esp. since Level 1 is already used for the main heading of each entry). See e.g. here:<br />
<br />
http://www.mediawiki.org/wiki/Help_talk:Formatting#Level_1<br />
<br />
And also:<br />
<br />
http://www.mediawiki.org/wiki/Help:Formatting<br />
<br />
So maybe it would be better to change those to level 2 (also in all cases I've seen the child sections use level 3 headings)? Not a big deal of course but it's a bit ugly and it's probably easier to fix this now than postponing it to a later stage (also existing entries will most likely be used as a template for creating new ones, so the problem will get progressively worse if left as it is). I suppose this is also something that might mess up attempts at automated text extraction. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 14:58, 20 November 2013 (UTC)<br />
<br />
: Good point - I didn't know about this recommendation. It should be possible to automate this transformation using the [https://www.mediawiki.org/wiki/Manual:Pywikibot Pywikibot] framework and the [https://pypi.python.org/pypi/mwparserfromhell mwparserfromhell] library. We could change the [[Template:Tool/Preload|tool template]] now and use a bot to modify the rest? [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:15, 20 November 2013 (UTC)<br />
<br />
::Great! It would also help to mention the template on the main page (I don't have edit rights there), as I didn't even know there was any! Anyway, I just updated the template myself.[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 16:07, 20 November 2013 (UTC)<br />
<br />
::: I updated the main page to include an explicit link to the template that should be used when creating new Tool entries. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]])<br />
<br />
==Purpose field in infobox==<br />
Another thing I noticed is that for most entries the ''purpose'' field in the infobox more or less repeats what's already in the ''Description'' section. So maybe it's better to leave it out altogether (on the other hand it might be useful for automated discovery/analysis)? [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 14:58, 20 November 2013 (UTC)<br />
: The intention was the other way around - that the 'purpose' field was a brief description that might be published and re-used in other forms (due to being in the infobox), and the Description section should be more detailed. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:15, 20 November 2013 (UTC)<br />
:: Yes, that's what I thought as well. For some existing entries both are identical though, so it wasn't immediately clear. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 15:56, 20 November 2013 (UTC)<br />
::: I would say the goal is to improve the body Description section for those items. Is the Description in these cases exactly identical to the purpose? If so, I might be able to check for that automatically. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 21:18, 20 November 2013 (UTC)<br />
<br />
==Perhaps rename Imaging category to Disk Imaging?==<br />
Another thing: people might end up mixing up the ''Image'' and ''Imaging'' categories. Maybe rename the latter to ''Disk Imaging'' to avoid this?<br />
[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 16:15, 20 November 2013 (UTC)<br />
: Good idea. Unfortunately, I think that due to the way MediaWiki works, this means editing all the items tagged with that category (unless there's a trick I'm unaware of). However, perhaps I should tweak [[User:COPTR Bot|COPTR Bot]] to do this too. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 21:22, 20 November 2013 (UTC)<br />
:: Fixed manually --[[User:Prwheatley|Prwheatley]] ([[User talk:Prwheatley|talk]]) 17:51, 26 November 2013 (UTC)<br />
<br />
==Problems with login procedure (VeriSign)==<br />
Having created an OpenID with Verisign, signing in to COPTR is a bit hit and miss. It always takes multiple attempts before I can successfully log in, and I first have to work myself through multiple authorization errors. Not clear to me whether the error is caused by COPTR or VeriSign, and results seem to be a bit random. Also after some time of inactivity I seem to get logged out automatically. Confusingly, the COPTR login status at the top right then shows I'm still logged in (depending on which page I'm on, it seems), and I have to go through the whole login procedure (including failures) again. <br />
[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 11:16, 21 November 2013 (UTC)<br />
<br />
: [http://meta.stackoverflow.com/a/1789 This StackOverflow answer] outlines some of the possible causes of problems with OpenID logins. I guess we could perhaps enable non-OpenID logins, although I wanted to avoid that for fear of making it even easier for the robots/spammers. Using Google as an OpenID provider is working fine for me. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 11:40, 21 November 2013 (UTC)<br />
<br />
:: Hmm ... the StackOverflow page links to an OpenID checker, but I'm a bit scared of it because it exposes all the info you enter there to a publicly viewable log ... Not keen on using Google either, because they're, well ... Google. The main point of course is that the success of COPTR will depend completely on community involvement, and if the login procedure is already such a stumbling block that's not going to help there. Also the procedure to get the OpenID isn't terribly straightforward, and this will scare off potential contributors as well. So I would consider adding a non-OpenID login procedure, perhaps augmented with a CAPTCHA to keep the robots at bay (e.g. like the Archiveteam Formats Wiki). [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 12:28, 21 November 2013 (UTC)<br />
<br />
::: Okay, I've enabled plain logins and registration. To make sure you retain the same user account, you should ensure it knows the right email address so you can use password recovery to login. Alternatively, I think you should be able to reset your PW from your Preferences page. The current setup is that you should see CAPTCHAs blocking all edits etc. unless you have confirmed your email address. We'll see how it goes, and if we have to tighten up things to avoid spam, so be it. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 12:45, 21 November 2013 (UTC)<br />
<br />
::::Great, thanks for the quick response! Apparently there's no way to set a pw from the Preferences page once you have an existing OpenID assigned to your user profile, and when I try to delete my OpenID I get an error that this isn't possible because no password is set, classic Catch 22 there! Anyway, I'll stick with the OpenID madness for a bit and if I get really fed up with it I may well just create a new COPTR account, probably the easiest solution all around. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 13:14, 21 November 2013 (UTC)<br />
<br />
::::: I'm pretty sure that, if you completely log out first, and if there is an email associated with your account, you can use the 'Password reset' option to get access. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:28, 21 November 2013 (UTC)<br />
<br />
== changing Open Planets Foundation to Open Preservation Foundation ==<br />
<br />
Hi. The Main Page needs an update (changing Open Planets Foundation to Open Preservation Foundation). Of course, I do not have permission to do that :-). Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 10:18, 27 November 2014 (UTC)<br />
* Good call, this is now fixed. Thanks!<br />
<br />
== DSPS (Digital Preservation Software Platform) is the same as Digital Preservation Software Platform ==<br />
<br />
Hi. I have updated the Digital Preservation Software Platform page. Afterwards I saw that there is a nearly identical page, DSPS (Digital Preservation Software Platform). Because "DSPS (Digital Preservation Software Platform)" is the better title, I copied the content from Digital Preservation Software Platform into DSPS (Digital Preservation Software Platform). On the page Digital Preservation Software Platform I replaced the content with a link to DSPS (Digital Preservation Software Platform). <br />
<br />
Can you please delete the page Digital Preservation Software Platform?<br />
<br />
Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 11:30, 8 January 2015 (UTC)<br />
<br />
== Merging duplicate entries ==<br />
<br />
I noticed that there are two entries for the CONTENTdm digital asset management system:<br />
<br />
* http://coptr.digipres.org/ContentDM<br />
* http://coptr.digipres.org/CONTENTdm<br />
<br />
I removed the limited content from the ContentDM page and made it redirect to the CONTENTdm page, using the MediaWiki #redirect option (see [http://www.mediawiki.org/wiki/Help:Redirects http://www.mediawiki.org/wiki/Help:Redirects]). Should this be the recommended practice, or would you rather simply delete one of the pages? If the latter, how do we request removal of a page?<br />
<br />
[[User:Danielle Plumer|Danielle Plumer]] ([[User talk:Danielle Plumer|talk]]) 03:00, 5 February 2015 (UTC)<br />
<br />
== Check this entry ==<br />
==== Workflow ====<br />
Hi. I think the entry http://coptr.digipres.org/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive needs to be changed or deleted. What do you think about it? Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 06:19, 20 April 2017 (UTC)<br />
<br />
== Change to https ==<br />
Hi. COPTR should change to https. Thanks. --[[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 09:48, 14 September 2017 (UTC)</div>Chlarahttps://coptr.digipres.org/index.php?title=Goobi&diff=3026Goobi2017-05-23T14:14:33Z<p>Chlara: Development Activity with comits and releases</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=Workflow Management Tool<br />
|image={{PAGENAMEE}}.png<br />
|homepage=http://www.intranda.com/goobi<br />
|license=GPLv2<br />
|platforms=Linux<br />
}}<br />
<br />
<br />
[[Category:Workflow]]<br />
[[Category:OCR]]<br />
[[Category:Planning]]<br />
[[Category:Quality Assurance]]<br />
<br />
<br />
<br />
== Description ==<br />
Goobi is a web-based workflow management tool. It can be used to create manual, partly automated, or fully automated workflows to validate, transform, or enrich content.<br />
<br />
<br />
== User Experiences ==<br />
* http://www.intranda.com/en/uber-uns/referenzen/<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/intranda/goobi/commits<br />
<br />
<br />
==== Release Feed ====<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/intranda/goobi/releases.atom</rss><br />
<br />
<br />
==== Activity Feed ====<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/intranda/goobi/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Exact_Audio_Copy&diff=3020Exact Audio Copy2017-04-20T06:37:23Z<p>Chlara: Link to existing 64px-Exact_Audio_Copy_Icon.png added</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only. <br />
|image=64px-Exact_Audio_Copy_Icon.png<br />
|homepage=http://www.exactaudiocopy.de<br />
|license=Proprietary, Freeware<br />
|platforms= Windows XP, Vista, 7, 8, 10. No Linux or Mac version is planned, but EAC is reported to run in an emulation layer (Wine on Linux, Virtual PC with Windows 98 on Mac).<br />
}}<br />
<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:File Format Migration]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:Disk Imaging]]<br />
[[Category:Audio]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Exact Audio Copy (EAC) has three options for copying media: Copy Selected Tracks Uncompressed, Copy Selected Tracks Compressed, and Copy Image and Cue Sheet.<br />
<br />
In secure mode, EAC either reads every audio sector at least twice or relies on the extended error information that some drives return with the audio data. This technique detects non-identical sectors. If a read or sync error occurs, the program keeps re-reading the sector until eight of 16 retries are identical. Depending on the selected error recovery quality, this batch of 16 retries is repeated at most one, three or five times, so in the worst case a bad sector is read up to 82 times. Exact Audio Copy combines several technologies, such as multi-reading with verification and AccurateRip.<br />
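The worst-case figure can be checked with a little arithmetic. The sketch below is only an illustration, not EAC's code: it assumes that the worst case consists of the two initial secure-mode reads plus every batch of 16 retries being used up, and the quality labels "low", "medium" and "high" are hypothetical names for the one-, three- and five-batch settings.

```python
# Assumption: worst case = 2 initial secure-mode reads + all retry batches used.
INITIAL_READS = 2
RETRIES_PER_BATCH = 16

# Hypothetical labels for the error recovery quality settings
# (at most one, three or five batches of 16 retries).
MAX_BATCHES = {"low": 1, "medium": 3, "high": 5}


def worst_case_reads(quality: str) -> int:
    """Return the maximum number of times one bad sector is read."""
    return INITIAL_READS + RETRIES_PER_BATCH * MAX_BATCHES[quality]


if __name__ == "__main__":
    for quality in MAX_BATCHES:
        print(quality, worst_case_reads(quality))
```

Under these assumptions the highest setting gives 2 + 5 × 16 = 82 reads, which matches the figure quoted above.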
<br />
==== Online Help ====<br />
http://wiki.hydrogenaud.io/index.php?title=Category:EAC_Guides<br />
<br />
http://www.digital-inn.de/forums/exact-audio-copy-english.14/<br />
<br />
==== Owner ====<br />
<br />
The software is owned and managed by Andre Wiethoff.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* http://campuspress.yale.edu/borndigital/2016/12/20/to-image-or-copy-the-compact-disc-digital-audio-dilemma/<br />
* http://journal.code4lib.org/articles/9581<br />
* https://archives.library.illinois.edu/staff-resources/digital-workflow/born-digital-workflow/accessioning/extract-from-source-media/ripping-audio-records-using-exact-audio-copy/<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on http://www.exactaudiocopy.de/en/index.php/resources/whats-new/whats-new/<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PRONOM_Signature_Development_Utility&diff=3019PRONOM Signature Development Utility2017-04-20T06:25:25Z<p>Chlara: Use the structure provided in this template, do not change it! -> "User Experiences" added</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Output DROID compatible file format signature files using PRONOM syntax<br />
|image=<br />
|homepage=https://github.com/exponential-decay/signature-development-utility<br />
|license=Open source (see URL above)<br />
|platforms=PHP + jQuery + JavaScript + text/html<br />
}}<br />
<br />
<!-- Add one or more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:File Format Identification]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Utility for creating DROID-compatible signature files using PRONOM regular expression syntax. The tool outputs an XML format compatible with DROID 4 upwards (including DROID 5 and 6). Three sequences can be combined to create a single file format signature. Signature files can be concatenated manually if more complex collections are required for testing. <br />
<br />
Signature File 88 contains 1427 different PUIDs.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: https://github.com/exponential-decay/signature-development-utility/commits<br />
<br />
==== Activity Feed ====<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/exponential-decay/signature-development-utility/commits/master.atom</rss><br />
<br />
<!-- Add the Ohloh.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Talk:Main_Page&diff=3018Talk:Main Page2017-04-20T06:19:15Z<p>Chlara: Check this entries: Workflow</p>
<hr />
<div>==Use of Level 1 section headings ==<br />
I just had a first stab at editing an existing entry and creating a new one. One small thing I noticed: all of the pages I've seen use Level 1 headings for the main sections, but this is discouraged by the MediaWiki documentation (esp. since Level 1 is already used for the main heading of each entry). See e.g. here:<br />
<br />
http://www.mediawiki.org/wiki/Help_talk:Formatting#Level_1<br />
<br />
And also:<br />
<br />
http://www.mediawiki.org/wiki/Help:Formatting<br />
<br />
So maybe it would be better to change those to level 2 (also in all cases I've seen the child sections use level 3 headings)? Not a big deal of course, but it's a bit ugly and it's probably easier to fix this now than to postpone it to a later stage (also, existing entries will most likely be used as a template for creating new ones, so the problem will get progressively worse if left as it is). I suppose this is also something that might mess up attempts at automated text extraction. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 14:58, 20 November 2013 (UTC)<br />
<br />
: Good point - I didn't know about this recommendation. It should be possible to automate this transformation using the [https://www.mediawiki.org/wiki/Manual:Pywikibot Pywikibot] framework and the [https://pypi.python.org/pypi/mwparserfromhell mwparserfromhell] parser. We could change the [[Template:Tool/Preload|tool template]] now and use a bot to modify the rest? [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:15, 20 November 2013 (UTC)<br />
<br />
::Great! It would also help to mention the template on the main page (I don't have edit rights there), as I didn't even know there was one! Anyway, I just updated the template myself.[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 16:07, 20 November 2013 (UTC)<br />
<br />
::: I updated the main page to include an explicit link to the template that should be used when creating new Tool entries. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]])<br />
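
A minimal sketch of the heading change discussed above, in Python. It is an illustration only: it uses the standard library <code>re</code> module rather than Pywikibot or mwparserfromhell, the function name <code>demote_level1_headings</code> is made up, and a real bot run would still have to fetch and save the pages.

```python
import re

# Matches a line that is a level 1 heading ("= Title =") but not a
# deeper heading such as "== Title ==" or "=== Title ===".
LEVEL1_HEADING = re.compile(
    r"^=(?!=)[ \t]*(.+?)[ \t]*(?<!=)=[ \t]*$",
    re.MULTILINE,
)


def demote_level1_headings(wikitext: str) -> str:
    """Rewrite every level 1 heading as a level 2 heading."""
    return LEVEL1_HEADING.sub(r"== \1 ==", wikitext)


if __name__ == "__main__":
    sample = "= Description =\nSome text.\n=== History ===\n"
    print(demote_level1_headings(sample))
```

Headings at level 2 and below are left alone, so child sections that already use level 3 keep their place under the demoted parent.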
<br />
==Purpose field in infobox==<br />
Another thing I noticed is that for most entries the ''purpose'' field in the infobox more or less repeats what's already in the ''Description'' section. So maybe it's better to leave it out altogether (on the other hand it might be useful for automated discovery/analysis)? [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 14:58, 20 November 2013 (UTC)<br />
: The intention was the other way around - that the 'purpose' field was a brief description that might be published and re-used in other forms (due to being in the infobox), and the Description section should be more detailed. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:15, 20 November 2013 (UTC)<br />
:: Yes, that's what I thought as well. For some existing entries both are identical though, so it wasn't immediately clear. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 15:56, 20 November 2013 (UTC)<br />
::: I would say the goal is to improve the body Description section for those items. Is the Description in these cases exactly identical to the purpose? If so, I might be able to check for that automatically. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 21:18, 20 November 2013 (UTC)<br />
<br />
==Perhaps rename Imaging category to Disk Imaging?==<br />
Another thing: people might end up mixing up the ''Image'' and ''Imaging'' categories. Maybe rename the latter to ''Disk Imaging'' to avoid this?<br />
[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 16:15, 20 November 2013 (UTC)<br />
: Good idea. Unfortunately, I think that due to the way MediaWiki works, this means editing all the items tagged with that category (unless there's a trick I'm unaware of). However, perhaps I should tweak [[User:COPTR Bot|COPTR Bot]] to do this too. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 21:22, 20 November 2013 (UTC)<br />
:: Fixed manually --[[User:Prwheatley|Prwheatley]] ([[User talk:Prwheatley|talk]]) 17:51, 26 November 2013 (UTC)<br />
<br />
==Problems with login procedure (VeriSign)==<br />
Having created an OpenID with Verisign, signing in to COPTR is a bit hit and miss. It always takes multiple attempts before I can successfully log in, and I first have to work myself through multiple authorization errors. Not clear to me whether the error is caused by COPTR or VeriSign, and results seem to be a bit random. Also after some time of inactivity I seem to get logged out automatically. Confusingly, the COPTR login status at the top right then shows I'm still logged in (depending on which page I'm on, it seems), and I have to go through the whole login procedure (including failures) again. <br />
[[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 11:16, 21 November 2013 (UTC)<br />
<br />
: [http://meta.stackoverflow.com/a/1789 This StackOverflow answer] outlines some of the possible causes of problems with OpenID logins. I guess we could perhaps enable non-OpenID logins, although I wanted to avoid that for fear of making it even easier for the robots/spammers. Using Google as an OpenID provider is working fine for me. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 11:40, 21 November 2013 (UTC)<br />
<br />
:: Hmm ... the StackOverflow page links to an OpenID checker, but I'm a bit scared of it because it exposes all the info you enter there to a publicly viewable log ... Not keen on using Google either, because they're, well ... Google. The main point of course is that the success of COPTR will depend completely on community involvement, and if the login procedure is already such a stumbling block that's not going to help there. Also the procedure to get the OpenID isn't terribly straightforward, and this will scare off potential contributors as well. So I would consider adding a non-OpenID login procedure, perhaps augmented with a CAPTCHA to keep the robots at bay (e.g. like the Archiveteam Formats Wiki). [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 12:28, 21 November 2013 (UTC)<br />
<br />
::: Okay, I've enabled plain logins and registration. To make sure you retain the same user account, you should ensure it knows the right email address so you can use password recovery to login. Alternatively, I think you should be able to reset your PW from your Preferences page. The current setup is that you should see CAPTCHAs blocking all edits etc. unless you have confirmed your email address. We'll see how it goes, and if we have to tighten up things to avoid spam, so be it. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 12:45, 21 November 2013 (UTC)<br />
<br />
::::Great, thanks for the quick response! Apparently there's no way to set a pw from the Preferences page once you have an existing OpenID assigned to your user profile, and when I try to delete my OpenID I get an error that this isn't possible because no password is set, classic Catch 22 there! Anyway, I'll stick with the OpenID madness for a bit and if I get really fed up with it I may well just create a new COPTR account, probably the easiest solution all around. [[User:Johanvanderknijff|johanvanderknijff]] ([[User talk:Johanvanderknijff|talk]]) 13:14, 21 November 2013 (UTC)<br />
<br />
::::: I'm pretty sure that, if you completely log out first, and if there is an email associated with your account, you can use the 'Password reset' option to get access. [[User:Andy Jackson|Andy Jackson]] ([[User talk:Andy Jackson|talk]]) 15:28, 21 November 2013 (UTC)<br />
<br />
== changing Open Planets Foundation to Open Preservation Foundation ==<br />
<br />
Hi. The Main Page needs an update (changing Open Planets Foundation to Open Preservation Foundation). Of course, I do not have permission to do that :-) . Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 10:18, 27 November 2014 (UTC)<br />
* Good call, this is now fixed. Thanks!<br />
<br />
== DSPS (Digital Preservation Software Platform) is the same as Digital Preservation Software Platform ==<br />
<br />
Hi. I have updated the Digital Preservation Software Platform page. Afterwards I saw that there is a nearly identical page, DSPS (Digital Preservation Software Platform). Because "DSPS (Digital Preservation Software Platform)" is the better title, I copied the content from Digital Preservation Software Platform into DSPS (Digital Preservation Software Platform). On the page Digital Preservation Software Platform I replaced the content with a link to DSPS (Digital Preservation Software Platform). <br />
<br />
Can you please delete the page Digital Preservation Software Platform?<br />
<br />
Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 11:30, 8 January 2015 (UTC)<br />
<br />
== Merging duplicate entries ==<br />
<br />
I noticed that there are two entries for the CONTENTdm digital asset management system:<br />
<br />
* http://coptr.digipres.org/ContentDM<br />
* http://coptr.digipres.org/CONTENTdm<br />
<br />
I removed the limited content from the ContentDM page and made it redirect to the CONTENTdm page, using the MediaWiki #redirect option (see [http://www.mediawiki.org/wiki/Help:Redirects http://www.mediawiki.org/wiki/Help:Redirects]). Should this be the recommended practice, or would you rather simply delete one of the pages? If the latter, how do we request removal of a page?<br />
<br />
[[User:Danielle Plumer|Danielle Plumer]] ([[User talk:Danielle Plumer|talk]]) 03:00, 5 February 2015 (UTC)<br />
<br />
== Check these entries ==<br />
==== Workflow ====<br />
Hi. I think the entry http://coptr.digipres.org/Workflow:Workflow_for_ingesting_digitized_books_into_a_digital_archive needs to be changed or deleted. What do you think about it? Thanks. [[User:Chlara|Chlara ]] ([[User talk:Chlara|talk]]) 06:19, 20 April 2017 (UTC)</div>Chlarahttps://coptr.digipres.org/index.php?title=Exact_Audio_Copy&diff=3017Exact Audio Copy2017-04-20T06:08:20Z<p>Chlara: no image, Link deleted</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only. <br />
|image=<br />
|homepage=http://www.exactaudiocopy.de<br />
|license=Proprietary, Freeware<br />
|platforms= Windows XP, Vista, 7, 8, 10. No Linux or Mac version is planned, but EAC is reported to run in an emulation layer (Wine on Linux, Virtual PC with Windows 98 on Mac).<br />
}}<br />
<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:File Format Migration]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:Disk Imaging]]<br />
[[Category:Audio]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Exact Audio Copy (EAC) has three options for copying media: Copy Selected Tracks Uncompressed, Copy Selected Tracks Compressed, and Copy Image and Cue Sheet.<br />
<br />
In secure mode, EAC either reads every audio sector at least twice or relies on the extended error information that some drives return with the audio data. This technique detects non-identical sectors. If a read or sync error occurs, the program keeps re-reading the sector until eight of 16 retries are identical. Depending on the selected error recovery quality, this batch of 16 retries is repeated at most one, three or five times, so in the worst case a bad sector is read up to 82 times. Exact Audio Copy combines several technologies, such as multi-reading with verification and AccurateRip.<br />
<br />
==== Online Help ====<br />
http://wiki.hydrogenaud.io/index.php?title=Category:EAC_Guides<br />
<br />
http://www.digital-inn.de/forums/exact-audio-copy-english.14/<br />
<br />
==== Owner ====<br />
<br />
The software is owned and managed by Andre Wiethoff.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* http://campuspress.yale.edu/borndigital/2016/12/20/to-image-or-copy-the-compact-disc-digital-audio-dilemma/<br />
* http://journal.code4lib.org/articles/9581<br />
* https://archives.library.illinois.edu/staff-resources/digital-workflow/born-digital-workflow/accessioning/extract-from-source-media/ripping-audio-records-using-exact-audio-copy/<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on http://www.exactaudiocopy.de/en/index.php/resources/whats-new/whats-new/<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Exact_Audio_Copy&diff=3016Exact Audio Copy2017-04-20T06:04:00Z<p>Chlara: Use the structure provided in this template, do not change it!</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only. <br />
|image={{PAGENAMEE}}.png<br />
|homepage=http://www.exactaudiocopy.de<br />
|license=Proprietary, Freeware<br />
|platforms= Windows XP, Vista, 7, 8, 10. No Linux or Mac version is planned, but EAC is reported to run in an emulation layer (Wine on Linux, Virtual PC with Windows 98 on Mac).<br />
}}<br />
<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:File Format Migration]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:Disk Imaging]]<br />
[[Category:Audio]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Exact Audio Copy (EAC) has three options for copying media: Copy Selected Tracks Uncompressed, Copy Selected Tracks Compressed, and Copy Image and Cue Sheet.<br />
<br />
In secure mode, EAC either reads every audio sector at least twice or relies on the extended error information that some drives return with the audio data. This technique detects non-identical sectors. If a read or sync error occurs, the program keeps re-reading the sector until eight of 16 retries are identical. Depending on the selected error recovery quality, this batch of 16 retries is repeated at most one, three or five times, so in the worst case a bad sector is read up to 82 times. Exact Audio Copy combines several technologies, such as multi-reading with verification and AccurateRip.<br />
<br />
==== Online Help ====<br />
http://wiki.hydrogenaud.io/index.php?title=Category:EAC_Guides<br />
<br />
http://www.digital-inn.de/forums/exact-audio-copy-english.14/<br />
<br />
==== Owner ====<br />
<br />
The software is owned and managed by Andre Wiethoff.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* http://campuspress.yale.edu/borndigital/2016/12/20/to-image-or-copy-the-compact-disc-digital-audio-dilemma/<br />
* http://journal.code4lib.org/articles/9581<br />
* https://archives.library.illinois.edu/staff-resources/digital-workflow/born-digital-workflow/accessioning/extract-from-source-media/ripping-audio-records-using-exact-audio-copy/<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on http://www.exactaudiocopy.de/en/index.php/resources/whats-new/whats-new/<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=FIDO_(Format_Identification_for_Digital_Objects)&diff=3003FIDO (Format Identification for Digital Objects)2017-03-27T13:40:46Z<p>Chlara: fido logo</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=A PRONOM based, command line, file format identification tool written in Python<br />
|image=Fido1-75x36.jpg<br />
|homepage=http://www.openpreservation.org/software/fido<br />
|license=Apache 2.0 Open Source License<br />
|platforms=<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Metadata Extraction]]<br />
[[Category:File Format Identification]]<br />
<br />
<br />
= Description =<br />
FIDO (Format Identification for Digital Objects) is a simple format identification CLI tool for digital objects that uses [[DROID (Digital Record Object Identification)#PRONOM|PRONOM]] signatures converted to regular expressions. The functionality of FIDO is similar to [[DROID (Digital Record Object Identification)]] without the GUI.<br />
<br />
FIDO is free, Apache 2.0 licensed, easy to install, and runs on any platform with Python installed. Most importantly, FIDO is very fast.<br />
<br />
FIDO uses all available PRONOM signatures to identify digital objects. When an object cannot be identified by signature, FIDO tries to identify it by its file extension.<br />
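
The signature-then-extension fallback described above can be sketched roughly as follows. This is not FIDO's code: the two byte patterns are well-known magic numbers written by hand rather than converted PRONOM signatures, and <code>SIGNATURES</code>, <code>EXTENSIONS</code> and <code>identify</code> are hypothetical names chosen for the example.

```python
import re
from pathlib import PurePath

# Hypothetical signature table: format name -> regular expression over bytes.
# Hand-written magic numbers, not real PRONOM signatures.
SIGNATURES = {
    "PDF": re.compile(rb"\A%PDF-\d\.\d"),
    "PNG": re.compile(rb"\A\x89PNG\r\n\x1a\n"),
}

# Hypothetical extension table used only when no signature matches.
EXTENSIONS = {".pdf": "PDF", ".png": "PNG", ".txt": "Plain text"}


def identify(name: str, data: bytes):
    """Return (format, method), where method is 'signature', 'extension' or 'fail'."""
    for fmt, pattern in SIGNATURES.items():
        if pattern.search(data):
            return fmt, "signature"
    fmt = EXTENSIONS.get(PurePath(name).suffix.lower())
    if fmt is not None:
        return fmt, "extension"
    return None, "fail"


if __name__ == "__main__":
    print(identify("report.bin", b"%PDF-1.4\n..."))
    print(identify("notes.txt", b"hello"))
```

Recording the match method alongside the result keeps weaker extension-based guesses distinguishable from signature matches.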
<br />
FIDO outputs results in CSV format by default. The available output fields can be formatted at runtime to suit the user's requirements.<br />
<br />
FIDO supports custom signatures which are not (yet) available through the PRONOM registry.<br />
<br />
FIDO is able to identify container-based (compound) formats such as Office documents and includes functionality to update PRONOM signatures.<br />
<br />
[[FIDOO]] is a web-based service that acts as a simple-to-use front end to FIDO.<br />
<br />
=== History ===<br />
<br />
FIDO was originally developed in 2010 by Adam Farquhar of the British Library. The tool was adopted by the Open Preservation Foundation in 2011 and is currently maintained by Maurice de Rooij of the National Archives of the Netherlands (NANETH). In October 2011 NANETH successfully implemented FIDO as a webservice in the Dutch e-Depot.<br />
<br />
=== Future===<br />
<br />
See [http://wiki.opf-labs.org/display/KB/FIDO+roadmap roadmap]<br />
<br />
=== Contributing===<br />
<br />
Feedback about FIDO is very welcome. Please consult [http://wiki.opf-labs.org/display/KB/Getting%20Started%20with%20the%20OPF Getting Started with the OPF] for more information.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/openpreserve/fido<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/openpreserve/fido/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/openpreserve/fido/commits/master.atom</rss><br />
<br />
.</div>Chlarahttps://coptr.digipres.org/index.php?title=File:Fido1-75x36.jpg&diff=3002File:Fido1-75x36.jpg2017-03-27T13:39:51Z<p>Chlara: fido Logo</p>
<hr />
<div>fido Logo</div>Chlarahttps://coptr.digipres.org/index.php?title=Brunnhilde&diff=2934Brunnhilde2016-12-19T13:22:40Z<p>Chlara: Development Activity</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=A reporting companion to Siegfried<br />
|image=Brunnhilde.png<br />
|homepage=https://github.com/timothyryanwalsh/brunnhilde<br />
|license=MIT License<br />
|platforms=Linux, macOS, OS X<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Metadata Extraction]]<br />
[[Category:Content Profiling]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
<!-- Describe what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
Brunnhilde is a command-line utility that runs Siegfried against a specified directory or disk image, loads the results into a sqlite3 database, and queries the database to generate reports to aid in triage, arrangement, and description of digital archives. The program will also check for viruses unless specified otherwise, and will optionally run bulk_extractor against the given source. Reports include CSVs, a tree, and a human-readable HTML summary of the directory or disk image. All outputs are placed into a new directory named after the identifier passed to Brunnhilde as the last argument. Brunnhilde is also capable of exporting files from logical disk images utilizing many file systems, including HFS+.<br />
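
The load-then-query pattern described above can be sketched as follows. This is not Brunnhilde's code or schema: the sketch assumes a simplified identification CSV with only <code>filename</code> and <code>format</code> columns, and the table name <code>files</code> and function name <code>format_counts</code> are hypothetical.

```python
import csv
import io
import sqlite3


def format_counts(csv_text: str):
    """Load identification results into SQLite and count files per format."""
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified schema; a real identification report has more columns.
    conn.execute("CREATE TABLE files (filename TEXT, format TEXT)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["filename"], row["format"]) for row in reader]
    conn.executemany("INSERT INTO files VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT format, COUNT(*) AS n FROM files "
        "GROUP BY format ORDER BY n DESC, format"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = "filename,format\na.pdf,PDF\nb.pdf,PDF\nc.tif,TIFF\n"
    for fmt, count in format_counts(sample):
        print(fmt, count)
```

Once the results are in a database, further reports (for example by year or by MIME type) become additional queries rather than additional passes over the files.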
<br />
Dependencies include Python (tested in 2.7 and 3.5), Siegfried, ClamAV, bulk_extractor, Sleuth Kit, and HFSExplorer. Nearly all dependencies are already installed and compiled in BitCurator.<br />
<br />
For a GUI wrapper for Brunnhilde, see the [https://github.com/timothyryanwalsh/brunnhilde-GUI Brunnhilde GUI GitHub repo].<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/timothyryanwalsh/brunnhilde/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/timothyryanwalsh/brunnhilde/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/timothyryanwalsh/brunnhilde/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Rescarta&diff=2922Rescarta2016-10-10T07:33:40Z<p>Chlara: Use the structure provided in the template, do not change it!</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.<br />
|image=<br />
|homepage=http://www.ResCarta.org<br />
|license=Apache License v2.0<br />
|platforms=Linux, Windows, and OSX operating systems<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of <br />
<br />
existing categories first (see the Navigation sidebar on the left). The following are common category <br />
<br />
examples, remove those that don't apply --><br />
[[Category:Preservation System]]<br />
[[Category:Access]]<br />
[[Category:Personal Archiving]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and <br />
<br />
view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any <br />
<br />
content type, do not add a category. The following are common category examples, remove those that <br />
<br />
don't apply --><br />
[[Category:Audio]]<br />
[[Category:Document]]<br />
[[Category:Research Data]]<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on its digital preservation value. Keep it factual. --><br />
ResCarta Toolkit allows users to create digital archives from scans, digital photographs or recordings of analog objects. Metadata is added using simple forms and is written into each digital object according to the Library of Congress standards METS, MODS, MIX, and AudioMD. Audio files containing spoken words can be automatically transcribed. Textual content of documents and audio can be edited to create highly accurate transcriptions using graphical tools. Standard directory and file naming is produced during use. A complete Lucene™ index of metadata and textual content can be created for use in a fully functional web application for discovery and display of digital objects. A checksum validation tool is included to assure long term stability of the archive. <br />
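<br />
The general idea behind checksum validation can be sketched as follows. This is not ResCarta's implementation: the use of SHA-256, the manifest layout and the function names are assumptions chosen for illustration.<br />

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return relative paths that are missing or whose checksum changed."""
    root = Path(root)
    failed = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            failed.append(rel)
    return sorted(failed)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "page1.tif").write_bytes(b"image data")
    manifest = build_manifest(demo_dir)
    print(verify(demo_dir, manifest))   # nothing changed yet
    Path(demo_dir, "page1.tif").write_bytes(b"corrupted")
    print(verify(demo_dir, manifest))   # the altered file is reported
```

Recording the checksums once and re-verifying them periodically is what allows silent corruption to be noticed before it spreads to backup copies.<br />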
<br />
====Platform====<br />
The ResCarta Toolkit runs on Windows, Mac and Linux operating systems in single user or coordinated multiuser mode. <br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the <br />
<br />
effectiveness (or otherwise) of the tool. --><br />
* https://sourceforge.net/projects/rescarta/reviews?source=navbar<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or <br />
<br />
commits. --><br />
Years of archived releases can be found at...<br />
http://sourceforge.net/projects/rescarta/files/ResCarta%20Tools/<br />
<br />
<!-- Add the Ohloh.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|ohloh_id=rescarta<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=File:SIARDexcerpt_FuntionalPrinciple.JPG&diff=2918File:SIARDexcerpt FuntionalPrinciple.JPG2016-09-20T12:55:16Z<p>Chlara: SIARDexcerpt: Funtional Principle v0.0.6</p>
<hr />
<div>SIARDexcerpt: Funtional Principle v0.0.6</div>Chlarahttps://coptr.digipres.org/index.php?title=File:SIARDexcerpt.JPG&diff=2917File:SIARDexcerpt.JPG2016-09-20T12:53:54Z<p>Chlara: Logo SIARDexcerpt</p>
<hr />
<div>Logo SIARDexcerpt</div>Chlarahttps://coptr.digipres.org/index.php?title=SIARDexcerpt&diff=2916SIARDexcerpt2016-09-20T12:49:24Z<p>Chlara: Created page SIARDexcerpt</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=SIARDexcerpt is a Java-based application that searches and extracts individual records of SIARD files.<br />
|image=SIARDexcerpt.JPG<br />
|homepage=http://kost-ceco.ch/cms/index.php?siardexcerpt_de<br />
|license=GPL V3<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Access]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Access]]<br />
<br />
= Description =<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
SIARDexcerpt is a Java-based application that searches and extracts individual records of SIARD files, then converts them into a human readable form using a user-specific or a generic stylesheet. It is an open source application under a GPL v3+ licence. SIARDexcerpt uses unmodified components of other manufacturers by embedding them directly into the source code.<br />
<br />
<br />
=== Functional Principle ===<br />
SIARDexcerpt complies with the following requirements.<br />
<br />
* '''Initialisation:''' During initialisation the SIARD file is unpacked into the working directory and the desired configuration is copied to the predefined location. If required, the configuration is completed automatically according to metadata.xml and temporarily saved as SIARDexcerpt.conf.xml.<br />
* '''Search:''' After initialisation the matching lines are searched using grep. The asterisk (*) serves as a wildcard character. SIARDexcerpt copies the matching lines and outputs twelve predefined columns as a preview. A stylesheet permits the display of the result in Internet Explorer. The search result is saved into the configured output folder.<br />
* '''Extraction:''' The extraction can start once the primary key is known. The extracted result is saved into the configured output folder. A stylesheet permits the display of the result in Internet Explorer.<br />
* '''End:''' At the end, the temporary configuration file SIARDexcerpt.conf.xml and the unpacked SIARD file are deleted.<br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log. <br />
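<br />
The unpack, search and clean-up steps listed above can be sketched in a few lines. This is not SIARDexcerpt's code, which is written in Java and calls grep: the sketch assumes that the SIARD file can be read as a ZIP container, replaces grep with Python's <code>fnmatch</code> wildcard matching, and uses a made-up demo archive whose paths and rows are purely illustrative.<br />

```python
import fnmatch
import shutil
import tempfile
import zipfile
from pathlib import Path


def unpack(archive_path, work_dir):
    """Initialisation: unpack the archive into the working directory."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(work_dir)


def search_lines(work_dir, pattern):
    """Search: return (relative path, line) pairs matching the wildcard."""
    hits = []
    for path in sorted(Path(work_dir).rglob("*.xml")):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                text = line.rstrip("\n")
                # fnmatchcase matches the whole line, so "*term*" means
                # "the line contains term" ("?" and "[]" are also special).
                if fnmatch.fnmatchcase(text, pattern):
                    rel = path.relative_to(work_dir).as_posix()
                    hits.append((rel, text))
    return hits


def run(archive_path, pattern):
    """Unpack, search, and always delete the working directory at the end."""
    work_dir = tempfile.mkdtemp(prefix="siard_sketch_")
    try:
        unpack(archive_path, work_dir)
        return search_lines(work_dir, pattern)
    finally:
        shutil.rmtree(work_dir, ignore_errors=True)


def make_demo_archive(path):
    """Write a tiny stand-in archive; the layout and rows are hypothetical."""
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr(
            "content/schema0/table0/table0.xml",
            "<row><c1>1</c1><c2>Muster</c2></row>\n"
            "<row><c1>2</c1><c2>Beispiel</c2></row>\n",
        )
```

A call such as <code>run("demo.siard", "*Muster*")</code> would return the matching row together with the file it was found in; the <code>finally</code> clause mirrors the "End" step by removing the unpacked data even when the search fails.<br />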
<br />
[[File:SIARDexcerpt_FuntionalPrinciple.JPG|800px]]<br />
<br />
<br />
=== Third-party applications ===<br />
SIARDexcerpt uses unmodified components of other manufacturers by embedding them directly into the source code. Users of SIARDexcerpt are requested to adhere to these components' terms of licence. <br />
<br />
* SIARDexcerpt uses grep to search for and extract records. <br />
<br />
<br />
=== Read Me & Download ===<br />
SIARDexcerpt is a Java-based application that searches and extracts individual records of SIARD files, then converts them into a human readable form using a user-specific or a generic stylesheet.<br />
<br />
SIARDexcerpt, Copyright (C) 2016 Claire Roethlisberger (KOST-CECO).<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download SIARDexcerpt under https://github.com/KOST-CECO/SIARDexcerpt/releases or http://kost-ceco.ch/cms/index.php?siardexcerpt_de. For installation-instructions check the [http://github.com/KOST-CECO/SIARDexcerpt/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The Swiss Federal Archives (SFA) originally developed the SIARD format as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH Standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH Standards define guidelines for recurring applications and their results, as for example format definitions or procedural standards. The aim of those standards is to unify and thus facilitate the electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/SIARDexcerpt/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about SIARDexcerpt is very welcome at http://github.com/KOST-CECO/KOST-Simy/issues or kost-val[at]kost-ceco.ch<br />
<br />
.<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
.<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/SIARDexcerpt/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/KOST-CECO/SIARDexcerpt/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/KOST-CECO/SIARDexcerpt/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=User:Chlara&diff=2915User:Chlara2016-09-20T12:38:15Z<p>Chlara: KOST-Simy & SIARDexcerpt</p>
<hr />
<div>Chlara (alias Claire Röthlisberger) works for the KOST-CECO Agency.<br />
<br />
The Coordination Agency for the Preservation of Electronic Files (KOST-CECO) supports the digital preservation activities of public archives in Switzerland and Liechtenstein. <br />
Its field of action includes, among others, file format analysis, standardization, and preservation tool development.<br />
<br />
Feel free to surf to our webpage: http://kost-ceco.ch<br />
<br />
KOST-CECO Tools: <br />
* [[KOST-Val]]<br />
* [[KOST-Simy]]<br />
* [[SIARDexcerpt]]<br />
* [[CSV2SIARD]]</div>Chlarahttps://coptr.digipres.org/index.php?title=Bad_Peggy&diff=2912Bad Peggy2016-09-06T15:27:43Z<p>Chlara: URL update</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Scans for damaged images and photos. <br />
|image=BadPeggy.png<br />
|homepage=https://www.coderslagoon.com/index.php?lang=EN<br />
|license=GPLv3<br />
|platforms=Windows, Linux, OSX<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Image]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Image]]<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
Bad Peggy scans images ({{Format|JPEG}}, {{Format|PNG}}, {{Format|BMP}}, {{Format|GIF}}) for damage and other blemishes, and shows the results and files instantly. It lets you find such broken files quickly, inspect them, and then either delete them or move them to a different location. <br />
<br />
Requires Java 6 or higher. <br />
<br />
Licensed under the GPLv3.<br />
<br />
Quoted from the documentation:<br />
"Bad Peggy uses the Java Image IO (JIIO) library to examine image files. Its decoder emits warnings and errors while an image gets loaded. Thus the results do depend on it being up-to-date and also its changes in functionality. Bad Peggy checks though on startup if in general, well-known errors in images do get detected, i.e. if JIIO is still functioning in detecting damaged images as expected. What "damaged" truly means depends and can be<br />
*small difference from the official format, e.g. extra data appended after the actual image.<br />
*non-critical issues like unknown values, which do not affect displaying the image at all.<br />
*minor damage which only disturbs smaller parts of the image.<br />
*major damage, which causes the display to be corrupted after a particular position.<br />
*completely truncated or i.e. incomplete images.<br />
*errors at the beginning of the files, so that decoding can't even commence.<br />
*files which are not images at all, but accidentally carry the file extension.<br />
*image files which don't get recognized by the JIIO, but can be processed by other image viewers, e.g. if additional information is stored before the image data starts (which smarter or more aggressive decoders then skip).<br />
*an image which looks damaged because it got loaded as such, and saved again in another application - and thus is structurally fine.<br />
*an image which is logically damaged but does not cause complaints by the JIIO, although the flaws are clearly visible - this is one of the most problematic cases, since such files won't be detected by Bad Peggy - detection for such problems is difficult, you can compare this with a text editor loading and displaying a file with the word "text pr{cessor" in the, where the 'a' to '{' change was caused by a faulty transmission but the text still makes sense to the editor itself.<br />
In general it is not recommended to just discard every image reported as damaged but to check out if repairing or re-saving the file in other applications into a generally valid image format is possible."<br />
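<br />
Some of the damage categories quoted above can be illustrated with a much simpler check than a full decode. The sketch below is not how Bad Peggy works (it relies on the Java Image IO decoder); it is a rough heuristic that only looks at the JPEG start-of-image and end-of-image markers, and the category names are made up for illustration.<br />

```python
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def classify_jpeg(data):
    """Roughly classify raw file bytes by their JPEG start and end markers.

    Heuristic only: an end marker can also occur inside the file (for
    example in an embedded thumbnail), and a file with both markers in
    place can still be corrupt in between.
    """
    if not data.startswith(JPEG_SOI):
        return "not_jpeg"        # wrong content behind a .jpg extension
    if data.endswith(JPEG_EOI):
        return "ok"              # markers are in place, nothing more
    if JPEG_EOI in data:
        return "trailing_data"   # extra bytes after an end marker
    return "truncated"           # no end marker at all


print(classify_jpeg(b"\xff\xd8payload\xff\xd9"))
print(classify_jpeg(b"\xff\xd8payload\xff\xd9extra"))
print(classify_jpeg(b"\xff\xd8payload"))
print(classify_jpeg(b"GIF89a"))
```

A marker check like this is cheap enough to run over a whole collection, but it cannot see the "logically damaged" images described in the quote; only an actual decode, and sometimes only visual inspection, reveals those.<br />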
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* '''KOST-CECO:''' Used in [[KOST-Val]] for the JPEG validation module. [[KOST-Val]] evaluates the error message "Not a JPEG file" further.<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
New in Version 2.0:<br />
* Support for PNG, BMP and GIF images.<br />
* Simplified status bar.<br />
* Visual error differentiation changed to grayscale.<br />
* Error differentiation is now done in gray tones.<br />
* Message box button text is now translated.<br />
* Minor bug fixes and cosmetic changes.<br />
<br />
Bad Peggy Sources: https://www.coderslagoon.com/files/badpeggy20_src.tar.xz<br />
<br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=7-Zip&diff=29117-Zip2016-08-25T07:43:11Z<p>Chlara: Use the structure provided in the template, do not change it!</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=7-Zip is a file archiver with a high compression ratio<br />
|image=<br />
|homepage=http://www.7-zip.org/<br />
|license=7-Zip is open source software [http://www.7-zip.org/]. Most of the source code is under the GNU LGPL license [http://www.7-zip.org/]. You can use 7-Zip on any computer, including a computer in a commercial organization [http://www.7-zip.org/]. You don't need to register or pay for 7-zip [http://www.7-zip.org/].<br />
|platforms=7-zip can be utilized on multiple platforms including Windows 7, Vista, XP, 2008, 2003, 2000, NT, ME, and 98 [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp].<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:File Compression]]<br />
[[Category:Rendering]]<br />
[[Category:Container]]<br />
[[Category:Disk Image]]<br />
<br />
= Description =<br />
7-Zip is a file archiver with a high compression ratio [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. You can use 7-zip on any computer, including a computer in a commercial organization [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. You don't need to register or pay for 7-zip [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. 7-zip works for Windows 7, Vista, XP, 2008, 2003, 2000, NT, ME, and 98 [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. And there is a port of the command line version to Linux/Unix [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. Most of the source code is under the GNU LGPL license [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. The unRar code is under a mixed license with GNU LGPL + unRAR restrictions [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]. Check the license for details [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp].<br />
<br />
=== Platform ===<br />
7-zip can be utilized on multiple platforms including Windows 7, Vista, XP, 2008, 2003, 2000, NT, ME, and 98 [http://www.sourceforge.net/projects/sevenzip/editorial/?source=psp]<br />
<br />
= User Experiences =<br />
7zip can also be used to open an ISO Image File[http://en.wikipedia.org/wiki/ISO_image].<br />
<br />
= Development Activity =<br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=7-Zip<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=SIARD_Suite&diff=2910SIARD Suite2016-08-25T07:39:27Z<p>Chlara: User Experiences: User = Daniel</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=SIARD Suite is a freeware tool for converting the contents of relational databases into the SIARD format. <br />
|image=<br />
|homepage=http://www.bar.admin.ch/dienstleistungen/00823/index.html?lang=en <br />
|license=Vendor-specific licence agreement. Redistribution prohibited<br />
|platforms=should run under Java 1.6 on any operating system<br />
|formats_out={{Format|SIARD}}<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:File Format Migration]]<br />
[[Category:Database]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:File Format Migration]]<br />
[[Category:Database]]<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
SIARD Suite extracts the contents from a relational database and saves them in the SIARD format which is appropriate for archiving. <br />
<br />
In this format, the data is stored for future generations and can be uploaded into a new database that may differ from the original. <br />
<br />
This method therefore allows data to be retained independently of the original database and reused in the future in modern database systems. <br />
<br />
<br />
==== SIARD format ====<br />
SIARD stands for Software Independent Archiving of Relational Databases. The Swiss Federal Archives (SFA) originally developed the SIARD format as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH Standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH Standards define guidelines for recurring applications and their results, as for example format definitions or procedural standards. The aim of those standards is to unify and thus facilitate the electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations. <br />
<br />
<br />
==== Which types of databases does SIARD Suite support? ====<br />
* Oracle<br />
* Microsoft SQL Server<br />
* MySQL<br />
* DB/2<br />
* Microsoft Access<br />
<br />
<br />
==== How can I order SIARD Suite? ====<br />
You can order SIARD Suite using the form under http://www.bar.admin.ch/dienstleistungen/00823/00825/index.html?lang=en. <br />
<br />
<br />
==== Intellectual Property Rights ====<br />
The SIARD Suite is a development of the Swiss Federal Archives. All rights rest with the Swiss Federal Archives.<br />
<br />
The SIARD Suite relies on the following component of other manufacturers:<br />
<br />
* JavaHelp 2.0.05 (from http://java.sun.com/products/javahelp, License: in the SIARD folder under doc/JavaHelp LICENSE.HTML)<br />
<br />
To ease the installation process, this component is delivered on the SIARD CD as JAR files.<br />
<br />
Users of the SIARD Suite are requested to honour these components' license requirements, which can be found in the doc folder.<br />
<br />
. <br />
<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. --><br />
* '''Daniel:'''<br />
**This user experience covers only SiardEdit, the GUI version of SIARD Suite. The program runs fast, and the minimalistic user interface and good documentation make it easy to use. <br />
**If SIARD cannot create the .siard file from a database, the program stops and the user only gets the message "not successful". A detailed description of the error and its cause would be desirable. A useful error message is only available when the error is detected by the database itself; in that case SIARD prints the database's error message.<br />
**Searching within a .siard file is not conveniently possible. The user has to load the .siard file into a regular database and then work and search in that database in the usual way.<br />
**Using admin accounts to download a database will cause errors. The best approach is to create an archive account that has read rights to the part of the database that should be archived.<br />
**When converting an MS Access database into a .siard file, SIARD does not store any information about the primary keys. Only tables can be stored in SIARD; views and similar objects cannot, so it is important to make sure the database can be used and read without any such extras.<br />
<br />
.<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<br />
<!-- Add the Ohloh.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=MediaConch&diff=2909MediaConch2016-08-25T07:35:16Z<p>Chlara: User Experiences: User = Daniel</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose= MediaConch is an implementation checker, policy checker and fixer for audiovisual files with a focus on Matroska, LPCM and FFV1.<br />
|homepage= https://mediaarea.net/MediaConch/ <br /><br />
[https://github.com/MediaArea/MediaConch GitHub Repository]<br />
|license= GNU General Public License 3.0 (GPLv3 or later), Mozilla Public License (MPLv2 or later).<br />
|platforms= Windows 7 or later, MAC OS 10.5 or later, Linux/Unix, Web Application<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Policy]]<br />
[[Category:Validation]]<br />
[[Category:File Format Identification]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
[[Category: Video]]<br />
[[Category: Audio]]<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
MediaConch is an extensible, open source software project consisting of an implementation checker, policy checker, reporter, and fixer that targets preservation-level audiovisual files (specifically Matroska, Linear Pulse Code Modulation (LPCM) and FF Video Codec 1 (FFV1)) for use in memory institutions, providing detailed and batch-level conformance checking via an adaptable and flexible application program interface accessible by the command line, a graphical user interface, or a web-based shell.<br />
MediaConch also provides a tool for creating custom policies.<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* '''Daniel:'''<br />
** MediaConch runs fast, and the output of the implementation and policy checker is not extensive but consists of useful information.<br />
** The tool for creating custom policies provides many options and cases, but it is complex to use and presupposes detailed knowledge of the different file formats.<br />
** The documentation for building MediaConch from source code under Windows is poor: it simply says to use Visual Studio.<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/MediaArea/MediaConch/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/MediaArea/MediaConch/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/MediaArea/MediaConch/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=MediaConch&diff=2908MediaConch2016-08-25T07:31:48Z<p>Chlara: Development Activity: link to Github</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose= MediaConch is an implementation checker, policy checker and fixer for audiovisual files with a focus on Matroska, LPCM and FFV1.<br />
|homepage= https://mediaarea.net/MediaConch/ <br /><br />
[https://github.com/MediaArea/MediaConch GitHub Repository]<br />
|license= GNU General Public License 3.0 (GPLv3 or later), Mozilla Public License (MPLv2 or later).<br />
|platforms= Windows 7 or later, MAC OS 10.5 or later, Linux/Unix, Web Application<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Policy]]<br />
[[Category:Validation]]<br />
[[Category:File Format Identification]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
[[Category: Video]]<br />
[[Category: Audio]]<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
MediaConch is an extensible, open source software project consisting of an implementation checker, policy checker, reporter, and fixer that targets preservation-level audiovisual files (specifically Matroska, Linear Pulse Code Modulation (LPCM) and FF Video Codec 1 (FFV1)) for use in memory institutions, providing detailed and batch-level conformance checking via an adaptable and flexible application program interface accessible by the command line, a graphical user interface, or a web-based shell.<br />
MediaConch also provides a tool for creating custom policies.<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* MediaConch runs fast, and the output of the implementation and policy checker is not extensive but consists of useful information.<br />
* The tool for creating custom policies provides many options and cases, but it is complex to use and presupposes detailed knowledge of the different file formats.<br />
* The documentation for building MediaConch from source code under Windows is poor: it simply says to use Visual Studio.<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/MediaArea/MediaConch/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below the last 3 release feeds:<br />
<rss max=3>https://github.com/MediaArea/MediaConch/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below the last 5 commits:<br />
<rss max=5>https://github.com/MediaArea/MediaConch/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Glacier_(Amazon)&diff=2904Glacier (Amazon)2016-08-08T10:34:40Z<p>Chlara: no image</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
{{Infobox_tool<br />
|purpose=Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.<br />
|image=<br />
|homepage=https://aws.amazon.com/glacier/<br />
|license=<br />
|platforms=<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Storage]]<br />
[[Category:Service]]<br />
[[Category:Backup]]<br />
<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. Customers can reliably store large or small amounts of data for a significant savings compared to on-premises solutions. To keep costs low, Amazon Glacier is optimized for infrequently accessed data where a retrieval time of several hours is suitable.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=Glacier (Amazon)<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Glacier_(Amazon)&diff=2903Glacier (Amazon)2016-08-08T10:33:29Z<p>Chlara: Use the structure provided in the template, do not change it!</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
{{Infobox_tool<br />
|purpose=Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.<br />
|image=<br />
|homepage=https://aws.amazon.com/glacier/<br />
|license=<br />
|platforms=<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Storage]]<br />
[[Category:Service]]<br />
[[Category:Backup]]<br />
<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. Customers can reliably store large or small amounts of data for a significant savings compared to on-premises solutions. To keep costs low, Amazon Glacier is optimized for infrequently accessed data where a retrieval time of several hours is suitable.<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=Glacier (Amazon)<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=ArchivesSpace&diff=2902ArchivesSpace2016-08-08T10:25:00Z<p>Chlara: Description and Development Activity</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=ArchivesSpace is the next-generation web-based archives information management system, designed by archivists and supported by diverse archival repositories. <br />
|image=<br />
|homepage=http://www.archivesspace.org/<br />
|license=Educational Community License, version 2.0.<br />
|platforms=<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as --><br />
[[Category:Access]]<br />
[[Category:Metadata Processing]]<br />
<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
ArchivesSpace is an open source web application for managing archives information. The application is designed to support core functions in archives administration such as accessioning; description and arrangement of processed materials including analog, hybrid, and born-digital content; management of authorities (agents and subjects) and rights; and reference service. The application supports collection management through collection management records, tracking of events, and a growing number of administrative reports. The application also functions as a metadata authoring tool, enabling the generation of EAD, MARCXML, MODS, Dublin Core, and METS formatted data.<br />
<br />
ArchivesSpace dates from June 2009, when representatives of New York University, the University of Illinois Urbana-Champaign, the University of California San Diego, and the Andrew W. Mellon Foundation agreed to integrate the Archivists' Toolkit and Archon into a single application in order to increase overall functionality and to optimize sustainability. The Andrew W. Mellon Foundation provided generous funding for the first two phases of the ArchivesSpace program. A planning phase ran from January 2010 to June 2011, during which functional specifications were drafted, a business plan was formulated, and an organizational home was secured. A development phase ran from July 2011 to September 2013, culminating in the release of ArchivesSpace 1.0 on September 30, 2013. There have been nine releases of ArchivesSpace since the 1.0 release, all of which have served both to advance the integration of the Archivists' Toolkit and Archon and to introduce new functionality not present in either of those precursor applications. Source: http://archivesspace.org/overview<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/archivesspace/archivesspace/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/archivesspace/archivesspace/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/archivesspace/archivesspace/commits/master.atom</rss><br />
<br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Demystify&diff=2886Demystify2016-07-12T06:49:59Z<p>Chlara: Development Activity</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=Analysis and automatic generation of summary information from DROID output<br />
|image=<br />
|homepage=https://github.com/ross-spencer/droid-sqlite-analysis<br />
|license=Open source (see URL above)<br />
|platforms=sqlite + Python + text/html<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Metadata Extraction]]<br />
[[Category:Content Profiling]]<br />
[[Category:De-Duplication]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
<br />
== Description ==<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
Engine for analysis of [https://github.com/digital-preservation/droid DROID] CSV export files, [https://github.com/richardlehane/siegfried Siegfried] YAML export files, and Siegfried 'DROID compatible' output. The tool has three purposes: to break the exports into their components and store them within a table in a SQLite database; to create additional columns to augment the output where useful; and to query the SQLite database, outputting results in a readable form useful for analysis by researchers and archivists within digital preservation departments in memory institutions.<br />
<br />
The tool provides archivist definitions for each of the sections output; these definitions are customisable. The tool also supports output of statistics about files that may require further triage or may not be appropriate for long-term preservation based on institutional rules, in the form of a blacklist. It also analyses file names and directory names for non-ASCII characters and for characteristics that may cause problems across file systems, based on known Microsoft rules: http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx<br />
<br />
The engine can generate a list of file paths for files that may present digital preservation risks (Rogues) or for files that on the surface, i.e. via identification alone, look okay (Heroes). These listings can be used in conjunction with [http://manpages.ubuntu.com/manpages/trusty/man1/rsync.1.html rsync] to isolate the two sets from one another, making them more flexible to work with.<br />
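<br />
The file name analysis described above can be illustrated with a minimal Python sketch. This is not the tool's own code: the function name <code>filename_issues</code> and the issue labels are hypothetical, and only a few of the well-known Microsoft naming rules are covered (reserved characters, reserved device names, and trailing spaces or periods).<br />

```python
import re

# Device names reserved on Windows regardless of extension.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)

# Characters that Windows does not allow in file names.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*]')


def filename_issues(name):
    """Return a list of potential cross-file-system problems for one name.

    Hypothetical helper for illustration only; it inspects a single file or
    directory name, not a full path.
    """
    issues = []
    if not name.isascii():
        issues.append("non-ascii characters")
    if ILLEGAL_CHARS.search(name):
        issues.append("character not allowed on Windows")
    # Reserved names apply to the part before the first period, e.g. "aux.txt".
    if name.split(".")[0].upper() in RESERVED_NAMES:
        issues.append("reserved Windows device name")
    if name.endswith((" ", ".")):
        issues.append("ends with space or period")
    return issues


if __name__ == "__main__":
    for candidate in ["report.pdf", "café.txt", "aux.txt", "what?.doc", "notes."]:
        print(candidate, filename_issues(candidate))
```

A name that returns an empty list passes these particular checks; it may still be problematic for reasons not covered here, such as path length limits.<br />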
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. --><br />
*Blog entries from the tool author, Ross Spencer:<br />
**'''[2014-06-03]''' [http://www.openplanetsfoundation.org/blogs/2014-06-03-analysis-engine-droid-csv-export Describing the creation and purpose of the tool.]<br />
**'''[2015-08-25]''' [http://openpreservation.org/blog/2015/08/25/hero-or-villain-a-tool-to-create-a-digital-preservation-rogues-gallery/ Using the output of the tool to create a digital preservation rogues gallery.]<br />
**'''[2016-05-23]''' [http://openpreservation.org/blog/2016/05/23/whats-in-a-namespace-the-marriage-of-droid-and-siegfried-analysis/ The integration of Siegfried output for consistent and repeatable reporting.]<br />
**'''[2016-05-24]''' [http://openpreservation.org/blog/2016/05/24/while-were-on-the-subject-a-few-more-points-of-interest-about-the-siegfrieddroid-analysis-tool/ Creating a multi-lingual consistent, digital preservation dialect and exploring alternative methods of format identification using Siegfried's capabilities.]<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/ross-spencer/droid-sqlite-analysis/commits<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/ross-spencer/droid-sqlite-analysis/releases.atom</rss><br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/ross-spencer/droid-sqlite-analysis/commits/master.atom</rss><br />
<br />
<!-- Add the Ohloh.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=FITS_(File_Information_Tool_Set)&diff=2885FITS (File Information Tool Set)2016-07-12T06:43:37Z<p>Chlara: Development Activity: Release Feed</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.<br />
|image=<br />
|homepage=http://fitstool.org<br />
|license=[http://www.gnu.org/licenses/lgpl.html GNU Lesser General Public License]<br />
|platforms=Windows or Unix<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:File Format Identification]]<br />
[[Category:Validation]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:Encryption Detection]]<br />
<br />
= Description =<br />
[http://fitstool.org FITS] allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. It does this by incorporating a range of mostly third-party open source tools, normalising and consolidating their output.<br />
<br />
=== Provider ===<br />
Harvard Library<br />
<br />
=== Platform and interoperability ===<br />
FITS is written in Java and is compatible with Java 1.6 or higher. <br />
It uses six external tools: <br />
* [[JHOVE (Harvard Object Validation Environment)| JHOVE]]<br />
* [[ExifTool]]<br />
* [[Metadata Extraction Tool]]<br />
* [[DROID_(Digital_Record_Object_Identification)|DROID]]<br />
* [http://web.archive.org/web/20061106114156/http://schmidt.devlib.org/ffident/index.html FFIdent]<br />
* [http://unixhelp.ed.ac.uk/CGI/man-cgi?file File Utility]<br />
<br />
It also uses a few Harvard Library-created tools and many open source libraries.<br />
Instructions for command line use are given for Windows and Unix.<br />
<br />
=== Functional notes ===<br />
FITS acts as a wrapper, invoking and managing the output from several other open source tools. Output from these tools is converted into a common format, compared with one another and consolidated into a single XML output file. Technical metadata is only output (and included in the consolidation process) for tools that were able to identify the file. All other output is discarded.<br />
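<br />
The consolidation behaviour described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FITS code (FITS itself is written in Java and works on XML): the function <code>consolidate</code>, the dictionary layout, and the tool names in the example are all hypothetical.<br />

```python
def consolidate(tool_outputs):
    """Merge per-tool results, keeping only tools that identified the file.

    Each item in tool_outputs is assumed to be a dict with the keys
    "tool", "format" (None when the tool could not identify the file)
    and "metadata" (a dict of technical metadata values).
    """
    # Output from tools that failed to identify the file is discarded.
    identified = [o for o in tool_outputs if o.get("format")]

    # Compare the identifications: more than one distinct format is a conflict.
    formats = sorted({o["format"] for o in identified})

    # Collect every reported value per metadata key, with the reporting tool.
    metadata = {}
    for output in identified:
        for key, value in output.get("metadata", {}).items():
            metadata.setdefault(key, []).append((output["tool"], value))

    return {"formats": formats, "conflict": len(formats) > 1, "metadata": metadata}


if __name__ == "__main__":
    example = [
        {"tool": "tool_a", "format": "PDF", "metadata": {"pages": 3}},
        {"tool": "tool_b", "format": None, "metadata": {"pages": 4}},
        {"tool": "tool_c", "format": "PDF", "metadata": {"pages": 3, "title": "x"}},
    ]
    print(consolidate(example))
```

In this sketch the page count reported by <code>tool_b</code> never reaches the result, because that tool did not identify the file.<br />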
<br />
=== Documentation and user support ===<br />
Documentation exists in the form of a user manual and a more technical developer manual.<br />
The project actively uses the fits-users Google group, which had 30 members and was active as of January 2012.<br />
The FITS web site links to a [https://github.com/harvard-lts/fits GitHub site] that includes the source code and an issue tracker.<br />
<br />
=== Usability ===<br />
FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience.<br />
<br />
=== Expertise required ===<br />
Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.<br />
<br />
=== Standards compliance ===<br />
FITS outputs in XML format. A detailed description of the FITS-XML can be found [http://projects.iq.harvard.edu/fits/fits-xml here] and an analysis of the output data [http://projects.iq.harvard.edu/fits/understanding-output here].<br />
<br />
=== Influence and take-up ===<br />
The FITS website shows over 2000 downloads of the software. <br />
The tool was designed for and is in use at the Harvard Library [http://hul.harvard.edu/ois/systems/drs/ Digital Repository Service].<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* '''DNB (German National Library):'''<br />
** FITS v0.6.1 with modified [[JHOVE (Harvard Object Validation Environment)|JHOVE 1.11]], Gentoo Linux, Java environment, Tomcat application server<br />
** Since the end of 2012, DNB has used the FITS library as part of its risk management within the automated ingest process. At present, more than 1500 files are examined by FITS daily.<br />
*** The purpose of the risk management and its implementation with metadata tools like FITS or JHOVE is to facilitate automatic technical quality checking (bitstream integrity and validation) of each digital publication. Furthermore, the analysis is aimed at recognising technical restrictions such as DRM at an early stage, which hinder or even prevent the task of long-term preservation and use of the digital objects.<br />
*** The extracted technical metadata (the FITS output) are used further for future long-term preservation measures such as format migration and are stored and managed in the metadata management of the long-term archive of the DNB. The capture of these metadata is essential in order to execute targeted migration measures of files in endangered formats.<br />
*** FITS also offers significant benefit in the form of easily configurable standardisation of the different tool outputs into the FITS format using XSLT. The DNB has used this function to adapt the FITS output to its own requirements, e.g. incorporating other metadata elements not included in the FITS distribution into the standardisation. <br />
*** A further adjustment made by the DNB is the integration of a DNB tool to analyse files in ePub format.<br />
<br />
* '''ZBW (German National Library of Economics):''' [https://wiki.dnb.de/display/NESTOR/ZBW+user+experience+with+FITS Link to the user experience of the ZBW]<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
FITS 0.2.0 was first released as open source in July 2009. As of April 2014 the latest release was version 0.8, released in January 2014. The tool was created to be used in Harvard's Digital Repository Service, and development is active and ongoing.<br />
<br />
All development activity is visible on GitHub: http://github.com/harvard-lts/fits/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/harvard-lts/fits/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/harvard-lts/fits/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=fits<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=PET_(PERICLES_Extraction_Tool)&diff=2882PET (PERICLES Extraction Tool)2016-05-31T11:49:56Z<p>Chlara: User Experiences : bullet list ans User Wittmann</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=A tool to capture contextual information in a sheer curation scenario<br />
|homepage=https://github.com/pericles-project/pet<br />
|license=Apache 2<br />
|platforms=Cross-platform<br />
|language=Java<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<!-- Add one or more categories to describe the function of the tool, such as:<br />
[[Category:Metadata Extraction]] or [[Category:Preservation System]] or [[Category:Backup]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left) --><br />
[[Category:Dependency Analysis]]<br />
[[Category:Metadata Extraction]]<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses, such as:<br />
[[Category:Audio]] or [[Category:Document]] or [[Category:Research Data]]<br />
Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. --><br />
<br />
<br />
== Description ==<br />
The PERICLES Extraction Tool (PET) is open source (Apache 2 licensed) Java software for extracting significant information from the environment where digital objects are created and modified. This information supports object use and reuse, e.g. for better long-term preservation of data. For the main part of the metadata extraction, PET uses [http://coptr.digipres.org/Tika Apache TIKA], along with a number of other modules:<br />
<br />
* CPU specification snapshot<br />
* CPU usage monitoring<br />
* Calculate file checksum<br />
* Create custom executable command (file dependent)<br />
* Create custom executable command (file independent)<br />
* Directory Monitor Module<br />
* FQDN<br />
* File identification<br />
* File store information (java.nio.file)<br />
* File store information (sigar)<br />
* File system information snapshot<br />
* Google chrome opened tabs monitoring<br />
* Graphic System properties snapshot<br />
* Graphic card information module<br />
* Installed software snapshot<br />
* Java installation information snapshot<br />
* LSOF use monitor<br />
* List of network interfaces<br />
* Log expression grep<br />
* [[MediaInfo]]<br />
* Memory monitoring<br />
* Network information<br />
* OS X Spotlight Command module<br />
* Office document dependencies<br />
* Operating System properties snapshot<br />
* PDF Font dependencies<br />
* Posix file information monitoring<br />
* Process parameter<br />
* Process statistics monitoring<br />
* Regex text search<br />
* Screenshot module<br />
* System resources snapshot<br />
* System swap monitoring<br />
* TCP statistics monitoring<br />
* Uptime<br />
* Who (user, host, device, time)<br />
* Windows Handle monitoring daemon<br />
* XML xPath expression <br />
<br />
The Tool was developed entirely for the PERICLES EU project http://www.pericles-project.eu/ by Fabio Corubolo, University of Liverpool, and Anna Eggers, Göttingen State and University Library.<br />
<br />
A more detailed description can be found in this [http://pericles-project.eu/blog/post/metadata%20extraction,%20environment%20information blog post].<br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
*'''User Wittmann:''' PET is easy to use and easy to install, and it does everything it promises. The user interface is self-explanatory and the program runs fast. However, the information necessary for long-term storage is provided by Apache TIKA. If the other modules of PET are not expressly needed, it is better and faster to use Apache TIKA directly instead of the PERICLES Extraction Tool.<br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
{{Infobox_tool_details<br />
|releases_rss=https://github.com/pericles-project/pet/releases.atom<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Wish_list_of_tools_to_add_to_COPTR&diff=2872Wish list of tools to add to COPTR2016-04-26T07:24:22Z<p>Chlara: strike verapdf</p>
<hr />
<div>This is a scratch space for adding lists of tools that should be added to COPTR. Please add your wish list of tools below. If anyone subsequently adds tool entries for tools listed here, please indicate that by adding strike tags around them on this page!<br />
<br />
==Tools to add==<br />
*Forensics tools wish list from Cal Lee (more to come):<br />
**Guymager, fiwalk, bulk_extractor, BitCurator environment<br />
*Other misc tools to add:<br />
**<strike>Siegfried [http://www.openplanetsfoundation.org/blogs/2014-09-27-siegfried-pronom-based-file-format-identification-tool]</strike><br />
**Two tools to add here: http://www.itforarchivists.com/<br />
**Web recorder https://webrecorder.io/<br />
**http://sox.sourceforge.net/<br />
**http://www.mplayerhq.hu/<br />
**https://hashcat.net/oclhashcat/ and https://hashcat.net/forum/thread-3818.html<br />
**http://www.videolan.org/<br />
**http://www.squared5.com/<br />
**http://docrefinery.org/<br />
**http://disktype.sourceforge.net/<br />
**<strike>https://github.com/pericles-project/pet and http://pericles-project.eu/blog/post/metadata%20extraction,%20environment%20information/&utm_source=dlvr.it&utm_medium=twitter</strike><br />
**http://lifehacker.com/5992991/infogram-generates-beautiful-infographics-from-custom-data<br />
**http://mpc-hc.org/ Referenced briefly here: https://groups.google.com/d/msg/digital-curation/k-B7g80lOfQ/45oXNw-CUrkJ<br />
**http://viewshare.org/ and http://blogs.loc.gov/digitalpreservation/2014/10/new-season-new-viewshare/<br />
**There are a number of tools listed at the bottom of this page: http://wiki.dpconline.org/index.php?title=Institutional_readiness and this page: http://wiki.dpconline.org/index.php?title=Digital_preservation_risks that should be added to COPTR in the [[:Category:Organisational Audit|Organisational Audit category]]<br />
**There are a number of tools listed at the bottom of this page: http://wiki.dpconline.org/index.php?title=Benefits that should be added to COPTR in the [[:Category:Benefits|Benefits category]]<br />
**Lots of tools listed here http://wiki.opf-labs.org/display/CDP/Home that could be added in the [[:Category:Costs|Costs category]]. There is probably also a lot of overlap with this list: http://4cproject.eu/community-resources/related-projects<br />
**https://github.com/F-Secure/Sulo<br />
**<strike>http://verapdf.org/home/</strike><br />
*DOCX and other document related stuff:<br />
**http://www.documentliberation.org/projects/<br />
**https://www.openhub.net/p/officeotron (no longer maintained as of 2011)<br />
**http://ooxmlvalidator.codeplex.com/ (no longer maintained as of 2009)<br />
**http://www.microsoft.com/en-us/download/details.aspx?id=5124 (Microsoft SDK includes validation support), also see http://blogs.msdn.com/b/ericwhite/archive/2010/03/04/validate-open-xml-documents-using-the-open-xml-sdk-2-0.aspx<br />
**http://packageexplorer.codeplex.com/<br />
**http://johnmacfarlane.net/pandoc/index.html<br />
**https://python-docx.readthedocs.org/en/latest/<br />
**http://textract.readthedocs.org/en/latest/#<br />
*More tools....<br />
**http://minezy.org/<br />
**http://www.avalonmediasystem.org/project<br />
<br />
==Stubs to fix, expand or remove==<br />
*This seems to reference different tools so should be expanded into a number of tool entries http://coptr.digipres.org/Windows_IR/CF_Tools<br />
<br />
==User experience links to add==<br />
*[http://www.mnhs.org/preserve/records/tools.php http://www.mnhs.org/preserve/records/tools.php]<br />
*[https://www.lib.umn.edu/dp/guides https://www.lib.umn.edu/dp/guides]</div>Chlarahttps://coptr.digipres.org/index.php?title=VeraPDF&diff=2871VeraPDF2016-04-26T07:20:34Z<p>Chlara: Quoting from the website, Logo, ...</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose= PDF/A validation tool<br />
|image=VeraPDF.png<br />
|homepage= http://www.verapdf.org<br />
|license= [http://www.gnu.org/licenses/quick-guide-gplv3.html GPL v3+], [https://www.mozilla.org/en-US/MPL/2.0/ MPL v2+]<br />
|platforms= Windows, Mac, Linux<br />
|formats_in={{Format|PDF}}<br />
}}<br />
<!-- Note that to use the image field, you should leave the value as {{PAGENAMEE}}.png (or similar) and upload a copy of the image. Hot-linking is not supported. If you don't want an image, just remove that line. --><br />
<br />
<br />
[[Category:Validation]]<br />
[[Category:Document]]<br />
<br />
<br />
<br />
== Description ==<br />
veraPDF is an open source PDF/A validation tool for Windows, Mac and Linux. The software is still in development, which is why only PDF/A-3b, PDF/A-2b, PDF/A-1b and PDF/A-1a can currently be validated.<br />
Results are exported using either the HTML or XML format.<br />
<br />
<br />
Quoting from the [http://www.verapdf.org website]: "<br />
<br />
<br />
====About veraPDF====<br />
Designed to meet the needs of digital preservationists, and supported by leading members of the PDF software developer community, veraPDF is a purpose-built, open source, permissively licensed file-format validator covering all PDF/A parts and conformance levels.<br />
<br />
<br />
'''The veraPDF consortium'''<br />
<br />
Led by the [http://openpreservation.org/ Open Preservation Foundation (OPF)] and the [http://www.pdfa.org/ PDF Association], the Consortium’s mission is to develop the definitive, open-source validator for PDF/A, and to build a community to maintain the project in the long term.<br />
<br />
<br />
'''Funded by the European Commission’s PREFORMA Project'''<br />
<br />
veraPDF is funded by the [http://www.preforma-project.eu/ PREFORMA] project. PREFORMA – PREservation FORMAts for culture information/e-archives, is a Pre-Commercial Procurement (PCP) project co-funded by the European Commission under its FP7-ICT Programme. The project’s main aim is to address the challenge of implementing standardised file formats for preserving digital objects in the long term, giving memory institutions full control over the acceptance and management of preservation files into digital repositories.<br />
<br />
<br />
====PDF/A validation====<br />
The specification for PDF/A is a set of restrictions and requirements applied to the “base” PDF standards (PDF 1.4 for PDF/A-1 and ISO 32000 for PDF/A-2 and PDF/A-3) plus a specific set of 3rd party standards. The veraPDF subsystems include:<br />
<br />
<br />
'''veraPDF Implementation Checker'''<br />
<br />
The Implementation Checker parses and analyzes PDF documents. It outputs two types of report: a report describing the PDF document and its metadata and a Validation Report describing conformance to PDF/A flavours.<br />
<br />
<br />
'''veraPDF Metadata Fixer'''<br />
<br />
The Metadata Fixer makes a limited set of fixes to metadata within PDF documents, such as removal of the PDF/A flag in the case of a non-conforming document, or the repair of broken XMP metadata, if bad XMP is the only error preventing a legitimate PDF/A flag. The Metadata Fixer produces a fixed version of the original document and a Metadata Fixing Report, which describes the fixes attempted, and their success or failure.<br />
<br />
<br />
'''veraPDF Policy Checker'''<br />
<br />
The Policy Checker parses and analyzes a PDF Features Report and generates a Policy Report stating whether the PDF document complies with institutional policy as expressed in a Policy Profile. Note that the Policy Checker can be used to check for almost any quality in a PDF; for example, the use of annotations, irrespective of PDF/A.<br />
<br />
<br />
'''veraPDF Reporter'''<br />
<br />
The Reporter transforms veraPDF’s machine-readable reports, as generated by the Implementation Checker, Policy Checker, and Metadata Fixer, into other forms for downstream use.<br />
<br />
<br />
'''veraPDF Shell'''<br />
<br />
The Shell manages veraPDF’s other components and ensures that they interact in a coordinated sequence of actions. Users interact with the Shell through the Command Line Interface (CLI), Desktop Graphical User Interface, or Web Graphical User Interface.<br />
<br />
<br />
====Open Licensing====<br />
veraPDF is open source software dual licensed for sustainability and reuse in accordance with PREFORMA’s requirements. Other project outputs such as test corpora and documentation are issued under a Creative Commons license.<br />
<br />
<br />
'''[https://www.mozilla.org/en-US/MPL/2.0/ MPL v2+]'''<br />
<br />
The Mozilla Public License v2+ allows covered source code to be mixed with other files under a different, even proprietary license. Code licensed under the MPL must remain under the MPL, and freely available in source form.<br />
<br />
<br />
'''[http://www.gnu.org/licenses/quick-guide-gplv3.html GPL v3+]'''<br />
<br />
The GNU General Public License v3 guarantees users the freedom to run, study, share (copy), and modify the software. The copyleft quality of the GPLv3 requires those rights to be retained.<br />
<br />
<br />
'''[https://creativecommons.org/licenses/by/4.0/ CC-BY-4.0]'''<br />
<br />
The Creative Commons Attribution 4.0 license is a public copyright license that enables the free distribution of an otherwise copyrighted work.<br />
<br />
" <br />
<br />
== User Experiences ==<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
<br />
== Development Activity ==<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
<!-- Add the OpenHub.com ID for the tool, if known. --><br />
All development activity is visible on GitHub: https://github.com/verapdf<br />
<br />
{{Infobox_tool_details<br />
|releases_rss=http://verapdf.org/feed/<br />
|issues_rss=<br />
|mailing_lists=<br />
|ohloh_id=<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=File:VeraPDF.png&diff=2870File:VeraPDF.png2016-04-26T06:51:28Z<p>Chlara: </p>
<hr />
<div></div>Chlarahttps://coptr.digipres.org/index.php?title=Archivematica&diff=2866Archivematica2016-04-07T11:37:17Z<p>Chlara: removed License and Development Activity from Description</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system<br />
|image=<br />
|homepage=https://www.archivematica.org<br />
|license=[http://www.gnu.org/licenses/agpl.html AGPL version 3]<br />
|platforms=<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Preservation System]]<br />
[[Category:File Format Migration]]<br />
<br />
= Description =<br />
Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system, ingesting them into archival storage, providing access to the archived material, and uploading access copies to an access system. The process is monitored and controlled through a Web-based dashboard that co-ordinates a suite of micro-services. Its primary preservation technique is normalisation, combined with preservation of the original object and comprehensive PREMIS metadata in METS.xml.<br />
====Provider====<br />
This project is managed by [http://artefactual.com/archivematica.html Artefactual Systems]. It began in collaboration with the UNESCO Memory of the World&#39;s [http://goo.gl/tlRHbk Subcommittee on Technology] and the [http://vancouver.ca/ctyclerk/archives/ City of Vancouver Archives], but continues active development along with its partners at the [http://diginit.library.ubc.ca/ University of British Columbia Library], the [http://rockarch.org/ Rockefeller Archive Center], [http://www.sfu.ca/archives/ Simon Fraser University Archives and Records Management], [http://library.bentley.edu/ Bentley Historical Library] and a number of other collaborators.<br />
<br />
====Platform and interoperability====<br />
Archivematica may be installed directly on a Linux system, and specifically targets Long Term Support releases of the Xubuntu operating system.<br />
Support is included for using Archivematica as a preservation backend for DSpace, a front end for CONTENTdm access, a front end for LOCKSS storage, and a front end for access using [https://www.accesstomemory.org/en/ AtoM], which comes bundled with the software.<br />
====Functional notes====<br />
Archivematica uses a micro-services approach, which means it acts as a wrapper for many task-specific applications such as the BagIt library, Clam Anti-Virus, DigiKam, FFmpeg, FITS (File Information Tool Set), ImageMagick, Inkscape, OpenOffice.org, and 7-Zip.<br />
The typical workflow is for the curator to assemble a transfer package in the filesystem: a script is provided for setting up the right folder structure or the structure can be assembled manually for some workflows, then digital objects are added to one folder and contextual information (submission documentation in the form of e.g. transfer forms, donation agreements) to another. The package is moved to an input folder &#39;watched&#39; by the main Archivematica Web tool. Through the Web interface, the curator can decide to accept or reject the transfer. If the transfer is accepted, the tool performs an initial analysis &ndash; calculating checksums, assigning UUIDs, scanning for viruses, identifying formats, extracting metadata &ndash; and then offers to create a Submission Information Package (SIP); it is also possible to create one or more SIPs manually. Metadata (simple Dublin Core and PREMIS 2.2 rights/restrictions) can then be added to the SIP before it is ingested. At ingest, the curator can choose various routes such as Preservation (where the digital objects are normalised to archival formats and transformed into an Archival Information Package, or AIP), Access (where the digital objects are normalised to dissemination formats and transformed into a Dissemination Information Package, or DIP), repackaging without normalisation, or many combinations of the aforementioned. Further functions are provided for moving AIPs into archival storage and uploading DIPs to AtoM or another access portal. Workflows and decision points are configurable via preconfiguration settings in the administration tab of the web-based dashboard.<br />
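The initial analysis step described above (checksums and UUIDs per file) can be illustrated with a minimal Python sketch. This is not Archivematica's own code: the function name <code>analyse_transfer</code>, the record layout and the choice of SHA-256 are assumptions made purely for illustration.<br />

```python
import hashlib
import uuid
from pathlib import Path


def analyse_transfer(transfer_dir):
    """Return one record per file: relative path, fresh UUID, SHA-256 checksum.

    Hypothetical helper for illustration; not part of Archivematica.
    """
    root = Path(transfer_dir)
    records = []
    # Sort so the output order is stable between runs.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in chunks so large objects do not have to fit in memory.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        records.append({
            "path": path.relative_to(root).as_posix(),
            "uuid": str(uuid.uuid4()),
            "sha256": digest.hexdigest(),
        })
    return records
```

Calling <code>analyse_transfer("/path/to/transfer")</code> would return a list of dictionaries that could then feed later steps such as virus scanning or format identification.<br />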
====Documentation and user support====<br />
The online [https://www.archivematica.org/en/ documentation] for Archivematica includes a User and an Administrative Manual; the [https://wiki.archivematica.org/Main_Page project wiki] provides screencasts, requirements specifications (including use cases, activity diagrams, recognised significant properties of various media and media preservation plans) and a description of the technical architecture.<br />
Community support is available through the [http://groups.google.com/group/archivematica Archivematica Discussion Group]. Artefactual Systems, Inc., the primary developer of Archivematica, also offers [https://www.artefactual.com/services/ support options].<br />
<br />
====Usability====<br />
The majority of operations are accomplished through a simple Web-based graphical user interface. Reports on the ease of installation and the robustness of the system are mixed but improving; see for example the experiences of [http://larchivista.blogspot.co.uk/2011/04/installing-archivematica.html Bonnie Weddle] and [http://e-records.chrisprom.com/evaluating-open-source-digital-preservation-systems-a-case-study-2/ Angela Jordan] with version 0.7, and [http://digital-archiving.blogspot.co.uk/2012/09/installing-archivematica-and-running.html Jenny Mitcham] with version 0.9.<br />
====Expertise required====<br />
The system is easy to use, though as it draws heavily on the [http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/using-oais-curation OAIS Reference Model] some familiarity with that model is needed to understand the workflows Archivematica supports. When installing directly on a Linux desktop or server, even if it is deployed in a virtual machine, a little technical expertise is required (e.g. for setting up ports correctly).<br />
<br />
====Standards compliance====<br />
The functionality of Archivematica is clearly based on that defined by the OAIS Reference Model. The Archival Information Packages generated by the system use the BagIt packaging format, in conjunction with a METS packaging manifest incorporating PREMIS metadata.<br />
====Influence and take-up====<br />
Archivematica is used by at least [https://www.archivematica.org/wiki/Community 30 organisations].<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/artefactual/archivematica/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/artefactual/archivematica/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/artefactual/archivematica/commits/stable/1.4.x.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=Archivematica<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=NetarchiveSuite&diff=2865NetarchiveSuite2016-04-07T11:33:00Z<p>Chlara: removed License and Development Activity from Description</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=NetarchiveSuite is a web archiving software package designed to plan, schedule and run web harvests of parts of the Internet.<br />
|image=NAS.gif<br />
|homepage=http://netarchive.dk/suite/Welcome<br />
|license=[http://www.gnu.org/licenses/lgpl.html#translations GNU Lesser General Public License]<br />
|platforms=Linux<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Web Crawl]]<br />
[[Category:Web]]<br />
<br />
<br />
= Description =<br />
The [https://sbforge.org/display/NAS/NetarchiveSuite NetarchiveSuite] is a web archiving software package designed to plan, schedule and run web harvests of parts of the Internet. It also serves as an archiving platform for the collected materials, storing and performing consistency checks of the data.<br />
<br />
====Provider====<br />
The Royal Library of Denmark and the State and University Library of Denmark<br />
====Platform and interoperability====<br />
Netarchive requires a computer running a Linux operating system with Sun Java 1.6, as well as Java Messaging Service. The software uses Heritrix as its web crawler.<br />
====Functional notes====<br />
The NetarchiveSuite is split into four main modules: three modules corresponding to processes of harvesting, archiving and accessing materials, and one module to coordinate functions. The Harvester module can organise both snapshot and recurring harvests; it supports packaging metadata about the harvest together with the harvested data. The Archive module advertises bit consistency checks and the ability to support distributed batch jobs; it also supports storage. The Access Module uses a proxy solution to give access to the material. As the software uses Heritrix for its crawls, the materials collected take the form of arc files (not to be confused with ARC files).<br />
====Documentation and user support====<br />
Netarchive&rsquo;s website includes extensive [https://sbforge.org/display/NAS/Documentation documentation], including an Overview and Quick Start Manual. Detailed guidance is found in the Configuration, Installation, and User Manuals. Developer guidance is found in the System Design and Additional Tools Manual. The website also points to a new Wiki that as of writing is in some places incomplete, and in general rather difficult to navigate. The project supports four mailing lists, all of which are currently active: -announce; -curator; -devel; and -users. The site also includes a contact page with email addresses for individuals at KB, SB, BnF, and ONB.<br />
====Usability====<br />
The Suite advertises its ability to be used by librarians rather than systems administrators; its Quickstart installation option is designed to take an hour, and the GUI for ongoing use is extremely simple.<br />
====Expertise required====<br />
With any web archiving project, deep understanding of the project&rsquo;s scope and collections policy is essential in order to set up appropriate targets.<br />
====Standards compliance====<br />
No standards compliance is obviously advertised.<br />
====Influence and take-up====<br />
The Royal Library of Denmark and the State and University Library have used NetarchiveSuite to harvest the Danish world wide web since 2005. The software is also used by the Bibliothèque nationale de France, and Österreichische Nationalbibliothek.<br />
<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
Version 4.1 was released 27th May 2013. Version 4.2 should be released at the end of June 2013.<br />
The software appears to be an integral part of the Libraries&rsquo; ongoing web archiving effort, indicating continuing support. The website includes a roadmap for the software.<br />
<br />
All development activity is visible on GitHub: http://github.com/netarchivesuite/netarchivesuite/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/netarchivesuite/netarchivesuite/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/netarchivesuite/netarchivesuite/commits/master.atom</rss><br />
<br />
<br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|ohloh_id=NetarchiveSuite<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=WCT_(Web_Curator_Tool)&diff=2864WCT (Web Curator Tool)2016-04-07T11:28:45Z<p>Chlara: License information in Infobox & Development activity into Chapter 3</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=Web Curator Tool (WCT) is a workflow management application for selective web archiving.<br />
|image=wct-banner-500x100.jpg<br />
|homepage=http://webcurator.sourceforge.net/<br />
|license=[http://www.apache.org/licenses/LICENSE-2.0 Apache License 2.0]<br />
|platforms=Apache Tomcat<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Metadata Processing]]<br />
[[Category:Web]]<br />
[[Category:Web Crawl]]<br />
<br />
= Description =<br />
The [http://webcurator.sourceforge.net/ Web Curator Tool] (WCT) is a workflow management application for selective web archiving. WCT allows users to target websites that they wish to include in their collection, create and manage schedules to automatically harvest those sites, and package the collected files to easily submit them to a digital archive.<br />
====Provider====<br />
Developed by the National Library of New Zealand and the British Library, initiated by the International Internet Preservation Consortium. Currently maintained by Oakleigh Consulting Ltd.<br />
====Platform and interoperability====<br />
WCT was written in Java and designed to run in Apache Tomcat. It has been tested on Red Hat Linux, Solaris, and (to a lesser extent) Microsoft Windows. Three relational databases are officially supported: Oracle, MySQL and PostgreSQL. The software itself makes use of part or all of several other open-source components, including: Heritrix; Wayback; Acegi Security System; Apache Axis; Apache Commons Logging; Hibernate; Quartz; and Spring Application Framework.<br />
====Functional notes====<br />
An important functionality of the software is the ability to collect, store, and abide by harvest authorisations, i.e. permissions to download from the copyright holders. WCT also creates separate administrative levels, so that those who set up the harvests do not necessarily have the authority to have the system actively begin them. All material is captured in ARC format; since WCT incorporates Wayback, access within the system is not a problem. However, those who collect material with the ultimate goal of archiving it in a separate system must take the format into account.<br />
====Documentation and user support====<br />
The project site includes a well-written quick-start guide and [http://webcurator.sourceforge.net/docs/1.5.2/Web%20Curator%20Tool%20User%20Manual%20(WCT%201.5.2).pdf user manual], although the manual contains some headings with no corresponding text. The site also includes a developer guide, published with release version 1.5.2. Links to the advertised wiki and FAQ sections are currently broken, forwarding to the SourceForge developer page instead. More information about the project can be found in an informative [http://www.ariadne.ac.uk/issue50/beresford/ article] published in Ariadne Issue 50. The primary forum for technical support appears to be an active &ldquo;webcurator-users&rdquo; mailing list. While bugs continue to be posted on the SourceForge bug/feature request tracker, the last addressed item was in February 2011.<br />
====Usability====<br />
WCT is specifically designed to be operated by non-technical users such as librarians, with a simple and relatively intuitive GUI. Installation, however, is difficult and most likely requires tech support.<br />
====Expertise required====<br />
Setup, especially if it includes links to an archival repository, requires system administration knowledge. Users must have a comprehensive understanding of their institution&rsquo;s collections policies when designing harvests.<br />
====Standards compliance====<br />
WCT allows users to add basic Dublin Core metadata to the material.<br />
====Influence and take-up====<br />
WCT is used by the National Library of New Zealand, the National Library of Norway, and the British Library. The SourceForge site lists nearly 8,300 downloads as of December 2011.<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
No information is available on the current funding status for development, although the SourceForge site&rsquo;s bugtracker continues to list new entries and responses. WCT encourages developer participation, publishing a Developers Guide with the latest release. <br />
<br />
All development activity is visible on GitHub: http://github.com/DIA-NZ/webcurator/commits<br />
<br />
<br />
=== Release Feed ===<br />
Below are the last 3 releases:<br />
<rss max=3>https://github.com/DIA-NZ/webcurator/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/DIA-NZ/webcurator/commits/master.atom</rss><br />
<br />
<br />
{{Infobox_tool_details<br />
|releases_rss=<br />
|issues_rss=https://sourceforge.net/p/webcurator/legacy-bugs/feed.rss<br />
|mailing_lists=http://webcurator.sourceforge.net/mailinglists.shtml<br />
|ohloh_id=WCT (Web Curator Tool)<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=Metadata_Extraction_Tool&diff=2863Metadata Extraction Tool2016-04-07T11:25:17Z<p>Chlara: License information in Infobox</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=Metadata Extraction Tool automatically extracts a limited set of metadata from the headers of digital files.<br />
|image=<br />
|homepage=http://meta-extractor.sourceforge.net/<br />
|license=[http://www.apache.org/licenses/LICENSE-2.0.html Apache Public License (version 2)]<br />
|platforms=Must have Java installed and enabled<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:Metadata Extraction]]<br />
<br />
<br />
= Description =<br />
The [http://natlib.govt.nz/librarians/digital-library-tools Metadata Extraction Tool] automatically extracts a limited set of metadata from the headers of digital files; it has the capability to process both individual files and larger batches. The Tool outputs this information as XML, with the goal of facilitating transfer into a preservation metadata repository.<br />
<br />
====Provider====<br />
The National Library of New Zealand (NLNZ)<br />
<br />
====Platform and interoperability====<br />
The software uses Java and XML, and has been tested in Windows and Linux/Unix environments.<br />
<br />
====Functional notes====<br />
The Metadata Extraction Tool uses a library of &lsquo;adapters&rsquo; to extract metadata for specific file types. Adapters have been created for the following formats: BMP, GIF, JPEG and TIFF; MS Word, Word Perfect, Open Office, MS Works, MS Excel, MS PowerPoint, and PDF; WAV, MP3, BWF, and FLAC; HTML and XML; and ARC. If the file type is unknown, the Tool applies a generic adapter, which extracts a limited amount of baseline metadata.<br />
The application opens all files as read-only, ensuring the integrity of original files.<br />
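The adapter-with-fallback pattern described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Tool's actual (Java) implementation: the adapter functions, the <code>ADAPTERS</code> table and dispatch by file extension are all hypothetical, and the real Tool may identify file types differently.<br />

```python
import os


def extract_bmp(path):
    # Hypothetical format-specific adapter; a real one would parse BMP header fields.
    return {"adapter": "bmp", "size_bytes": os.path.getsize(path)}


def extract_generic(path):
    # Fallback adapter: only baseline metadata available for any file.
    return {
        "adapter": "generic",
        "filename": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
    }


# Assumed registry mapping lower-case extensions to adapters.
ADAPTERS = {".bmp": extract_bmp}


def extract_metadata(path):
    """Pick a specific adapter if one is registered, otherwise the generic one."""
    extension = os.path.splitext(path)[1].lower()
    adapter = ADAPTERS.get(extension, extract_generic)
    # The adapters above only stat the file; nothing is ever opened for writing.
    return adapter(path)
```

Registering a new format in this sketch means adding one entry to <code>ADAPTERS</code>; unknown types automatically fall through to the generic adapter.<br />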
<br />
====Documentation and user support====<br />
The Tool&rsquo;s [http://meta-extractor.sourceforge.net/ Sourceforge page] includes user and installation guides, as well as a developer guide.<br />
Users can report bugs through the Sourceforge site, which also lists a contact email.<br />
<br />
====Usability====<br />
The tool has both a GUI and command line interface.<br />
<br />
====Expertise required====<br />
Installation and configuration require solid knowledge of application design and technologies. Users should have comprehensive knowledge of metadata standards and formats, particularly regarding preservation metadata.<br />
<br />
====Standards compliance====<br />
The Metadata Extraction Tool currently outputs its XML files using the NLNZ preservation metadata schema; however, the software can be configured to support other schemas.<br />
<br />
====Influence and take-up====<br />
Sourceforge statistics show approximately 38,000 downloads since 2007.<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. --><br />
* '''FITS (File Information Tool Set):''' Used in [[FITS (File Information Tool Set)|FITS]]<br />
<br />
= Development Activity =<br />
Version 3.5GA was released in June 2010.<br />
The initial version of the tool was released in 2003; redevelopment for version 3 began in 2007. Contact information on the NLNZ site implies ongoing support; no information is available about ongoing development.<br />
<br />
All development activity is visible on http://sourceforge.net/projects/meta-extractor/<br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=meta-extractor<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=FITS_(File_Information_Tool_Set)&diff=2862FITS (File Information Tool Set)2016-04-07T11:22:32Z<p>Chlara: License information in Infobox</p>
<hr />
<div>{{Infobox_tool<br />
|purpose=FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.<br />
|image=<br />
|homepage=http://fitstool.org<br />
|license=[http://www.gnu.org/licenses/lgpl.html GNU Lesser General Public License]<br />
|platforms=Windows or Unix<br />
}}<br />
<br />
<!-- Delete the Categories that do not apply --><br />
[[Category:File Format Identification]]<br />
[[Category:Validation]]<br />
[[Category:Metadata Extraction]]<br />
[[Category:Encryption Detection]]<br />
<br />
= Description =<br />
[http://fitstool.org FITS] allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. It does this by incorporating a range of mostly third-party open source tools, normalising and consolidating their output.<br />
<br />
=== Provider ===<br />
Harvard Library<br />
<br />
=== Platform and interoperability ===<br />
FITS is written in Java and is compatible with Java 1.6 or higher. <br />
It uses six external tools: <br />
* [[JHOVE (Harvard Object Validation Environment)| JHOVE]]<br />
* [[ExifTool]]<br />
* [[Metadata Extraction Tool]]<br />
* [[DROID_(Digital_Record_Object_Identification)|DROID]]<br />
* [http://web.archive.org/web/20061106114156/http://schmidt.devlib.org/ffident/index.html FFIdent]<br />
* [http://unixhelp.ed.ac.uk/CGI/man-cgi?file File Utility]<br />
<br />
It also uses a few Harvard Library-created tools and many open source libraries.<br />
Instructions for command line use are given for Windows and Unix.<br />
<br />
=== Functional notes ===<br />
FITS acts as a wrapper, invoking and managing the output from several other open source tools. Output from these tools is converted into a common format, compared with one another and consolidated into a single XML output file. Technical metadata is only output (and included in the consolidation process) for tools that were able to identify the file. All other output is discarded.<br />
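The consolidation idea can be sketched in a few lines of Python. This is not FITS's actual algorithm or output schema: the dictionary layout, the <code>consolidate</code> function and the majority vote used to pick a format are assumptions chosen only to illustrate the "keep identifying tools, compare, merge" principle.<br />

```python
from collections import Counter


def consolidate(tool_outputs):
    """Merge per-tool results, keeping only tools that identified the file.

    Each input is assumed to look like:
    {"tool": "name", "format": "TIFF" or None, "metadata": {...}}
    """
    # Discard output from tools that could not identify the file.
    identified = [out for out in tool_outputs if out.get("format")]
    if not identified:
        return {"format": None, "conflict": False, "metadata": {}}

    # Assumption: the most frequently reported format wins.
    votes = Counter(out["format"] for out in identified)
    chosen_format, _ = votes.most_common(1)[0]

    # Keep every reported value with its source tool so disagreements stay visible.
    metadata = {}
    for out in identified:
        for key, value in out.get("metadata", {}).items():
            metadata.setdefault(key, []).append((out["tool"], value))

    return {
        "format": chosen_format,
        "conflict": len(votes) > 1,
        "metadata": metadata,
    }
```

Recording the source tool next to each value is a design choice of this sketch: it lets a curator see which tools agree before trusting a consolidated value.<br />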
<br />
=== Documentation and user support ===<br />
Documentation exists in the form of a user manual and more technical developer manual. <br />
The project actively uses the fits-users Google group, which has 30 members and is active as of January 2012.<br />
The FITS web site links to a [https://github.com/harvard-lts/fits GitHub site] that includes the source code and an issues tracker.<br />
<br />
=== Usability ===<br />
FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience.<br />
<br />
=== Expertise required ===<br />
Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.<br />
<br />
=== Standards compliance ===<br />
FITS outputs in XML format. A detailed description of the FITS-XML can be found [http://projects.iq.harvard.edu/fits/fits-xml here] and an analysis of the output data [http://projects.iq.harvard.edu/fits/understanding-output here].<br />
<br />
=== Influence and take-up ===<br />
The FITS website shows over 2000 downloads of the software. <br />
The tool was designed for and is in use at the Harvard Library [http://hul.harvard.edu/ois/systems/drs/ Digital Repository Service].<br />
<br />
= User Experiences =<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
FITS 0.2.0 was first released as open source in July 2009. As of April 2014 the latest release was version 0.8, released in January 2014. The tool was created to be used in Harvard's Digital Repository Service, and development is active and ongoing.<br />
<br />
All development activity is visible on GitHub: http://github.com/harvard-lts/fits/commits<br />
<br />
<br />
=== Activity Feed ===<br />
Below are the last 5 commits:<br />
<rss max=5>https://github.com/harvard-lts/fits/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=fits<br />
}}</div>Chlarahttps://coptr.digipres.org/index.php?title=KOST-Val&diff=2861KOST-Val2016-04-07T11:17:06Z<p>Chlara: Hyperlink to license information in Infobox</p>
<hr />
<div><!-- Use the structure provided in this template, do not change it! --><br />
<br />
{{Infobox_tool<br />
|purpose=KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).<br />
|image=KOST-Val.JPG<br />
|formats_in={{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}}, {{Format|JPEG}} and SIP<br />
|homepage=http://kost-ceco.ch/cms/index.php?kost_val_de<br />
|license=[http://www.gnu.org/licenses/quick-guide-gplv3.html GNU General Public License 3+]<br />
|platforms= should run under Java 1.6 on Windows<br />
}}<br />
<br />
<!-- Add one ore more categories to describe the function of the tool. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
<br />
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --><br />
[[Category:Validation]]<br />
[[Category:Quality Assurance]]<br />
[[Category:Database]]<br />
[[Category:Image]]<br />
[[Category:Document]]<br />
[[Category:Preservation System]]<br />
<br />
= Description =<br />
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --><br />
The KOST-Val application is used to validate {{Format|TIFF}}, {{Format|SIARD}}, {{Format|PDF}}/A, {{Format|JP2}} and {{Format|JPEG}} files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val supersedes the format validation tools [[SIARD-VAL]], [[TIFF-Val]] and SIP-Val by KOST-CECO.<br />
<br />
<br />
=== Functional Principle ===<br />
KOST-Val complies with the following requirements.<br />
<br />
* '''TIFF validation:''' KOST-Val reads a TIFF file and uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] to validate the structure and the content, and [[ExifTool|ExifTool]] to validate key properties such as compression, colour space, and multipage. These properties can be configured.<br />
* '''SIARD validation:''' KOST-Val reads a SIARD (eCH-0165 v1) file and validates the structure and the content.<br />
* '''PDF/A validation:''' KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]] by PDF-Tools or [[PDFTron PDF-A Manager|PDF/A Manager]] by PDFTron to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val ships with only a limited version of the 3-Heights™ PDF/A Validator by PDF-Tools. Module J extracts (with [[IText|iText]]) and validates the JPEG and JP2 images contained in the PDF file (depending on the configuration). It is also possible to configure whether JBIG2 compression is accepted.<br />
* '''JP2 validation:''' KOST-Val reads a JP2 file (ISO 15444) and uses [[Jpylyzer]] to validate the structure and the content. <br />
* '''JPEG validation:''' KOST-Val reads a JPEG file (ISO 10918-1) and uses [[Bad Peggy]] to validate the structure and the content. <br />
* '''SIP validation:''' KOST-Val reads a SIP (eCH-0160 v1 as well as Swiss Federal Archives SFA v1 and v4) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. At the outset, a file format validation is performed.<br />
<br />
The results (including information on inconsistencies and errors) are output for every step and written into a validation log.<br />
The validation steps are executed sequentially. Whenever possible the validation shall continue after an error has been detected in order to reduce the number of correction cycles. <br />
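<br />
The behaviour described above (run the steps in order, record every result, keep going after an error) can be sketched as follows. This is a minimal illustration only, not KOST-Val's actual code: the names <code>StepResult</code> and <code>run_steps</code> and the step functions are hypothetical.<br />
<br />
```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    """One entry of the (hypothetical) validation log."""
    step: str
    ok: bool
    message: str = ""


# A step receives the file path and returns an error message,
# or None when the check passed.
Step = Callable[[str], Optional[str]]


def run_steps(path: str, steps: List[Tuple[str, Step]]) -> List[StepResult]:
    """Run all steps sequentially and log every result.

    A failed step does not stop the run, so one pass reports as many
    problems as possible and fewer correction cycles are needed.
    """
    log: List[StepResult] = []
    for name, check in steps:
        try:
            error = check(path)
        except Exception as exc:
            # A crashing step is logged as a failure, then the run continues.
            error = f"step could not be completed: {exc}"
        log.append(StepResult(step=name, ok=error is None, message=error or ""))
    return log
```
<br />
Collecting results in a list rather than raising on the first failure is what allows the later steps to run; a real validator would still need to skip steps that depend on a failed one.<br />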
<br />
[[File:KOST-Val_FuntionalPrincipleFormatValidation.JPG|800px]]<br />
<br />
<br />
=== Third-party applications ===<br />
KOST-Val uses unmodified third-party components by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components. <br />
<br />
* The TIFF validation module uses [[JHOVE (Harvard Object Validation Environment)| JHOVE]] and [[ExifTool|ExifTool]] and evaluates their output further.<br />
* The PDF/A validation module uses either [[PDFTron PDF-A Manager|PDF-A Manager]] or [[3-Heights(TM) PDF Validator|3-Heights™ PDF/A Validator]].<br />
* The JP2 validation module uses [[Jpylyzer]] and translates the failed tests into appropriate error messages (DE/FR/EN).<br />
* The JPEG validation module uses [[Bad Peggy]] and evaluates the error message "Not a JPEG file" further.<br />
* To extract the JPEG and JP2 images from PDF/A the [[IText|iText library]] is used. <br />
* For the file format identification [[DROID_(Digital_Record_Object_Identification)|DROID]] is used. For performance and granularity reasons, a custom signature file is used instead of the official PRONOM signature file.<br />
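<br />
Signature-based identification, in general, compares the first bytes of a file against known byte patterns. The sketch below illustrates the idea for a few of the formats mentioned here. It is a simplified assumption-laden example: the signature table is hand-written and is not taken from DROID, PRONOM, or the KOST-Val signature file, and the function names <code>identify</code> and <code>is_fake_jpeg</code> are hypothetical.<br />
<br />
```python
from typing import Optional

# Leading byte signatures ("magic numbers"), hand-written for illustration.
# Real signature files are far more detailed (offsets, versions, variants).
SIGNATURES = [
    ("JPEG", b"\xff\xd8\xff"),                      # SOI marker plus next marker
    ("TIFF", b"II*\x00"),                           # little-endian TIFF
    ("TIFF", b"MM\x00*"),                           # big-endian TIFF
    ("PDF", b"%PDF-"),
    ("JP2", b"\x00\x00\x00\x0cjP  \r\n\x87\n"),     # JP2 signature box
]


def identify(header: bytes) -> Optional[str]:
    """Return the format name whose signature matches the header, else None."""
    for name, magic in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def is_fake_jpeg(filename: str, header: bytes) -> bool:
    """True if the name claims JPEG but the content does not start like one."""
    claims_jpeg = filename.lower().endswith((".jpg", ".jpeg"))
    return claims_jpeg and identify(header) != "JPEG"
```
<br />
A check of this kind only separates files that are not JPEGs at all from files that at least start like one; detecting defects inside a genuine JPEG requires a full validator.<br />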
<br />
<br />
=== Read Me & Download ===<br />
The KOST-Val application is used to validate TIFF, SIARD, PDF/A, JP2 and JPEG files as well as Submission Information Packages (SIPs).<br />
<br />
KOST-Val, Copyright (C) 2012-2015 Claire Roethlisberger (KOST-CECO), Christian Eugster, Olivier Debenath, Peter Schneider (Staatsarchiv Aargau), Markus Hahn (coderslagoon), Daniel Ludin (BEDAG AG)<br />
<br />
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see GPL-3.0_COPYING.txt for details.<br />
<br />
You can download KOST-Val from http://github.com/KOST-CECO/KOST-Val/releases. For installation instructions please check the [http://github.com/KOST-CECO/KOST-Val/releases manual (DE/FR/EN)].<br />
<br />
<br />
=== SIARD format ===<br />
SIARD stands for Software Independent Archiving of Relational Databases. The SIARD format was originally developed by the Swiss Federal Archives (SFA) as a sustainable solution for archiving relational databases. <br />
<br />
In early 2013 the SIARD format was adopted as an eCH standard (eCH-0165: SIARD format specification http://www.ech.ch/vechweb/page?p=dossier&documentNumber=eCH-0165).<br />
<br />
eCH is the Swiss organization for standardization in the field of e-government. eCH standards define guidelines for recurring applications and their results, for example format definitions or procedural standards. The aim of these standards is to unify and thus facilitate electronic collaboration between authorities as well as between authorities and organizations, educational and research institutions, firms and private organizations.<br />
<br />
<br />
=== Future ===<br />
See http://github.com/KOST-CECO/KOST-Val/issues <br />
<br />
<br />
=== Feedback & Issues ===<br />
Feedback about KOST-Val is very welcome at http://github.com/KOST-CECO/KOST-Val/issues or kost-val[at]kost-ceco.ch<br />
<br />
<br />
= User Experiences =<br />
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->* '''ZBW:'''<br />
** KOST-Val v1.6.0<br />
** The tool is very easy to install and to handle. <br />
** The output in XML format (open it in a browser to view it as a table) is easy to understand.<br />
** Running the JPEG module against almost 2,400 JPEGs took only 7 minutes.<br />
** The tool recognises both fake JPEGs (JPEG extension, but no JPEG content) and defective JPEGs, and clearly distinguishes between these two cases.<br />
<br />
<br />
= Development Activity =<br />
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --><br />
All development activity is visible on GitHub: http://github.com/KOST-CECO/KOST-Val/commits<br />
<br />
<br />
=== Release Feed ===<br />
The last 3 releases are listed below:<br />
<rss max=3>https://github.com/KOST-CECO/KOST-Val/releases.atom</rss><br />
<br />
<br />
=== Activity Feed ===<br />
The last 5 commits are listed below:<br />
<rss max=5>https://github.com/KOST-CECO/KOST-Val/commits/master.atom</rss><br />
<br />
{{Infobox_tool_details<br />
|ohloh_id=<br />
}}</div>Chlara