Editing JHOVE2

{{Infobox tool
|purpose=JHOVE2 allows data curators to characterise the digital objects in their repositories.
|image=
|homepage=http://jhove2.org/
|license=JHOVE2 is made freely available under the terms of the BSD open source license for all project-developed code; some third-party libraries may be covered by other open source licences.
|formats_in=PREMIS (Preservation Metadata Implementation Strategies)
|platforms=
|function=Encryption Detection, File Format Identification, Metadata Extraction, Validation
}}
{{Infobox tool details
|ohloh_id=JHOVE2
}}
<!-- Delete the Categories that do not apply -->
[[Category:Validation]]
[[Category:Metadata Extraction]]
[[Category:File Format Identification]]
 
= Description =
[https://bitbucket.org/jhove2/main/wiki/Home JHOVE2] is a follow-on to the Harvard/JSTOR [[JHOVE (Harvard Object Validation Environment)|JHOVE]] project, with the similar purpose of allowing data curators to characterise the digital objects in their repositories. Characterisation comprises four elements: first, identifying the object's format; second, validating that the object conforms to its format's technical norms; third, extracting technical metadata from the object; and fourth, assessing whether the object should be accepted into a repository, based on policies set by the curator.
 
The software was designed to integrate with other applications, so that it can be incorporated easily into a repository's ingest workflow.
 
====Provider====
 
 
Developers wishing to rebuild JHOVE2 from the provided source will need a full Java SE 6 development kit and the Apache Maven project tool.
 
 
====Functional notes====
 
The JHOVE2 project came about as a response to perceived shortcomings in the [[JHOVE (Harvard Object Validation Environment)|JHOVE]] software. JHOVE2 separates identification from validation, allowing the software to identify objects even if they are not valid. This also provides the opportunity to use the [http://www.dcc.ac.uk/resources/external/pronom PRONOM] registry in signature-based identification via integration with [[DROID_(Digital_Record_Object_Identification)|DROID]], so that JHOVE2 can identify many more format types than those for which it has validation modules. Other improvements include the ability to characterise hierarchical digital objects such as directories, zip files and bit streams nested within files, and a design that allows easier integration with other applications.
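
The separation of identification from validation described above can be illustrated with a minimal Python sketch. This is not JHOVE2's code or API: the function names, the tiny signature table and the toy GZIP check are all assumptions made for illustration. It only shows the idea that an object can be identified by its signature even when it fails validation, or when no validation module exists for its format.

```python
# Illustrative only: a toy signature table, not the PRONOM registry.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF header
    (b"MM\x00*", "TIFF"),   # big-endian TIFF header
    (b"\x1f\x8b", "GZIP"),
]


def identify(data):
    """Return a format name based on leading signature bytes, or None."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


def validate_gzip(data):
    """Toy check: 10-byte header present and compression method 8 (deflate)."""
    return len(data) >= 10 and data[2] == 8


# Hypothetical registry of validation modules; formats that can be
# identified but have no entry here are simply not validated.
VALIDATORS = {"GZIP": validate_gzip}


def characterise(data):
    """Identify first, then validate only if a module is available.

    Returns (format, validity), where validity is True, False, or
    None when no validation module exists for the identified format.
    """
    fmt = identify(data)
    validator = VALIDATORS.get(fmt)
    validity = validator(data) if validator else None
    return fmt, validity
```

In this sketch a GZIP stream with a wrong compression-method byte is still reported as GZIP (with validity `False`), and a PNG file is identified although no validator is registered for it.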
 
JHOVE2 has validation modules for the following format types: ICC color profile; SGML; Shapefile; TIFF (including TIFF/EP, TIFF-FX, TIFF/IT, Exif, GeoTIFF, DNG and RFC 1314); UTF-8 encoded text; WAVE (including Broadcast Wave); XML; ZIP; GZIP; ARC; WARC; and arbitrary bytestreams, filesets and directories. Modules for JPEG 2000 (JP2 and JPX profiles) and PDF (including PDF/X and PDF/A) were planned but have not been implemented yet.
 
For comparison, ICC, SGML, Shapefile, ZIP, GZIP, ARC and WARC are newly supported in JHOVE2; however, JHOVE supports AIFF, GIF, JPEG, JPEG 2000 and PDF while JHOVE2 does not. HTML, which JHOVE supports, also has no dedicated module in JHOVE2, but since HTML can be expressed in terms of SGML or XML the functionality remains the same.
 
====Usability====
 
 
JHOVE2 does not include a GUI, which will be challenging for many users.
The default output (e.g. in XML, text or JSON) is very verbose and can run to 3,500 lines for a single TIFF file.
 
 
====Expertise required====
 
 
Installation requires solid knowledge of command line interfaces and experience with manually editing configuration files. Creation of the assessment policies requires detailed knowledge of digital preservation standards and technologies.
  
 
= User Experiences =
 
Please note that JHOVE2 cannot cope with spaces in the command line. JHOVE2 therefore has to be stored in a folder whose path can be typed without any spaces.
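
A quick way to check a candidate installation folder before unpacking JHOVE2 into it is sketched below in Python. The helper name and the example paths are hypothetical; the check simply reports whether any whitespace character appears in the path.

```python
def path_has_whitespace(path):
    """Return True if the given path contains any whitespace character."""
    return any(ch.isspace() for ch in str(path))


if __name__ == "__main__":
    # Example paths are placeholders, not required locations.
    for candidate in ("C:/Program Files/jhove2", "C:/tools/jhove2"):
        status = "contains spaces" if path_has_whitespace(candidate) else "ok"
        print(candidate, "->", status)
```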
 
  
Because the output is extremely wordy, it can be difficult to tell whether a given TIFF file is valid, so it may be helpful to configure the output options. This is possible in the SGML file, although editing that file may prove difficult for users who are not SGML experts.
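
As an alternative to reconfiguring the output, the XML report can be filtered after the fact. The Python sketch below is a hedged example, not part of JHOVE2: it assumes the report contains `feature` elements with a `name` attribute and a child `value` element, which should be checked against real output before use. It ignores XML namespaces and returns only the features whose name contains "valid".

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def validity_features(xml_text, pattern=r"valid"):
    """Return (name, value) pairs for features whose name matches pattern.

    Assumption: features look like <feature name="..."> with the value
    held in a child <value> element. Adjust to the actual report layout.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        if local_name(elem.tag) != "feature":
            continue
        name = elem.get("name", "")
        if not re.search(pattern, name, re.IGNORECASE):
            continue
        # Take the text of the first direct child named "value", if any.
        value = next(
            (child.text for child in elem if local_name(child.tag) == "value"),
            None,
        )
        hits.append((name, (value or "").strip()))
    return hits
```

Called on the text of a report file, the function reduces thousands of lines to the handful of entries that concern validity.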
 
  
 
= Development Activity =
 
 
=== Activity Feed ===
 
 
Link to any RSS feed that is updated when issue or code updates occur, if any, e.g.:
 
<rss max=7>http://bitbucket.org/jhove2/main/rss</rss>
 
  
 
=== Release Feed ===
=== About Formats ===
 
JHOVE2 only reads the format itself, not the specification as such.
 
