Difference between revisions of "FITS (File Information Tool Set)"

From COPTR
(Trial import from script.)
 
(24 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{Infobox_tool
+
{{Infobox tool
 
|purpose=FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
 
|purpose=FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
|image=
+
|homepage=http://fitstool.org
|homepage=http://code.google.com/p/fits/
+
|license=[http://www.gnu.org/licenses/lgpl.html GNU Lesser General Public License]
|license=GNU Lesser General Public License
 
 
|platforms=Windows or Unix
 
|platforms=Windows or Unix
 +
|function=File Format Identification, Validation, Metadata Extraction, Encryption Detection
 
}}
 
}}
 +
{{Infobox tool details
 +
|ohloh_id=fits
 +
}}
 +
= Description =
 +
The File Information Tool Set ([http://fitstool.org FITS]) identifies, validates and extracts technical metadata for a wide range of file formats. It acts as a wrapper, invoking and managing the output from several other open source tools. Output from these tools are converted into a common format, compared to one another and consolidated into a single XML output file.
 +
 +
=== Provider ===
 +
Harvard Library
  
<!-- Delete the Categories that do not apply -->
+
=== Platform and interoperability ===
[[Category:File Format Identification]]
+
FITS is written in Java and is compatible with Java 1.7 or higher.
[[Category:Validation]]
+
The tools used in version 1.3.0 of FITS are:
[[Category:Metadata Extraction]]
+
* [[ADL Tool]]
 
+
* [[Apache Tika]]
 +
* [[DROID_(Digital_Record_Object_Identification)|DROID]]
 +
* [[ExifTool]]
 +
* [http://web.archive.org/web/20061106114156/http://schmidt.devlib.org/ffident/index.html FFIdent] (no longer maintained)
 +
* [http://unixhelp.ed.ac.uk/CGI/man-cgi?file File Utility] (windows port)
 +
* [[JHOVE (Harvard Object Validation Environment)| JHOVE]]
 +
* [[MediaInfo]]
 +
* [[Metadata_Extraction_Tool]]
 +
* [[OIS Audio Information]]
 +
* [[OIS File Information]]
 +
* [[OIS XML Information]]
  
= Description =
 
[http://code.google.com/p/fits/ FITS] allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. It does this by encorporating a range of mostly third-party open source tools, normalising and consolidating their output.
 
====Provider====
 
Harvard University Library Office for Information Systems
 
====Licensing and cost====
 
[http://www.gnu.org/licenses/gpl.html GNU Lesser GPL] &ndash; free.
 
====Development activity====
 
FITS 0.2.0 was released in October 2011.
 
The tool was created to be used in Harvard&rsquo;s Digital Repository Service, and so presumably development is active and ongoing.
 
====Platform and interoperability====
 
FITS is written in Java and is compatible with Java 1.6 or higher. It uses six external tools: [http://www.dcc.ac.uk/resources/external/jhove JHOVE], Exiftool, [http://www.dcc.ac.uk/resources/external/metadata-extraction-tool National Library of New Zealand Metadata Extractor], [http://www.dcc.ac.uk/resources/external/DROID DROID], FFIdent, and Windows File Utility.&nbsp;
 
 
Instructions for command line use are given for Windows and Unix.
 
Instructions for command line use are given for Windows and Unix.
====Functional notes====
+
 +
=== Functional notes ===
 
FITS acts as a wrapper, invoking and managing the output from several other open source tools. Output from these tools are converted into a common format, compared to one another and consolidated into a single XML output file. Technical metadata is only output (and a part of the consolidation process) for tools that were able to identify the file. All other output is discarded.
 
FITS acts as a wrapper, invoking and managing the output from several other open source tools. Output from these tools are converted into a common format, compared to one another and consolidated into a single XML output file. Technical metadata is only output (and a part of the consolidation process) for tools that were able to identify the file. All other output is discarded.
====Documentation and user support====
+
Documentation exists in the form of a project [http://code.google.com/p/fits/w/list wiki], including information for users, installers, and developers. &nbsp;The materials are written for an IT, rather than library/archive audience.&nbsp;
+
=== Documentation and user support ===
The project actively uses the fits-users google group has 30 members, and is active as of January 2012. &nbsp;The FITS wiki also includes an issues tracker.
+
Documentation exists in the form of a user manual and more technical developer manual.  
====Usability====
+
The project actively uses the fits-users google group has 78 (Sept. 24, 2018) members, and is active as of January 2012.  
FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience.
+
The FITS web site links to a [https://github.com/harvard-lts/fits github site] that includes the source code and an issues tracker.
====Expertise required====
+
 +
=== Usability ===
 +
FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience. In January 2016, version 1.0.0 of the FITS [https://projects.iq.harvard.edu/fits/downloads#fits-servlet web service] was released, giving the ability to deploy FITS as a service.
 +
 +
=== Expertise required ===
 
Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.
 
Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.
====Standards compliance====
+
FITS outputs in XML format.&nbsp;
+
=== Standards compliance ===
====Influence and take-up====
+
FITS outputs in XML format. A detailed description of the FITS-XML can be found [http://projects.iq.harvard.edu/fits/fits-xml here] and an analysis of the output data [http://projects.iq.harvard.edu/fits/understanding-output here].
The FITS website shows over 1500 downloads of the software. &nbsp;The tool was designed for and is in use at the Harvard Digital Repository Service.
+
 
 +
=== Influence and take-up ===
 +
The FITS website shows over 2000 downloads of the software.  
 +
The tool was designed for and is in use at the Harvard Library [http://hul.harvard.edu/ois/systems/drs/ Digital Repository Service].
  
 
= User Experiences =
 
= User Experiences =
 +
<!-- Add hotlinks to user experiences with the tool (eg. blog posts). These should illustrate the effectiveness (or otherwise) of the tool. Use a bullet list. -->
 +
* '''National Archives of the Netherlands:'''
 +
** How [http://openpreservation.org/blog/2017/05/17/preservation-impact-assessments-how-preservation-tools-support-naneths-connection-projects/ FITS] supports pre-ingest impact assessments at the National Archives of the Netherlands.
 +
* '''DNB (German National Library):'''
 +
** FITS v0.6.1 with modified [[JHOVE (Harvard Object Validation Environment)| Jhove 1.11]], Gentoo Linux, Java-Environment, Tomcat-Application Server
 +
** Since the end of 2012, DNB uses the FITS library as a part of its risk management within the automated ingest process. At present more than 1500 files are daily examined by FITS.
 +
*** The purpose of the risk management and its implementation with metadata tools like FITS or JHOVE is to facilitate automatic technical quality checking (bitstream integrity and validation) of each digital publications. Furthermore, the analysis is aimed at recognising technical restrictions such as DRM at an early stage, which hinder or even prevent the task of long-term preservation and use of the digital objects.
 +
*** The extracted technical metadata (the FITS output) are used further for future long-term preservation measures such as format migration and are stored and managed in the metadata management of the long-term archive of the DNB. The capture of these metadata is essential in order to execute targeted migration measures of files in endangered formats.
 +
*** FITS also offers significant benefit in the form of easily configurable standardisation of the different tool outputs into the FITS format using XSLT. The DNB has used this function to adapt the FITS output to its own requirements, e.g. incorporating other metadata elements not included in the FITS distribution into the standardisation.
 +
*** A further adjustment, which the DNB has made, is the integration of a DNB tool to analyse files in ePub format.
  
 +
* '''ZBW (German National Library of Economics):'''  [https://wiki.dnb.de/display/NESTOR/ZBW+user+experience+with+FITS Link to the user experience of the ZBW]
  
 
= Development Activity =
 
= Development Activity =
 
+
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. -->
{{Infobox_tool_details
+
FITS 0.2.0 was first released as open source in July 2009. As of April 2014 the latest release was version 0.8, released in January 2014. The tool was created to be used in Harvard's Digital Repository Service, and development is active and ongoing.
|ohloh_id=FITS
+
}}
+
All development activity is visible on GitHub: http://github.com/harvard-lts/fits/commits
 +
 +
 +
=== Release Feed ===
 +
Below the last 3 release feeds:
 +
<rss max=3>https://github.com/harvard-lts/fits/releases.atom</rss>
 +
 +
 +
=== Activity Feed ===
 +
Below the last 5 commits:
 +
<rss max=5>https://github.com/harvard-lts/fits/commits/master.atom</rss>

Latest revision as of 14:35, 22 April 2021



FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
Homepage: http://fitstool.org
License: GNU Lesser General Public License
Platforms: Windows or Unix
Function: File Format Identification, Validation, Metadata Extraction, Encryption Detection
Appears in COW: Workflow for ingesting digitized books into a digital archive



Description

The File Information Tool Set (FITS) identifies, validates, and extracts technical metadata for a wide range of file formats. It acts as a wrapper, invoking several other open source tools and managing their output. The output from these tools is converted into a common format, compared across tools, and consolidated into a single XML output file.

Provider

Harvard Library

Platform and interoperability

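FITS is written in Java and is compatible with Java 1.7 or higher. The tools used in version 1.3.0 of FITS are:

  • ADL Tool
  • Apache Tika
  • DROID
  • ExifTool
  • FFIdent (no longer maintained)
  • File Utility (windows port)
  • JHOVE
  • MediaInfo
  • Metadata Extraction Tool
  • OIS Audio Information
  • OIS File Information
  • OIS XML Information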

Instructions for command line use are given for Windows and Unix.

Functional notes

FITS acts as a wrapper, invoking several other open source tools and managing their output. The output from these tools is converted into a common format, compared across tools, and consolidated into a single XML output file. Technical metadata is output (and included in the consolidation process) only for tools that were able to identify the file. All other output is discarded.
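
The consolidation step described above can be illustrated with a minimal Python sketch. It is not the FITS implementation: the `ToolReport` class, the `consolidate` function, and the tool names are hypothetical, and the only behaviour taken from the description is that reports from tools that failed to identify the file are discarded before the remaining output is compared and merged.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToolReport:
    """Normalised output of one wrapped tool (hypothetical structure)."""
    tool: str
    format: Optional[str]  # None means the tool could not identify the file
    metadata: Dict[str, str] = field(default_factory=dict)


def consolidate(reports: List[ToolReport]) -> dict:
    """Merge reports, keeping only tools that identified the file."""
    identified = [r for r in reports if r.format is not None]

    # Count how many tools reported each format.
    formats = Counter(r.format for r in identified)

    # Record every metadata value together with the tool that reported it.
    merged: Dict[str, Dict[str, str]] = {}
    for report in identified:
        for key, value in report.metadata.items():
            merged.setdefault(key, {})[report.tool] = value

    # A key is in conflict when the tools disagree on its value.
    conflicts = sorted(
        key for key, by_tool in merged.items()
        if len(set(by_tool.values())) > 1
    )
    return {"formats": dict(formats), "metadata": merged, "conflicts": conflicts}
```

Keeping the reporting tool next to each value means that disagreements stay visible in the merged result instead of being silently resolved.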

Documentation and user support

Documentation exists in the form of a user manual and a more technical developer manual. The project uses the fits-users Google group, which had 78 members as of 24 September 2018 and was active as of January 2012. The FITS web site links to a GitHub site that includes the source code and an issue tracker.

Usability

FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience. In January 2016, version 1.0.0 of the FITS web service was released, giving the ability to deploy FITS as a service.
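
As a rough illustration of integrating the command line interface into another workflow, the Python sketch below builds and runs a FITS invocation. The script names `fits.sh` and `fits.bat` and the `-i` (input) and `-o` (output) options are assumptions to check against the FITS user manual for the installed version; the function names and the installation path are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_fits_command(fits_home: str, input_path: str, output_path: str,
                       windows: bool = False) -> List[str]:
    """Build the argument list for one FITS run.

    Assumes the launcher is fits.sh (Unix) or fits.bat (Windows) and
    that -i and -o select the input file and the XML output file.
    """
    script = "fits.bat" if windows else "fits.sh"
    return [str(Path(fits_home) / script), "-i", str(input_path),
            "-o", str(output_path)]


def run_fits(fits_home: str, input_path: str, output_path: str) -> None:
    """Run FITS and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_fits_command(fits_home, input_path, output_path),
                   check=True)
```

A caller would use something like `run_fits("/opt/fits", "scan.tif", "scan.fits.xml")`, where `/opt/fits` stands in for the local installation directory.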

Expertise required

Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.

Standards compliance

FITS outputs in XML format. A detailed description of the FITS XML is available at http://projects.iq.harvard.edu/fits/fits-xml, and an analysis of the output data at http://projects.iq.harvard.edu/fits/understanding-output.
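
Because the output is XML, it can be read with any standard XML library. The Python sketch below pulls the identification results out of a simplified sample document. The namespace URI and the `identification`, `identity`, and `tool` element names are assumptions based on the published FITS XML description and should be verified against real output; the sample content and the tool names in it are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed FITS output namespace; verify against a real output file.
NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Simplified, invented sample; real FITS output contains more sections.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="Portable Document Format" mimetype="application/pdf">
      <tool toolname="ExampleToolA" toolversion="1.0"/>
      <tool toolname="ExampleToolB" toolversion="2.0"/>
    </identity>
  </identification>
</fits>"""


def read_identities(xml_text: str) -> list:
    """Return one dict per identity: format, MIME type, reporting tools."""
    root = ET.fromstring(xml_text)
    identities = []
    for identity in root.findall("fits:identification/fits:identity", NS):
        identities.append({
            "format": identity.get("format"),
            "mimetype": identity.get("mimetype"),
            "tools": [tool.get("toolname")
                      for tool in identity.findall("fits:tool", NS)],
        })
    return identities
```

More than one identity in the result would indicate that the wrapped tools disagreed about the format of the file.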

Influence and take-up

The FITS website shows over 2000 downloads of the software. The tool was designed for and is in use at the Harvard Library Digital Repository Service.

User Experiences

  • National Archives of the Netherlands:
    • How FITS supports pre-ingest impact assessments at the National Archives of the Netherlands.
  • DNB (German National Library):
    • FITS v0.6.1 with modified JHOVE 1.11, Gentoo Linux, Java environment, Tomcat application server
    • Since the end of 2012, DNB has used the FITS library as part of its risk management within the automated ingest process. At present, more than 1500 files are examined by FITS each day.
      • The purpose of the risk management and its implementation with metadata tools like FITS or JHOVE is to facilitate automatic technical quality checking (bitstream integrity and validation) of each digital publication. Furthermore, the analysis aims to recognise technical restrictions such as DRM at an early stage, since these hinder or even prevent long-term preservation and use of the digital objects.
      • The extracted technical metadata (the FITS output) are used further for future long-term preservation measures such as format migration and are stored and managed in the metadata management of the long-term archive of the DNB. The capture of these metadata is essential in order to execute targeted migration measures of files in endangered formats.
      • FITS also offers significant benefit in the form of easily configurable standardisation of the different tool outputs into the FITS format using XSLT. The DNB has used this function to adapt the FITS output to its own requirements, e.g. incorporating other metadata elements not included in the FITS distribution into the standardisation.
      • A further adjustment made by the DNB is the integration of a DNB tool to analyse files in ePub format.
  • ZBW (German National Library of Economics): Link to the user experience of the ZBW (https://wiki.dnb.de/display/NESTOR/ZBW+user+experience+with+FITS)

Development Activity

FITS 0.2.0 was first released as open source in July 2009. As of April 2014 the latest release was version 0.8, released in January 2014. The tool was created to be used in Harvard's Digital Repository Service, and development is active and ongoing.

All development activity is visible on GitHub: http://github.com/harvard-lts/fits/commits


Release Feed

The three most recent releases:

  • 2022-05-10 21:38:32: Release 1.5.5, by daveneiman
  • 2022-05-03 15:49:49: Release 1.5.4, by daveneiman
  • 2022-05-02 15:02:02: Release 1.5.3, by daveneiman

