FITS (File Information Tool Set)

FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
License: GNU Lesser General Public License
Platforms: Windows or Unix


Description

FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. It does this by incorporating a range of mostly third-party open source tools, then normalising and consolidating their output.

Provider

Harvard Library

Platform and interoperability

FITS is written in Java and is compatible with Java 1.6 or higher. It uses six external tools, a few Harvard Library-created tools, and many open source libraries. Instructions for command line use are given for Windows and Unix.

Functional notes

FITS acts as a wrapper, invoking several other open source tools and managing their output. Output from these tools is converted into a common format, compared, and consolidated into a single XML output file. Only tools that were able to identify the file contribute technical metadata to the output and to the consolidation process; all other output is discarded.
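
The consolidation step described above can be illustrated with a minimal Python sketch. It is not the FITS implementation (FITS itself is written in Java); the `ToolOutput` structure, the `consolidate` function, and the tool names are hypothetical and serve only to show the idea of dropping output from tools that could not identify the file and merging the rest.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolOutput:
    """Normalised output from one wrapped tool (hypothetical structure)."""
    tool: str
    format: Optional[str]  # None means the tool could not identify the file
    metadata: dict = field(default_factory=dict)


def consolidate(outputs):
    """Merge outputs from tools that identified the file; discard the rest."""
    identified = [o for o in outputs if o.format is not None]
    formats = sorted({o.format for o in identified})
    merged = {}
    for out in identified:
        for key, value in out.metadata.items():
            # Record which tool reported each value so disagreements stay visible.
            merged.setdefault(key, {})[out.tool] = value
    return {
        "formats": formats,
        "conflict": len(formats) > 1,  # tools disagree on the format
        "metadata": merged,
    }


# Hypothetical outputs from three wrapped tools for the same file.
outputs = [
    ToolOutput("tool_a", "JPEG", {"imageWidth": 800}),
    ToolOutput("tool_b", "JPEG", {"imageWidth": 800, "colorSpace": "sRGB"}),
    ToolOutput("tool_c", None, {"note": "unrecognised"}),
]
result = consolidate(outputs)
print(result["formats"])
print(sorted(result["metadata"]))
```

In this sketch the output of `tool_c` never reaches the merged result, because it did not identify the file.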

Documentation and user support

Documentation exists in the form of a user manual and a more technical developer manual. The project uses the fits-users Google group, which has 30 members and was active as of January 2012. The FITS web site links to a GitHub site that includes the source code and an issue tracker.

Usability

FITS uses a command line interface; it is designed to be integrated into other software workflows, and so is aimed at those with application design experience.
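
As a rough illustration of such integration, the following Python sketch wraps a single FITS run in a function that a larger workflow could call. It assumes the `-i` (input) and `-o` (output) options described in the FITS user manual; the installation path and file names are hypothetical, and the helper names `build_fits_command` and `run_fits` are invented for this example.

```python
import subprocess
from pathlib import Path


def build_fits_command(fits_script, input_file, output_file):
    """Build the argument list for one FITS run (-i input, -o output)."""
    return [str(fits_script), "-i", str(input_file), "-o", str(output_file)]


def run_fits(fits_script, input_file, output_file):
    """Run FITS and raise CalledProcessError if it exits with a non-zero status."""
    cmd = build_fits_command(fits_script, input_file, output_file)
    subprocess.run(cmd, check=True)
    return Path(output_file)


# Hypothetical paths; adjust to the local installation (fits.bat on Windows).
cmd = build_fits_command("/opt/fits/fits.sh", "sample.tif", "sample.fits.xml")
print(" ".join(cmd))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.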

Expertise required

Installation and configuration require deep systems administration and application design knowledge, as well as familiarity with file format and metadata standards.

Standards compliance

FITS outputs in XML format. A detailed description of the FITS XML and an analysis of the output data are available separately.
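
To give a sense of how a downstream workflow might read this output, here is a minimal Python sketch using the standard library. The namespace URI, the element names (`identification`, `identity`, `tool`), and the attribute names are assumptions about the FITS XML layout and should be checked against real output; the sample document and its tool name are invented.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and element names; verify against real FITS output.
NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Invented sample document for illustration only.
SAMPLE = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="Portable Document Format" mimetype="application/pdf">
      <tool toolname="ExampleTool" toolversion="1.0"/>
    </identity>
  </identification>
</fits>
"""


def identities(xml_text):
    """Return (format, mimetype, [tool names]) for each identity element."""
    root = ET.fromstring(xml_text)
    found = []
    for identity in root.findall("fits:identification/fits:identity", NS):
        tools = [t.get("toolname") for t in identity.findall("fits:tool", NS)]
        found.append((identity.get("format"), identity.get("mimetype"), tools))
    return found


result = identities(SAMPLE)
print(result)
```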

Influence and take-up

The FITS website shows over 2000 downloads of the software. The tool was designed for and is in use at the Harvard Library Digital Repository Service.

User Experiences

  • DNB (German National Library):
    • FITS v0.6.1 with modified JHOVE 1.11, Gentoo Linux, Java environment, Tomcat application server
    • Since the end of 2012, DNB has used the FITS library as part of its risk management within the automated ingest process. At present, FITS examines more than 1500 files daily.
      • The purpose of the risk management and its implementation with metadata tools like FITS or JHOVE is to facilitate automatic technical quality checking (bitstream integrity and validation) of each digital publication. Furthermore, the analysis is aimed at recognising, at an early stage, technical restrictions such as DRM that hinder or even prevent long-term preservation and use of the digital objects.
      • The extracted technical metadata (the FITS output) are stored and managed in the metadata management of the DNB's long-term archive and are used for future long-term preservation measures such as format migration. Capturing these metadata is essential for carrying out targeted migration of files in endangered formats.
      • FITS also offers significant benefit in the form of easily configurable standardisation of the different tool outputs into the FITS format using XSLT. The DNB has used this function to adapt the FITS output to its own requirements, e.g. incorporating other metadata elements not included in the FITS distribution into the standardisation.
      • A further adjustment made by the DNB is the integration of a DNB tool to analyse files in ePub format.

Development Activity

FITS 0.2.0 was first released as open source in July 2009. As of April 2014 the latest release was version 0.8, released in January 2014. The tool was created to be used in Harvard's Digital Repository Service, and development is active and ongoing.

All development activity is visible on GitHub.

Release Feed

The three most recent releases:

  • 2017-08-16 20:03:27: 1.2.0, ZIP file processing; bug fixes (by daveneiman)
  • 2017-08-09 20:26:45: 1.2.0-RC2, Merge pull request #153 from daveneiman/assorted_release_fixes (by daveneiman)
  • 2017-07-05 20:36:21: 1.2.0-RC1, Merge pull request #148 from daveneiman/modify_format_tree (by daveneiman)

Activity Feed

The five most recent commits:

  • 2017-08-09 20:26:45: d20abbf56dc907f7f378df3391c8ecccf3b06936, Merge pull request #153 from daveneiman/assorted_release_fixes (by daveneiman)
  • 2017-08-09 20:20:37: 534a7cc0dea97b656868ff61f9af95750ea43e3a, Add Adobe Illustrator files to format tree; normalize mimetype; add t… (by daveneiman)
  • 2017-08-09 20:19:11: 98dbe4287fcfc1f0fbc789cba64021aebefbe197, update OTS JAR file (by daveneiman)
  • 2017-08-09 15:33:47: dc98aafed9188fb3a97cd7c5feded06ad0d5b505, NLNZ tool: don't identify file as EXIF if empty EXIF XML element. (by daveneiman)
  • 2017-08-08 21:12:08: 1ac8baa30c6dcecadaf29c563deeefbdc4722be6, Add m4a file extension to DROID exclusion list in fits.xml files. (by daveneiman)


Shein (24.4%), Yfriese (11.2%), Chlara (15.3%), Prwheatley (3.4%), Andrea Goethals (13.2%), COPTR Bot (32.4%)