Metadata Extraction Tool


Metadata Extraction Tool automatically extracts a limited set of metadata from the headers of digital files.
License: Apache License (version 2)
Platforms: Any platform with Java installed and enabled


Description

The Metadata Extraction Tool automatically extracts a limited set of metadata from the headers of digital files; it has the capability to process both individual files and larger batches. The Tool outputs this information as XML, with the goal of facilitating transfer into a preservation metadata repository.

Provider

The National Library of New Zealand (NLNZ)

Platform and interoperability

The software uses Java and XML, and has been tested on Windows and Linux/Unix.

Functional notes

The Metadata Extraction Tool uses a library of ‘adapters’ to extract metadata for specific file types. Adapters have been created for the following formats: BMP, GIF, JPEG and TIFF; MS Word, WordPerfect, OpenOffice, MS Works, MS Excel, MS PowerPoint, and PDF; WAV, MP3, BWF, and FLAC; HTML and XML; and ARC. If the file type is unknown, the Tool applies a generic adapter, which extracts a limited amount of baseline metadata. The application opens all files as read-only, which protects the integrity of the original files.
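
The adapter approach described above can be illustrated with a minimal Java sketch. This is not the Tool's actual code or API: every name here (Adapter, GenericAdapter, WavAdapter, AdapterSketch, select, extensionOf) is hypothetical, and the sketch picks an adapter by file extension purely for brevity, which is an assumption rather than a description of how the Tool identifies formats. It shows only the general idea of trying format-specific adapters first, falling back to a generic one, and opening files read-only.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Metadata Extraction Tool.
class AdapterSketch {

    interface Adapter {
        boolean accepts(String extension);
        Map<String, String> extract(Path file) throws IOException;
    }

    // Fallback adapter: accepts anything and returns baseline metadata only.
    static class GenericAdapter implements Adapter {
        public boolean accepts(String extension) {
            return true;
        }

        public Map<String, String> extract(Path file) throws IOException {
            Map<String, String> metadata = new LinkedHashMap<>();
            // Open with READ only, so the original file cannot be modified here.
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                metadata.put("name", file.getFileName().toString());
                metadata.put("sizeBytes", Long.toString(channel.size()));
            }
            return metadata;
        }
    }

    // Illustrative format-specific adapter: baseline metadata plus a format label.
    static class WavAdapter extends GenericAdapter {
        @Override
        public boolean accepts(String extension) {
            return extension.equals("wav");
        }

        @Override
        public Map<String, String> extract(Path file) throws IOException {
            Map<String, String> metadata = super.extract(file);
            metadata.put("format", "WAV");
            return metadata;
        }
    }

    private static final List<Adapter> SPECIFIC_ADAPTERS = List.of(new WavAdapter());
    private static final Adapter GENERIC_ADAPTER = new GenericAdapter();

    // Lower-cased extension without the dot, or "" if the name has none.
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Try each format-specific adapter first; fall back to the generic one.
    static Adapter select(Path file) {
        String extension = extensionOf(file);
        for (Adapter adapter : SPECIFIC_ADAPTERS) {
            if (adapter.accepts(extension)) {
                return adapter;
            }
        }
        return GENERIC_ADAPTER;
    }

    public static void main(String[] args) throws IOException {
        // An unknown extension falls through to the generic adapter.
        Path sample = Files.createTempFile("sample", ".xyz");
        Files.write(sample, new byte[] {1, 2, 3});
        System.out.println(select(sample).extract(sample));
        Files.delete(sample);
    }
}
```

In a design like this, supporting a new format means adding one adapter to the list, and the fallback guarantees that every file yields at least baseline metadata.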

Documentation and user support

The Tool’s SourceForge page includes user and installation guides, as well as a developer guide. Users can report bugs through the SourceForge site, which also lists a contact email.

Usability

The Tool has both a GUI and a command-line interface.

Expertise required

Installation and configuration require solid knowledge of application design and technologies. Users should have comprehensive knowledge of metadata standards and formats, particularly regarding preservation metadata.

Standards compliance

The Metadata Extraction Tool currently outputs its XML files using the NLNZ preservation metadata schema; however, the software can be configured to support other schemas.

Influence and take-up

SourceForge statistics show approximately 38,000 downloads since 2007.

User Experiences

  • FITS (File Information Tool Set): the Metadata Extraction Tool is used within FITS.

Development Activity

Version 3.5GA was released in June 2010. The latest release, 3.6GA, dates from 2014.

The initial version of the tool was released in 2003; redevelopment for version 3 began in 2007. Contact information on the NLNZ site implies ongoing support; no information is available about ongoing development.

All development activity is visible on

