File Format Identification
Tools for this function
| Tool | Purpose | Tool Status |
|---|---|---|
| Apache Tika | Java-based tool for identifying file formats using signatures and extracting metadata and text content from documents. | |
| BorgFormat | A web application and service that combines multiple tools for format identification and validation. | |
| Cloc | Cloc (Count Lines of Code) counts lines of code and also guesses the programming language of each file, so it can be used to identify source files. It is an easy-to-use command-line tool. | |
| Crazy-fast-image-scan | A script to scan media very quickly to find out what kind of content it contains | |
| DROID (Digital Record Object Identification) | DROID (Digital Record Object Identification) is a software tool developed to perform automated batch identification of file formats. | |
| DUMPBIN Utility | The DUMPBIN utility, which is provided with the 32-bit version of Microsoft Visual C++, combines the abilities of the LINK, LIB, and EXEHDR utilities. | |
| DiPS (Digital Preservation Solution) | An OAIS-compliant digital preservation solution. | |
| DiskFormatID | Identify floppy disk formats from KryoFlux stream files | |
| Duke Data Accessioner | Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer. | |
| FFAStrans | Task automation engine, mostly used in audio-visual content management. | |
| FIDO (Format Identification for Digital Objects) | A PRONOM-based command-line file format identification tool written in Python | |
| FITS (File Information Tool Set) | FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. | |
| File Format Identification Pronom | Perl API to analyze and handle DROID (PRONOM) signatures | |
| Filestar | Universal file converter for 900+ file types. | |
| Fine Free File Command | The open source implementation of the file(1) command that ships with every free operating system (OpenBSD, Linux, NetBSD, FreeBSD, etc.). | |
| Fq | Tool, language and decoders for working with binary data. | |
| Gvfs-info | gvfs-info - print information about files and directories | |
| JHOVE (Harvard Object Validation Environment) | JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects. | |
| JHOVE2 | JHOVE2 allows data curators to characterise the digital objects in their repositories. | |
| JSONID | Identification of JSON, YAML, and TOML document types | |
| Libmagic-dev | This library can be used to classify files according to magic number tests. | |
| Libsharedmime | This is an implementation for libsharedmime. | |
| MediaConch | MediaConch is a file validation software. | |
| NARA File Analyzer and Metadata Harvester | NARA File Analyzer and Metadata Harvester allows a user to analyze the contents of a file system or external drive and generates statistics about the contents of the contained directories. | |
| Nanite | A friendly swarm of format-identifying robots | |
| Officeparser.py | officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications. | |
| Ohcount | Analyses plain text files, looking for code (scripting languages etc.) | |
| PRONOM Signature Development Utility | Output DROID compatible file format signature files using PRONOM syntax | |
| Puremagic | Puremagic is a cross-platform pure-Python module that identifies a file based on its magic numbers | |
| Siegfried | A PRONOM-based command-line file format identification tool using Aho-Corasick matching and no buffer limits. | |
| TrID File Identifier | TrID is a utility designed to identify file types from their binary signatures. | |
| Web Archive Discovery | Indexing and discovery tools for web archives. | |
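
Several of the tools listed above (for example libmagic, Puremagic, TrID, and the PRONOM-based tools) identify files by matching byte signatures, often called magic numbers. The following is a minimal sketch of that idea in Python, not the implementation used by any listed tool. The names `SIGNATURES`, `identify_bytes`, and `identify_file` are hypothetical, and the signature table is a small illustrative subset of well-known leading byte sequences.

```python
from typing import Optional

# Illustrative subset of well-known leading signatures (an assumption made
# for this sketch). Real tools rely on far larger signature databases.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a label for the first signature that prefixes `header`, else None."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def identify_file(path: str) -> Optional[str]:
    """Read the first 16 bytes of the file at `path` and match them against SIGNATURES."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Calling `identify_file` with a path returns a label, or `None` when no signature in the table matches. A leading signature alone cannot tell apart formats that share a container: many formats are packaged as ZIP and begin with the same bytes, which is one reason dedicated identification tools combine signatures with further checks.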