Document
Tools for this content type
3-Heights(TM) PDF Validator | 3-Heights(TM) PDF Validator from PDF-Tools AG. | X | |||||||
ADIGRES | ADIGRES is a powerful cross-platform Document Management System written in Java. | X | X | X | |||||
Antiword | Antiword is a free MS Word reader for Linux and RISC OS. | X | |||||||
Apache PDFBox | Java PDF library for creation, manipulation, validation and content extraction of PDF documents | X | X | X | |||||
Apache POI - the Java API for Microsoft Documents | The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2). | X | X | X | |||||
ArchiFiltre | Overview of folder trees with fine diagrams | X | X | X | |||||
BorgFormat | A web application and service that combines multiple tools for format identification and validation. | X | |||||||
CSV Validator | Validation of CSV files against user-defined schema | X | |||||||
Calibre | An e-book management tool, including viewer, migration, and file conversion features among others. | X | |||||||
Catdoc & xls2csv | catdoc is a program that reads one or more Microsoft Word files and outputs text to standard output. | X | |||||||
Converseen | A GUI for ImageMagick supporting mass operations | X | |||||||
DFG Viewer | Browser-based viewer for digital objects | X | |||||||
Dependency Discovery Tool | The Dependency Discovery Tool searches through binary office files (.doc, .xls and .ppt) and tries to find any documents or files that are linked to the document. | X | |||||||
Developer Tools in QA: Novice's Toolkit | A collaborative document which non-developers can adapt to record QA methods using built-in browser developer tools. | X | X | ||||||
DiPS (Digital Preservation Solution) | DiPS (OAIS compliant Digital Preservation Solution) | X | X | X | X | X | X | X | |
DiscMaster | Website to browse and search vintage computer files from archive.org | X | |||||||
EpubCheck | Validator for EPUB files | X | X | ||||||
Exempi | Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK | X | X | ||||||
ExifTool | Properties extraction, identification, metadata editing | X | X | X | |||||
Filestar | Universal file converter for 900+ file types. | X | X | X | |||||
Flint | Validates a file against a policy, using common validation tools | X | |||||||
GImageReader | A customisable GUI for Tesseract | X | |||||||
IText | PDF library for manipulation, content extraction and creation | X | X | ||||||
KOST-Val | KOST-Val is an open source validator for different file formats and Submission Information Package (SIP). | X | X | ||||||
Kraken | Open-source turnkey OCR system forked from ocropus | X | |||||||
LegacyFileConverter | Converts document formats to RTF | X | |||||||
Library of Congress Newspaper Viewer | The Library of Congress Newspaper Viewer is a web application used to ingest and view digitized newspaper pages meeting the National Digital Newspaper Program specification. | X | |||||||
Libreoffice | An office suite with command line options for PDF/A conversions | X | |||||||
Libsafe | libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited to master image files from digitization processes. | X | |||||||
Limb Processing | Software for processing, enhancing and converting cultural heritage into digital cultural heritage | X | X | ||||||
LuraDocument PDF Compressor | LuraDocument PDF Compressor is a document conversion engine. | X | |||||||
METS Navigator | METS-based system for displaying and navigating sets of page images or other multi-part digital objects. | X | X | ||||||
MPP Viewer | MPP Viewer is a viewer for Microsoft Project files | X | X | ||||||
Metadata Interrogator | The Metadata Interrogator is a standalone, offline GUI tool for extracting and analysing metadata from a wide variety of file formats. | X | X | ||||||
Metadata++ | Freeware tool to view, edit, modify, extract, copy metadata of various formats. | X | X | ||||||
Metadata2Go | Web-based EXIF data viewer | X | X | ||||||
Nitro Pro | A PDF handling tool including PDF/A | X | |||||||
ODF Validator | ODF Validator is a tool that validates OpenDocument files and checks them for certain conformance criteria. | X | X | ||||||
Officeparser.py | officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications. | X | X | ||||||
PDF Tools (by Didier Stevens) | Tools for parsing and analysing PDF documents | X | X | ||||||
PDFTron PDF-A Manager | PDF/A Manager is a PDF/A (ISO 19005) validation and conversion software. | X | X | ||||||
PDFsam | PDFsam splits and merges PDF files | X | X | X | |||||
Pandoc | A universal converter that converts files from one markup format into another | X | |||||||
PdfaPilot | pdfaPilot: Conversion of documents and emails into robust, searchable PDF or PDF/A files | X | X | X | |||||
Pdfcpu | A Go library and command line tool for PDF processing, including validation | X | X | ||||||
Pdftk | PDF manipulation tool | X | X | X | |||||
Peepdf | peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not. | X | X | ||||||
Python XMP Toolkit | Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats | X | X | ||||||
Qpdf | QPDF is a command-line program that does structural, content-preserving transformations on PDF files | X | X | X | |||||
Rescarta | The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. | X | X | ||||||
Sumatra PDF | Open source, fast, PDF and eBook viewer | X | |||||||
Tabula | Extract tabular data from PDF files | X | |||||||
VeraPDF | PDF/A validation tool | X | |||||||
Veridian | Online search, discovery, and display of digitized newspaper collections | X | |||||||
WordHoard | WordHoard is an application for the close reading and scholarly analysis of deeply tagged texts. | X | X | X | |||||
Xpdf | Open source PDF viewer that includes PDF information extractor and font analyzer | X | X | X | |||||
Yara | YARA is a tool that allows the identification of files that match user-defined textual or binary patterns | X | X |