Document
Tools for this content type
| 3-Heights(TM) PDF Validator | 3-Heights(TM) PDF Validator from PDF-Tools AG. | X | |||||||
| ADIGRES | ADIGRES is a powerful cross-platform Document Management System written in Java. | X | X | X | |||||
| Antiword | Antiword is a free MS Word reader for Linux and RISC OS. | X | |||||||
| Apache PDFBox | Java library for creating, manipulating, validating, and extracting content from PDF documents | X | X | X | |||||
| Apache POI - the Java API for Microsoft Documents | The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2). | X | X | X | |||||
| ArchiFiltre | Overview of folder trees with fine diagrams | X | X | X | |||||
| BorgFormat | A web application and service that combines multiple tools for format identification and validation. | X | |||||||
| CSV Validator | Validation of CSV files against user-defined schema | X | |||||||
| Calibre | An e-book management tool, including viewer, migration, and file conversion features among others. | X | |||||||
| Catdoc & xls2csv | catdoc is a program that reads one or more Microsoft Word files and outputs text to standard output. | X | |||||||
| Converseen | A GUI for ImageMagick supporting mass operations | X | |||||||
| DFG Viewer | Browser-based viewer for digital objects | X | |||||||
| Dependency Discovery Tool | The Dependency Discovery Tool searches through binary office files (.doc, .xls and .ppt) and tries to find any documents or files that are linked to the document. | X | |||||||
| Developer Tools in QA: Novice's Toolkit | A collaborative document which non-developers can adapt to record QA methods using built-in browser developer tools. | X | X | ||||||
| DiPS (Digital Preservation Solution) | DiPS is an OAIS-compliant digital preservation solution. | X | X | X | X | X | X | X | |
| DiscMaster | Website to browse and search vintage computer files from archive.org | X | |||||||
| EpubCheck | Validator for EPUB files | X | X | ||||||
| Exempi | Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK | X | X | ||||||
| ExifTool | Properties extraction, identification, metadata editing | X | X | X | |||||
| Filestar | Universal file converter for 900+ file types. | X | X | X | |||||
| Flint | Validates a file against a policy, using common validation tools | X | |||||||
| GImageReader | A customisable GUI for Tesseract | X | |||||||
| IText | PDF library for manipulation, content extraction and creation | X | X | ||||||
| KOST-Val | KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP). | X | X | ||||||
| Kraken | Open Source turn-key OCR system forked from ocropus | X | |||||||
| LegacyFileConverter | Converts document formats to RTF | X | |||||||
| Library of Congress Newspaper Viewer | The Library of Congress Newspaper Viewer is a web application used to ingest and view digitized newspaper pages meeting the National Digital Newspaper Program specification. | X | |||||||
| Libreoffice | An office suite with command line options for PDF/A conversions | X | |||||||
| Libsafe | libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited to master image files from digitization processes. | X | |||||||
| Limb Processing | Software for processing, enhancing, and converting cultural heritage materials into digital form | X | X | ||||||
| LuraDocument PDF Compressor | LuraDocument PDF Compressor is a document conversion engine. | X | |||||||
| METS Navigator | METS-based system for displaying and navigating sets of page images or other multi-part digital objects. | X | X | ||||||
| MPP Viewer | MPP Viewer is a viewer for Microsoft Project files | X | X | ||||||
| Metadata Interrogator | The Metadata Interrogator is a standalone, offline GUI tool for extracting and analysing metadata from a wide variety of file formats. | X | X | ||||||
| Metadata++ | Freeware tool to view, edit, extract, and copy metadata in various formats. | X | X | ||||||
| Metadata2Go | Web-based EXIF data viewer | X | X | ||||||
| Nitro Pro | A PDF handling tool including PDF/A | X | |||||||
| ODF Validator | ODF Validator is a tool that validates OpenDocument files and checks them for certain conformance criteria. | X | X | ||||||
| Officeparser.py | officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications. | X | X | ||||||
| PDF Tools (by Didier Stevens) | Tools for parsing and analysing PDF documents | X | X | ||||||
| PDFTron PDF-A Manager | PDF/A Manager is a PDF/A (ISO 19005) validation and conversion software. | X | X | ||||||
| PDFsam | PDFsam splits and merges PDF files | X | X | X | |||||
| Pandoc | A universal converter that converts files from one markup format into another | X | |||||||
| PdfaPilot | pdfaPilot: Conversion of documents and emails into robust, searchable PDF or PDF/A files | X | X | X | |||||
| Pdfcpu | A Go library and command-line tool for PDF processing, including validation | X | X | ||||||
| Pdftk | PDF manipulation tool | X | X | X | |||||
| Peepdf | peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not. | X | X | ||||||
| Python XMP Toolkit | Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats | X | X | ||||||
| Qpdf | QPDF is a command-line program that does structural, content-preserving transformations on PDF files | X | X | X | |||||
| Rescarta | The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. | X | X | ||||||
| Sumatra PDF | Open source, fast, PDF and eBook viewer | X | |||||||
| Tabula | Extract tabular data from PDF files | X | |||||||
| VeraPDF | PDF/A validation tool | X | |||||||
| Veridian | Online search, discovery, and display of digitized newspaper collections | X | |||||||
| WordHoard | WordHoard is an application for the close reading and scholarly analysis of deeply tagged texts. | X | X | X | |||||
| Xpdf | Open source PDF viewer that includes PDF information extractor and font analyzer | X | X | X | |||||
| Yara | YARA is a tool that allows the identification of files that match user-defined textual or binary patterns | X | X |