Difference between revisions of "Docworks"
m |
Prwheatley (talk | contribs) |
||
(2 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | + | {{Infobox tool | |
− | + | |purpose=Document digitization workflow software | |
− | {{ | ||
− | |purpose=Document digitization workflow software | ||
|homepage=https://content-conversion.com/#docworks-2 | |homepage=https://content-conversion.com/#docworks-2 | ||
|license=Commercial | |license=Commercial | ||
|platforms=Windows | |platforms=Windows | ||
+ | |formats_in=TIFF, JP2, JPEG, GIF, PDF | ||
+ | |formats_out=ALTO (Analyzed Layout and Text Object), METS (Metadata Encoding and Transmission Standard), PDF, PDF/A, EPUB | ||
+ | |function=OCR, Quality Assurance, Workflow | ||
+ | |content=Image, Metadata | ||
}} | }} | ||
− | + | {{Infobox tool details}} | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
== Description == | == Description == | ||
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --> | <!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --> | ||
Line 38: | Line 25: | ||
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --> | <!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --> | ||
<!-- Add the OpenHub.com ID for the tool, if known. | <!-- Add the OpenHub.com ID for the tool, if known. | ||
− | + | --> | |
− | |||
− | |||
− | |||
− | |||
− |
Latest revision as of 11:26, 9 June 2021
Description
docWorks helps archives and content owners convert their print holdings into professional digital libraries. The process consists of two steps: digitization, i.e. scanning the printed pages, and conversion, i.e. recognizing the text, images, layout and structural information they contain.
docWorks is conversion software that covers all conversion steps in a single workflow. It provides layout analysis and offers multiple OCR engines to handle a wide range of publication types, languages and writing systems.
Supported import formats are TIFF, JPEG, JP2, GIF and PDF. Export formats are METS and ALTO XML, image files, PDF, PDF/A-1, full-text XML, RTF and EPUB. Supported metadata schemes are MIX, MARC, MODS, DC, and METS physical and logical structural maps.