Kraken


Latest revision as of 14:54, 8 June 2021

Open Source turn-key OCR system forked from ocropus
License: Apache 2.0 License
Output Formats: ALTO (Analyzed Layout and Text Object)
Content type: Image, Document


kraken is a turn-key OCR system forked from ocropus. It is intended to rectify a number of issues while preserving (mostly) functional equivalence.

Main features:

  • Script detection and multi-script recognition support
  • Right-to-Left, BiDi, and Top-to-Bottom script support
  • ALTO, abbyyXML, and hOCR output
  • Word bounding boxes and character cuts
  • Public repository of model files
  • Lightweight model files
  • Variable recognition network architectures
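
To illustrate the ALTO output and word bounding box features listed above, here is a minimal sketch of how word boxes could be read from an ALTO file using only the Python standard library. The sample XML is a hand-written fragment, not actual kraken output, and the function name word_boxes is hypothetical; the only assumption about the format is that words are stored as String elements with CONTENT, HPOS, VPOS, WIDTH and HEIGHT attributes, as defined by the ALTO schema.

```python
import xml.etree.ElementTree as ET

# Hand-written ALTO fragment for illustration only (an assumption,
# not output produced by kraken).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Hello" HPOS="10" VPOS="20" WIDTH="50" HEIGHT="12"/>
            <SP/>
            <String CONTENT="world" HPOS="65" VPOS="20" WIDTH="48" HEIGHT="12"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def word_boxes(alto_xml):
    """Return (text, hpos, vpos, width, height) for every ALTO String element."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for elem in root.iter():
        # Strip the namespace so the sketch works across ALTO schema versions.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name != "String":
            continue
        boxes.append((
            elem.get("CONTENT", ""),
            float(elem.get("HPOS", 0)),
            float(elem.get("VPOS", 0)),
            float(elem.get("WIDTH", 0)),
            float(elem.get("HEIGHT", 0)),
        ))
    return boxes


if __name__ == "__main__":
    for text, x, y, w, h in word_boxes(SAMPLE_ALTO):
        print(f"{text}: x={x} y={y} w={w} h={h}")
```

Matching on the local element name rather than a fixed namespace keeps the sketch independent of the ALTO version a given file declares.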

All functionality not pertaining to OCR and its prerequisite steps has been removed; for example, error rate measurement is no longer included.

User Experiences

Development Activity

Commits: https://github.com/mittagessen/kraken/commits

Issues: https://github.com/mittagessen/kraken/issues