Difference between revisions of "Tesseract-ocr"

From COPTR
(Trial import from script.)
Line 13:

= Description =
- Quoted from the website: “The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A tiff reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images.”
+ Tesseract is probably the most accurate open source OCR engine available. Combined with the [http://leptonica.com/ Leptonica Image Processing Library] it can read a wide variety of image formats and convert them to text in over 60 languages. It was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google.

= User Experiences =

Revision as of 18:47, 3 October 2014

Open source OCR engine, accepting uncompressed TIFF files as input
Homepage: http://code.google.com/p/tesseract-ocr/
License: Apache 2.0 License, except tesseractTrainer.py, which is licensed under the GPL


Description

Tesseract is probably the most accurate open source OCR engine available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 60 languages. It was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google.
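
As a rough illustration of the image-to-text conversion described above, the sketch below calls the `tesseract` command-line program from Python. It assumes the `tesseract` binary is installed and on the PATH, and that language data for the requested code (here `eng`) is available. The helper names `build_command` and `ocr_image` and the file names are hypothetical, chosen only for this example.

```python
import subprocess
from pathlib import Path


def build_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run Tesseract on one image and return the recognised text.

    Tesseract writes its result to <output_base>.txt, which is read back here.
    Raises subprocess.CalledProcessError if the tesseract process fails.
    """
    subprocess.run(build_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


# Example call (hypothetical input file):
# text = ocr_image("page.tif", "page_out", lang="eng")
```

Building the command separately from running it keeps the argument list easy to inspect or log before any OCR is attempted.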

User Experiences

Applied in an AQuA Mashup that resulted in the Solution page: Compare OCR results of the same source material in different formats (TIFF, JP2)
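
One simple way to compare OCR output from two versions of the same source (for example, a TIFF and a JP2 rendition) is a character-level similarity score. The sketch below uses Python's standard `difflib` module. It is an assumption-laden illustration, not the method used in the AQuA Mashup, and the function names `normalise` and `ocr_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Collapse runs of whitespace so layout differences are ignored."""
    return " ".join(text.split())


def ocr_similarity(text_a, text_b):
    """Return a ratio in [0, 1]; 1.0 means the normalised texts are identical."""
    matcher = SequenceMatcher(None, normalise(text_a), normalise(text_b), autojunk=False)
    return matcher.ratio()


# Example call with two OCR results already loaded as strings:
# score = ocr_similarity(text_from_tiff, text_from_jp2)
```

A score close to 1.0 suggests the two formats produced nearly the same text, while a lower score points to pages worth inspecting by hand.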
