Optical Character Recognition, or OCR for short, searches for and recognizes text (characters) on scanned pages or images and extracts it as digital text. When creating your TIFF images, the OCR process saves the recognized text as your choice of hOCR, Text, and ALTO OCR files.
When recognizing text, the OCR engine has to know which languages to look for on the page. OCR works by analyzing the patterns, shapes, and curves of the text characters on the page and matching them to predefined information for different characters in each language. The engine assigns a confidence score to each language, and the language with the highest score is the one chosen.
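The selection step described above can be illustrated with a minimal Python sketch. It is not the code TIFF Image Printer uses; the function name pick_language and the sample scores are hypothetical and exist only to show the idea of choosing the language with the highest confidence score.

```python
def pick_language(scores: dict[str, float]) -> str:
    """Return the language with the highest confidence score.

    `scores` maps a language name to a confidence value; the values
    used below are made up for illustration only.
    """
    if not scores:
        raise ValueError("no language scores supplied")
    # max() with key=scores.get compares languages by their score.
    return max(scores, key=scores.get)


# Hypothetical confidence scores for a single page.
page_scores = {"English": 0.91, "French": 0.64, "German": 0.58}
chosen = pick_language(page_scores)
print(chosen)
```

A real OCR engine computes these scores from the character shapes it finds on the page; the sketch covers only the final comparison.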
Outside factors such as image quality, the font used, and any background image on the pages all affect the accuracy of the OCR results.
TIFF Image Printer includes the following languages. If you need additional languages, you can download them by following the instructions on the OCR settings tab.
•Arabic
•English
•French
•German
•Hebrew
•Hindi
•Italian
•Spanish
To learn more about the new OCR features, see the section Optical Character Recognition (OCR) with TIFF Image Printer.