Raster Image Printer

Optical Character Recognition (OCR)

Optical Character Recognition, or OCR for short, finds and recognizes text characters on scanned pages or images and extracts them as digital text. With this digital text, we can create searchable PDF files from documents containing scanned pages or images. When creating images, the OCR process saves the recognized text in your choice of hOCR, Text, or ALTO OCR files. You can also choose to save these files when creating PDF files.
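As an illustration of what a saved hOCR file makes possible, the following minimal sketch reads recognized words back out of hOCR markup using only the Python standard library. It assumes the file follows the common hOCR conventions, where each word is a span with class ocrx_word and a title attribute holding bbox and x_wconf properties; the exact markup written by Raster Image Printer may differ. The class name HocrWordParser and the sample markup are hypothetical and used only for this example.

```python
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, bounding box, confidence) for each hOCR word span.

    Assumes words are marked as <span class="ocrx_word" title="...">,
    as described in the hOCR specification.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word span currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            # The title holds ';'-separated properties, e.g.
            # "bbox 10 10 90 40; x_wconf 96"
            props = {}
            for part in (attrs.get("title") or "").split(";"):
                fields = part.split()
                if fields:
                    props[fields[0]] = fields[1:]
            self._current = {"text": "", "props": props}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            props = self._current["props"]
            bbox = tuple(int(v) for v in props.get("bbox", []))
            conf = float(props["x_wconf"][0]) if props.get("x_wconf") else None
            self.words.append((self._current["text"].strip(), bbox, conf))
            self._current = None


# Hypothetical hOCR fragment, for illustration only.
sample = (
    '<span class="ocr_line" title="bbox 10 10 200 40">'
    '<span class="ocrx_word" title="bbox 10 10 90 40; x_wconf 96">Hello</span> '
    '<span class="ocrx_word" title="bbox 100 10 200 40; x_wconf 88">world</span>'
    '</span>'
)

parser = HocrWordParser()
parser.feed(sample)
parser.close()

for text, bbox, conf in parser.words:
    print(text, bbox, conf)
```

Each entry pairs a recognized word with its position on the page and the confidence reported for it, which is the information a searchable PDF needs in order to place invisible text over the scanned image.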

When recognizing text, the OCR engine has to know which languages to look for on the page. OCR works by analyzing the patterns, shapes, and curves of the text characters on the page and matching them to predefined information for different characters in each language. The engine assigns a confidence score to each language and chooses the language with the highest score.
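The selection step described above can be sketched as follows. This is only an illustration of picking the highest-scoring language, not the OCR engine's actual code; the function name pick_language and the scores shown are hypothetical.

```python
def pick_language(scores):
    """Return the language whose confidence score is highest.

    `scores` maps a language name to a confidence value; the values
    used below are invented for the example.
    """
    if not scores:
        raise ValueError("no candidate languages were scored")
    return max(scores, key=scores.get)


# Hypothetical confidence scores for a single page.
page_scores = {"English": 0.91, "French": 0.64, "German": 0.58}

best_language = pick_language(page_scores)
print(best_language)
```

Limiting the candidate languages to the ones actually expected on the page leaves fewer scores to compare, which is why the engine needs to be told which languages to look for.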

External factors such as image quality, the font used, and any background image on the pages all affect the accuracy of the OCR results.

Raster Image Printer includes the following languages. If you need additional languages, you can download them by following the instructions on the OCR settings tab.

Arabic

English

French

German

Hebrew

Hindi

Italian

Spanish

To learn more about the new OCR features, see the section Optical Character Recognition (OCR) with Raster Image Printer.