Raster Image Printer

Optical Character Recognition (OCR) with Raster Image Printer

Optical Character Recognition (OCR) finds and recognizes text characters on scanned pages or images and extracts them as digital text. With this digital text, you can create searchable PDF files from PDF documents that contain scanned pages or images. When creating images, the OCR process saves the recognized text in your choice of hOCR, Text, and ALTO OCR file formats. You can also choose to save these files when creating PDF files.
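
As an illustration of what an hOCR file holds, the sketch below reads recognized words out of a small hOCR fragment using only the Python standard library. hOCR is an HTML-based format in which each word is typically an element with the class ocrx_word whose title attribute carries a bounding box (bbox) and a word confidence (x_wconf). The sample fragment, the class name HocrWordParser, and the function names are made up for this example, and it assumes that the hOCR files you save follow this common layout.

```python
from html.parser import HTMLParser

# Hypothetical hOCR fragment for illustration; real files are much larger.
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 600 800'>
  <span class='ocr_line' title='bbox 10 10 300 40'>
    <span class='ocrx_word' title='bbox 10 10 90 40; x_wconf 96'>Invoice</span>
    <span class='ocrx_word' title='bbox 100 10 180 40; x_wconf 91'>Number</span>
  </span>
</div>
"""


def parse_title(title):
    """Split an hOCR title such as 'bbox 10 10 90 40; x_wconf 96'."""
    props = {}
    for part in title.split(";"):
        fields = part.split()
        if fields:
            props[fields[0]] = fields[1:]
    bbox = tuple(int(v) for v in props.get("bbox", []))
    conf = float(props["x_wconf"][0]) if "x_wconf" in props else None
    return bbox, conf


class HocrWordParser(HTMLParser):
    """Collects (text, bbox, confidence) for every ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._pending = None  # (bbox, conf) of the word currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._pending = parse_title(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes a word element contains no nested <span> elements.
        if tag == "span" and self._pending is not None:
            bbox, conf = self._pending
            self.words.append(("".join(self._text).strip(), bbox, conf))
            self._pending = None


def extract_words(hocr_text):
    parser = HocrWordParser()
    parser.feed(hocr_text)
    return parser.words


if __name__ == "__main__":
    for text, bbox, conf in extract_words(SAMPLE_HOCR):
        print(text, bbox, conf)
```

The Text output format holds only the recognized characters, while hOCR and ALTO also record where each word sits on the page, which is why a position-aware format is the better choice when you need to map text back to the image.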

When recognizing text, the OCR engine has to know which languages to look for on the page. OCR works by analyzing the patterns, shapes, and curves of the text characters on the page and matching them to predefined information for different characters in each language. The engine assigns a confidence score to each language, and the language with the highest score is the one chosen.
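
The selection step described above can be pictured with a short sketch. This is not the OCR engine's actual code; the function name choose_language, the language codes, and the scores are hypothetical values chosen only to show the idea of picking the highest-scoring language from the ones the engine was told to look for.

```python
def choose_language(scores, enabled):
    """Return the enabled language with the highest confidence score.

    scores  -- mapping of language code to confidence (hypothetical values)
    enabled -- the languages the engine was told to look for
    """
    candidates = {lang: scores[lang] for lang in enabled if lang in scores}
    if not candidates:
        raise ValueError("no confidence score for any enabled language")
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Made-up scores for a page of mostly English text.
    page_scores = {"eng": 0.91, "fra": 0.64, "deu": 0.58}
    print(choose_language(page_scores, ["eng", "fra"]))
```

A language that is not in the enabled list is never considered, however well it might match, which is why the languages selected for OCR should cover the ones that actually appear in your documents.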

External factors such as image quality, the font used, and any background image on the page all affect the accuracy of the OCR results.

You can jump directly to a topic by selecting a link below, or you can refer to the table of contents.

Use OCR to Create Searchable PDF Files

Use OCR to Extract Text When Creating Images