
Optical Character Recognition (OCR) with PEERNET

December 6, 2023/by Robert Massart

We here at PEERNET are thrilled to announce the addition of Optical Character Recognition (OCR) to our family of virtual printer drivers: TIFF Image Printer, Raster Image Printer, and PDF Image Printer.

Say hello to dynamic, searchable PDF files and editable text files from images.

Images traditionally presented a challenge when you needed to extract the text. With optical character recognition, our printers can transform images into searchable documents. Your printed and scanned documents can effortlessly become searchable PDF files or editable hOCR, text, and ALTO files you can search, organize, and integrate into your business workflow.

What is Optical Character Recognition (OCR)?

What is optical character recognition, and what exactly happens when you OCR a PDF or an image?

OCR examines the image using pattern recognition and artificial intelligence. It looks for patterns, shapes, and spacing that resemble text characters, lines, and paragraphs, then compares those shapes against predefined patterns for each language you have chosen to recognize. Each candidate match is assigned a confidence score, and the highest score determines which character, in which language, is the match.
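To make the scoring step concrete, here is a minimal Python sketch of choosing the best match for a single shape. It only illustrates the idea described above and is not how our OCR engine is implemented; the candidate list, the language codes, and the best_match function are all hypothetical.

```python
# Hypothetical candidate matches for one shape on the page:
# (language, character, confidence score from 0 to 100).
candidates = [
    ("eng", "O", 91.5),
    ("eng", "0", 88.0),
    ("fra", "Ô", 42.3),
]


def best_match(candidates):
    """Return the candidate with the highest confidence score."""
    return max(candidates, key=lambda candidate: candidate[2])


language, character, confidence = best_match(candidates)
print(f"Best match: {character!r} ({language}), confidence {confidence:.1f}")
```

Every extra language adds more candidates to compare for each shape, which is one reason recognition slows down as more languages are selected.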

As part of the recognition process, optical character recognition also identifies and stores information about the layout and formatting of the text on the image.

Using this information, we can create searchable PDF files. Or, if we are creating TIFF, JPEG, or other images, we can bundle the text and layout information with the corresponding image for document management systems to make images searchable.

Other uses for this information are in archives to convert printed material into a digital format, by researchers for data mining and text analysis, and by accessibility tools.

Searchable PDF Files Using Optical Character Recognition

A searchable PDF file is one where you can search for and locate text on the page. You can select the text on the page and, if allowed, copy and paste it into other documents.

Scanned PDF documents are often just images of each page wrapped as a PDF file. You cannot search or copy the text in these files. The easiest way to tell if a page is a scanned image is to try to select the text. If it is an image, you cannot select any text on the page, only an area of the page.

When you print a scanned PDF file to our printers, OCR will add the text on the pages as an invisible layer you can search and copy.

Optical character recognition is not just for PDF files. You can also print a series of images and append them together to create a single searchable PDF. The pictures inside a PDF can be made searchable too: printing a PDF that contains both text and pictures creates a new PDF where the original text and the text in the pictures are searchable.

Do you already have a scanned PDF you want to make searchable? Follow our quick steps in Convert PDF to Searchable PDF with OCR to see how easy it is.

Optical Character Recognition Can Extract Text From Images

When creating TIFF, PNG, JPEG, and other images using the TIFF Image Printer and Raster Image Printer, you can use the optical character recognition feature to extract the text into separate hOCR, text, and ALTO files. These files are created alongside the image because there is no way to embed the text information into an image the way you can with a PDF.

hOCR is a specially formatted XHTML file containing the text extracted from the page. It stores format and layout information, along with a score for how confident the OCR engine is in each match.
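As a rough illustration of what you can do with an hOCR file, the Python sketch below reads each word, its bounding box, and its confidence score. It assumes the common hOCR conventions of ocrx_word spans whose title attribute carries bbox and x_wconf values. The sample markup is handwritten for this example, and the exact elements in a file produced by our printers may differ.

```python
import re
import xml.etree.ElementTree as ET

# Handwritten sample in the hOCR style; a real file describes whole pages.
SAMPLE_HOCR = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="ocr_page" title="bbox 0 0 800 600">
      <span class="ocr_line" title="bbox 36 90 300 120">
        <span class="ocrx_word" title="bbox 36 92 110 116; x_wconf 96">Invoice</span>
        <span class="ocrx_word" title="bbox 120 92 180 116; x_wconf 71">N0.</span>
      </span>
    </div>
  </body>
</html>"""

XHTML = "{http://www.w3.org/1999/xhtml}"


def read_words(hocr_text):
    """Return a list of dicts holding the text, bbox and confidence of each word."""
    root = ET.fromstring(hocr_text)
    words = []
    for span in root.iter(XHTML + "span"):
        if span.get("class") != "ocrx_word":
            continue  # skip line-level spans, keep word-level ones
        title = span.get("title", "")
        bbox = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", title)
        conf = re.search(r"x_wconf (\d+)", title)
        words.append({
            "text": "".join(span.itertext()).strip(),
            "bbox": tuple(int(n) for n in bbox.groups()) if bbox else None,
            "confidence": int(conf.group(1)) if conf else None,
        })
    return words


for word in read_words(SAMPLE_HOCR):
    print(word["text"], word["bbox"], word["confidence"])
```

A document management system could use the confidence values to flag low-scoring words, such as the misread "N0." in the sample, for manual review.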

An ALTO file is similar to hOCR but stores the information using a different structure and specification.
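For comparison, here is a similar Python sketch for ALTO. It assumes the standard ALTO layout in which each TextLine element holds String elements with a CONTENT attribute. The sample is handwritten and simplified, and the function matches on local element names so that it does not depend on a particular ALTO schema version.

```python
import xml.etree.ElementTree as ET

# Handwritten, simplified sample in the ALTO style.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page WIDTH="800" HEIGHT="600">
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Invoice" HPOS="36" VPOS="92" WIDTH="74" HEIGHT="24" WC="0.96"/>
            <SP WIDTH="10"/>
            <String CONTENT="N0." HPOS="120" VPOS="92" WIDTH="60" HEIGHT="24" WC="0.71"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree adds to element tags."""
    return tag.rsplit("}", 1)[-1]


def read_alto_lines(alto_text):
    """Return the text of each TextLine, with its words joined by spaces."""
    root = ET.fromstring(alto_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        words = [child.get("CONTENT", "") for child in element
                 if local_name(child.tag) == "String"]
        lines.append(" ".join(words))
    return lines


print(read_alto_lines(SAMPLE_ALTO))
```

The same information is present as in hOCR, but position and confidence are stored as separate attributes (HPOS, VPOS, WIDTH, HEIGHT, WC) rather than packed into a single title string.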

Both formats contain human-readable text extracted from the image and work with different OCR tools and applications.

A text file, however, contains only the text extracted from the image. It does not attempt to mimic the layout of the text on the image. The result is plain, human-readable text that you can edit in any text editor.

This format is perfect when you need to copy the text from the image into another document. It also works well when text recognition is one step in a document archiving workflow.

Our tutorial, Create TIFF and Extract Text From Images Using OCR, shows how easy it is to extract text from images in a single step.

Recognizing Different Languages

Our optical character recognition uses one or more language datasets to match the shapes and curves on the page to text characters.

To start, we recognize and match shapes against the English language dataset. The installation also includes the French, Italian, German, Spanish, Arabic, Hebrew, and Hindi language datasets.

You control which languages to look for when performing optical character recognition. The best practice is to check only the languages you know are in your document. The more languages to test, the longer the process takes.

Need other languages? No problem. There are over 100 language datasets you can download and add. You’re sure to find the language you need.

Ready to Try It?

Adding optical character recognition to our virtual printers marks a significant milestone in our commitment to providing solutions to our users. We're excited to offer this feature to you, and we hope you will join us in exploring the possibilities OCR unlocks.

Download a trial of PDF Image Printer, Raster Image Printer, or TIFF Image Printer and try it out today! Already own one of our image printers? Log in to your online account to download the latest version and see what OCR can do.
