Batch Convert Text to PDF with Page Size Formatting

PEERNET is pleased to add a new batch convert text to PDF feature to Document Conversion Service. Our new converter intelligently detects the paper size and the ASCII, Unicode, or other character set needed for each text document. This ensures that the content is perfectly scaled and presented on the page. For those who require a more tailored approach, there are options for custom paper sizes, fonts, and word wrapping, providing flexibility and control over converting text to PDF.

Who Needs to Batch Convert Text to PDF?

Text documents remain a widely used file format in the medical, legal, financial, and insurance fields. The need to batch convert these text documents to PDF is increasing as these industries transition to digital content and document management. This need will continue to grow as companies start incorporating AI and machine learning technologies for data and trend analysis and to facilitate administrative and other professional tasks.

Using the Text – Builtin Converter for Text Files

The new text converter, Text – Builtin, replaces our original technique, which used Word to convert text files. The Word-based approach works but lacks customization, and it can require manual steps to handle non-standard page sizes and wide-format text files.

When you update to the latest version of Document Conversion Service, this converter is already enabled and ready to handle your text files. To check this, use the DCS Dashboard to go to DCS Settings, then Edit DCS Configuration, and confirm that the Text – Builtin converter is set to Auto or On. If it was Off, change the setting and restart the Document Conversion Service to pick up the change.

Convert Text to PDF When the Text File Has Formfeeds

Text files have existed since the early days of computing and the advent of disk storage in the 1960s, and they are still an essential way to store data today. In the era of dot matrix line printers, the need to manage printed output and separate text content into pages led to the introduction of the ASCII form feed character.

This character, ASCII code 12 (0x0C in hexadecimal, shown as FF in text editors), tells the printer to move to the top of the next page and continue printing. Form feeds became an integral part of operating systems, programming languages, and text editors for controlling the layout of printed text. They remain in use today.

When you convert text to PDF, the form feed character in your text file signals when to start a new page. Now that we know when to start a page, the next step is determining the page size needed to display our text.
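To make the idea concrete, here is a minimal Python sketch of splitting a text document into pages at each form feed. The function name `split_pages` is hypothetical, and dropping an empty page after a trailing form feed is an assumption made for illustration, not a description of how the converter itself behaves.

```python
FORM_FEED = "\f"  # ASCII code 12 (0x0C)

def split_pages(text: str) -> list[str]:
    """Split text into pages, starting a new page at each form feed."""
    pages = text.split(FORM_FEED)
    # ASSUMPTION: a form feed at the very end of the file closes the last
    # page rather than opening a new, empty one.
    if len(pages) > 1 and pages[-1].strip() == "":
        pages.pop()
    return pages

sample = "Page one, line 1\nPage one, line 2\fPage two, line 1\f"
for number, page in enumerate(split_pages(sample), start=1):
    print(f"Page {number}: {len(page.splitlines())} line(s)")
```

A file with no form feeds comes back as a single page, which matches the single-page case described later in this article.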

Using the Courier New font at 10 points, we scan the text in the document to calculate the page size using the form feed as the end-of-page marker. With our calculated page size, we look for the closest paper size match in the Document Conversion Service printer’s forms list. We use the matching page size when creating the PDF file. When no match is found, the default action is to use a Letter-sized page (8.5in x 11in) in the PDF file.
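The following Python sketch illustrates one way such a calculation could work. It is not the converter's actual implementation. The 6pt character width follows from Courier New's advance of roughly 0.6 em at 10 points (12 characters per inch); the 12pt line spacing, the absence of margins, the three-entry forms list, and the "smallest form that fits" matching rule are all assumptions, and the names `measure` and `closest_form` are hypothetical.

```python
CHAR_WIDTH_PT = 6    # Courier New at 10pt: roughly 0.6 em per character
LINE_HEIGHT_PT = 12  # ASSUMPTION: 12pt line spacing (6 lines per inch)

# ASSUMPTION: a small stand-in for the printer's forms list, in points
# (1in = 72pt), given as (width, height).
FORMS = {
    "Letter": (612, 792),    # 8.5in x 11in
    "Legal": (612, 1008),    # 8.5in x 14in
    "Tabloid": (792, 1224),  # 11in x 17in
}

def measure(text: str) -> tuple[int, int]:
    """Return the (width, height) in points needed by the largest page."""
    pages = text.split("\f")  # the form feed marks the end of a page
    columns = max(
        (len(line) for page in pages for line in page.splitlines()),
        default=0,
    )
    rows = max(len(page.splitlines()) for page in pages)
    return columns * CHAR_WIDTH_PT, rows * LINE_HEIGHT_PT

def closest_form(width_pt: int, height_pt: int) -> str:
    """Pick the smallest form that fits; fall back to Letter if none does."""
    fitting = [
        (w * h, name)
        for name, (w, h) in FORMS.items()
        if w >= width_pt and h >= height_pt
    ]
    return min(fitting)[1] if fitting else "Letter"

report = "\n".join(["x" * 132] * 66)  # a wide, 132-column report page
print(closest_form(*measure(report)))
```

Working in whole points rather than fractional inches keeps the comparisons exact, which matters when a page fits a form with no room to spare.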

Convert Text to PDF For Text Without Formfeeds

When a text document has no form feeds, there is no indication of where a page break is supposed to be. In this case, the entire document is essentially a single page.

Using the Courier New font at 10 points, we scan the text in the document to calculate the closest matching page size in the Document Conversion Service printer’s forms list. Intelligent pre-sets ensure that this page size is never smaller than 8in x 8in. When there is no match, we split the text across Letter-sized (8.5in x 11in) pages.
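The two rules in this section, the 8in x 8in minimum and the fallback to Letter-sized pages, can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the 12pt line spacing and margin-free page are assumptions rather than documented converter behavior.

```python
LINE_HEIGHT_PT = 12      # ASSUMPTION: 12pt line spacing (6 lines per inch)
MIN_SIDE_PT = 8 * 72     # the 8in minimum for width and height, in points
LETTER_PT = (612, 792)   # 8.5in x 11in, in points

def clamp_to_minimum(width_pt: int, height_pt: int) -> tuple[int, int]:
    """Never let the calculated page size drop below 8in x 8in."""
    return max(width_pt, MIN_SIDE_PT), max(height_pt, MIN_SIDE_PT)

def split_onto_letter(text: str) -> list[list[str]]:
    """Fallback: break the lines of one long document into Letter pages."""
    lines = text.splitlines()
    # ASSUMPTION: no margins, so a Letter page holds 792 // 12 = 66 lines.
    per_page = LETTER_PT[1] // LINE_HEIGHT_PT
    return [lines[i:i + per_page] for i in range(0, len(lines), per_page)]

print(clamp_to_minimum(120, 60))
long_log = "\n".join(f"entry {n}" for n in range(150))
print([len(page) for page in split_onto_letter(long_log)])
```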

Setting the Text Page Size

The smart AutoDetect mode in our text converter means you don’t have to worry about the page size and cut-off text in your PDF files. Depending on the lines of text and how wide they are, we find a matching page size or default to a Letter-sized (8.5in x 11in) page.

In cases where you want to use a pre-defined page size, our UseSpecified mode allows you to set a custom page size. An additional mode, SizeToFit, uses the text and form feeds, if any, to calculate a page size. There is no attempt to match a standard paper size with this setting.

Changing the Font and Font Size

The font and font size used to render your text to PDF affect your page size. The default font used is Courier New, and the default size is 10 points (a nominal height of 10/72 of an inch, or about 1/7 of an inch). You can change both the font and the text size used.

Choose a fixed-width (monospaced) font for optimal results. Monospaced fonts use the same amount of space to display each character. Using the same space for each character is particularly useful for text documents containing tables and columns of data.
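Because every character in a monospaced font takes the same width, the number of characters that fit on a line is simple arithmetic. The Python sketch below shows how font size changes that number on a Letter-wide page. It assumes an advance width of 0.6 em per character, which is approximately right for Courier New but differs for other monospaced fonts, and it assumes no margins; the function name is hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: each character advances 0.6 em (approximately true for
# Courier New; other monospaced fonts use different ratios).
ADVANCE_EM = Fraction(3, 5)

def chars_per_line(page_width_pt: int, font_size_pt: int) -> int:
    """Whole characters that fit across the page, ignoring margins."""
    char_width_pt = ADVANCE_EM * font_size_pt
    return Fraction(page_width_pt) // char_width_pt

LETTER_WIDTH_PT = 612  # 8.5in x 72pt per inch
for size in (8, 10, 12):
    print(f"{size}pt: {chars_per_line(LETTER_WIDTH_PT, size)} characters per line")
```

Exact fractions are used instead of floating point so that a line that exactly fills the page is not miscounted by a rounding error.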

Word Wrapping

Word wrapping automatically shifts the remaining text to a new line after reaching the maximum number of characters per line. Three modes determine where the break in the text occurs when this shift happens.

  • None – Does not wrap lines. Any text past the maximum number of characters per line is truncated.
  • Plain – Wraps the text at the maximum number of characters per line. The break can occur in the middle of a word; there is no attempt to backtrack to the last space between words.
  • Word – Wraps the text at the calculated number of characters per line, backtracking when necessary to break at the last space between words.

Text documents that include form feeds default to the None mode, so long lines of text are truncated rather than wrapped. A text document without form feeds defaults to Word mode. You can customize this to suit the format of your text files.
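The three wrapping modes and their defaults can be sketched in Python as below. This is an illustrative reimplementation, not the converter's code. The function names are hypothetical, and two behaviors are assumptions: Word mode drops the space it breaks at, and it falls back to a mid-word break when a single word is longer than the line.

```python
def wrap_line(line: str, width: int, mode: str) -> list[str]:
    """Wrap one line of text to `width` characters using the given mode."""
    if mode == "None":
        # No wrapping: anything past the maximum width is truncated.
        return [line[:width]]
    if mode == "Plain":
        # Break exactly at the width, even in the middle of a word.
        return [line[i:i + width] for i in range(0, len(line), width)] or [""]
    if mode == "Word":
        wrapped, rest = [], line
        while len(rest) > width:
            # Backtrack to the last space at or before the break position.
            cut = rest.rfind(" ", 0, width + 1)
            if cut <= 0:
                # ASSUMPTION: no usable space, so break mid-word instead.
                wrapped.append(rest[:width])
                rest = rest[width:]
            else:
                # ASSUMPTION: the space at the break is dropped.
                wrapped.append(rest[:cut])
                rest = rest[cut + 1:]
        wrapped.append(rest)
        return wrapped
    raise ValueError(f"unknown wrap mode: {mode}")

def default_mode(text: str) -> str:
    """Defaults described above: None with form feeds, Word without."""
    return "None" if "\f" in text else "Word"

line = "convert text files to PDF"
for mode in ("None", "Plain", "Word"):
    print(mode, wrap_line(line, 10, mode))
```

With a width of 10, None keeps only `convert te`, Plain splits the line into `convert te`, `xt files t`, and `o PDF`, and Word produces `convert`, `text files`, and `to PDF`.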

Adding New Text File Extensions

Most text documents have the file extension TXT, but many other types of files are also just text files. Log files, Excel comma-delimited (CSV) files, XML files, HTML files, and programming source code are just a few types of files that are, in essence, text files. A good rule of thumb is that if Notepad can open the file and display readable text, it is a text file.

We’ve already mapped many common text file extensions to our Text – Builtin converter, and you can add as many more as you need.

To add a new text file extension mapping, go to the DCS Dashboard – DCS Settings – Edit File Extension Map. Scroll down to find the section of extensions for the Text – Builtin converter and add your new text file extension.

