Batch Converting Thousands of Files
What if you had several thousand files that you needed to convert to TIFF or PDF?
How would you convert all of these files with Document Conversion Service?
The following steps show how to configure the Watch Folder service, which is included with Document Conversion Service, to safely convert a large volume of files and save them to a new location. Before you start converting your files, you need to modify the Watch Folder configuration file. By default, the Watch Folder service is configured as a drop folder, where groups of files are dropped periodically into the input folder to be processed. When files are detected, whether 1 file or 10,000 files, they are all moved to a staging folder to be processed. With an extremely large number of files, this can cause long delays while the files are moved, as well as other problems such as not having enough disk space to copy the files.
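Because disk space is one of the risks described above, it can help to compare the size of the input folder with the free space on the target drive before starting. The following is a minimal sketch, not part of Document Conversion Service; the function names and the `safety_factor` margin are hypothetical. It only estimates the room needed for a copy of the input files, since the size of the converted output cannot be predicted from the input.

```python
import os
import shutil


def folder_size_bytes(folder):
    """Return the total size in bytes of all files under folder, including subfolders."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def has_room_for_copy(input_folder, target_folder, safety_factor=1.2):
    """Return True if the drive holding target_folder can hold a copy of input_folder.

    safety_factor is a hypothetical margin, not a value from the product documentation.
    """
    needed = folder_size_bytes(input_folder) * safety_factor
    free = shutil.disk_usage(target_folder).free
    return free >= needed


if __name__ == "__main__":
    # Example paths taken from the sample configuration; substitute your own.
    src = r"C:\PEERNET\WatchFolders\LargeBatchTIFF\Input"
    dst = r"C:\PEERNET\WatchFolders\LargeBatchTIFF\Completed"
    if os.path.isdir(src) and os.path.isdir(dst):
        print("Enough room:", has_room_for_copy(src, dst))
```

Run the check once for each folder that will receive a full copy of the input files.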
Steps for Configuring Watch Folder Service
Open the Watch Folder configuration file by going to Start – All Programs – Document Conversion Service 3.0 – Watch Folder – Configure Watch Folder Settings. Scroll down to the “LargeBatchTIFF Watch Folder” section and make the following changes:
- Set the InputFolder to point to your copy of the folder of files to be converted.
- Set the OutputFolder to where you want to save the converted files. All of the converted files are written to this folder, so make sure this location has enough disk space.
- Set the FailedFolder to where you want to store any files that fail to convert.
- If you need the Watch Folder service to keep a copy of your input data, set the CompletedFolder location. Every file from the InputFolder is eventually copied here as it is converted, so make sure you have enough disk space to hold a complete copy of your input folder. If you do not need to keep your input files, set the CompletedFolder to an empty string.
- Set the StagingFolder and WorkingFolder locations – these folders hold the input files and output files temporarily while they are being converted.
- The settings Polling.MaxFilesToProcessAtATime and Polling.SynchronousFilePickup control how many files are picked up at each polling interval, and whether each batch of files must complete before the next group is picked up.
  - Set Polling.MaxFilesToProcessAtATime to the number of files you want the service to pick up each time. A value of 0 means no limit.
  - Set Polling.SynchronousFilePickup to true to have the service wait for the current mini-batch to complete before picking up new files.
- To turn off the use of the date-time subfolder for each batch of files, set UseTimeDateSubFoldersInCompletedFolder and UseTimeDateSubFoldersInFailedFolder to false.
- The next setting is optional, depending on your requirements: it controls whether the original file extension is part of the created file name. If you set <add Name="Save;Remove filename extension" Value="1"/> then the original file extension is not used to name the output file. This means that the output file from a file named SampleFile.xlsx would become SampleFile.tif. If this setting is not included, or is set to “0”, the output file name would be SampleFile.xlsx.tif.
- Under the “Output file options” comment in the section, enter the settings for the type of file you wish to create. For more information on the parameters and variables, see Conversion Settings in the User Guide. In our example, Watch Folder is set to create a 300 dpi, black and white, multipaged TIFF file.
- Save the Watch Folder configuration file.
- Start the Document Conversion Service and then the Watch Folder Service.
- The Watch Folder service will now start processing files from your input folder. Only the number of files you set for Polling.MaxFilesToProcessAtATime will be picked up at a time and then processed, until all files in the folder have been converted.
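The pickup behavior described in the last step can be pictured with a short sketch. This is an illustration only, not the service's actual implementation: `plan_pickups` is a hypothetical helper that models only the group sizes. It assumes synchronous pickup (each group finishes before the next is taken) and that no new files arrive while processing, and it does not claim to know the order in which the service selects files.

```python
def plan_pickups(file_names, max_files_per_poll):
    """Split file_names into the groups a poll-by-poll pickup would produce.

    A value of 0 means no limit, matching the comment in the sample configuration.
    """
    if max_files_per_poll < 0:
        raise ValueError("max_files_per_poll must be 0 or greater")
    files = list(file_names)
    if max_files_per_poll == 0:
        # No limit: everything waiting in the folder is picked up at once.
        return [files] if files else []
    return [files[i:i + max_files_per_poll]
            for i in range(0, len(files), max_files_per_poll)]


if __name__ == "__main__":
    # 25 hypothetical input files with a limit of 10 per poll.
    names = [f"doc{i:05d}.docx" for i in range(25)]
    for number, batch in enumerate(plan_pickups(names, 10), start=1):
        print(f"poll {number}: {len(batch)} files")
```

With 25 files and a limit of 10, the sketch reports three pickups of 10, 10, and 5 files.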
<!-- This watch folder is set to allow for dropping a large number of files -->
<!-- at once. The files are picked up in small batches of up to 10 files until -->
<!-- all files have been completed. -->
<WatchFolder Name="LargeBatchTIFF Watch Folder">
  <Settings>
    <!-- Folder options -->
    <add Name="InputFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Input"/>
    <add Name="SearchFilter" Value="*.*"/>
    <add Name="IncludeSubFolders" Value="True"/>
    <add Name="DeleteInputSubFolders" Value="True"/>
    <add Name="StagingFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Staging"/>
    <add Name="WorkingFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Working"/>
    <add Name="FailedFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Failed"/>
    <!-- Set Completed Folder if we want to keep the Input Files -->
    <add Name="CompletedFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Completed"/>
    <!-- Set Completed Folder to empty to not keep the Input Files -->
    <!-- <add Name="CompletedFolder" Value=""/> -->
    <add Name="OutputFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Output"/>
    <add Name="PollingInterval" Value="15000"/>
    <add Name="DCOMComputerName" Value="localhost"/>
    <add Name="TestMode" Value="false"/>
    <add Name="NormalizeFilenames" Value="false"/>
    <add Name="CopyInstructionsFromResources" Value="ReadMe_LargeBatchTIFF"/>

    <!-- These settings control how many files in the batch -->
    <!-- are picked up each time; 0 means no limit. -->
    <add Name="Polling.MaxFilesToProcessAtATime" Value="10" />
    <add Name="Polling.SynchronousFilePickup" Value="true" />
    <add Name="UseTimeDateSubFoldersInCompletedFolder" Value="false" />
    <add Name="UseTimeDateSubFoldersInFailedFolder" Value="false" />

    <!-- Output file options -->
    <add Name="Devmode settings;Resolution" Value="300"/>
    <add Name="Save;Output File Format" Value="TIFF Multipaged"/>
    <add Name="Save;Append" Value="0"/>
    <add Name="Save;Color reduction" Value="BW"/>
    <add Name="Save;Dithering method" Value="Halftone"/>
    <!-- A value of 1 creates file.tif; change to 0 to create file.ext.tif -->
    <add Name="Save;Remove filename extension" Value="1" />
    <add Name="TIFF File Format;BW compression" Value="Group4"/>
    <add Name="TIFF File Format;Color compression" Value="LZW RGB"/>
    <add Name="TIFF File Format;Indexed compression" Value="LZW"/>
    <add Name="TIFF File Format;Greyscale compression" Value="LZW"/>
    <add Name="JPEG File Format;Color compression" Value="Medium Quality"/>
    <add Name="JPEG File Format;Greyscale compression" Value="High Quality"/>
  </Settings>
</WatchFolder>
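To double-check a section like the one above before starting the services, the settings can be read back with a small script. This is a sketch under stated assumptions: it parses only a standalone `<WatchFolder>` fragment (the real configuration file may wrap this section in other elements), `read_settings` and `output_name` are hypothetical helpers, and the `.tif` extension is assumed from the TIFF example in this article.

```python
import os
import xml.etree.ElementTree as ET

# Trimmed copy of the sample section; the real file contains more settings.
SAMPLE = r"""
<WatchFolder Name="LargeBatchTIFF Watch Folder">
  <Settings>
    <add Name="InputFolder" Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Input"/>
    <add Name="Polling.MaxFilesToProcessAtATime" Value="10" />
    <add Name="Polling.SynchronousFilePickup" Value="true" />
    <add Name="Save;Remove filename extension" Value="1" />
  </Settings>
</WatchFolder>
"""


def read_settings(watch_folder_xml):
    """Return a dict of Name -> Value for every <add> element in the fragment."""
    root = ET.fromstring(watch_folder_xml.strip())
    return {el.get("Name"): el.get("Value") for el in root.iter("add")}


def output_name(input_name, settings, extension=".tif"):
    """Predict the output file name from the 'Save;Remove filename extension' setting."""
    # A missing setting is treated like "0", as described in the steps.
    if settings.get("Save;Remove filename extension", "0") == "1":
        input_name = os.path.splitext(input_name)[0]
    return input_name + extension


if __name__ == "__main__":
    settings = read_settings(SAMPLE)
    print("Files per pickup:", settings["Polling.MaxFilesToProcessAtATime"])
    print("SampleFile.xlsx becomes", output_name("SampleFile.xlsx", settings))
```

With the setting at "1" the sketch predicts SampleFile.tif, and with "0" or no setting it predicts SampleFile.xlsx.tif, matching the naming rule described in the steps.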