Large Volume Batch Conversion Using Synchronous File Pickup

 

Deprecated - Use Clustering Instead

This method of converting a very large folder of files is deprecated in favor of clustered processing, which automatically picks up the files from the source folder without having to copy large numbers of files. Clustering is more efficient and allows for higher throughput. See Large Volume Batch Conversion Using Clustering.

 

The LargeBatchTIFF watch folder, included with the Watch Folder Service, is configured to handle a folder containing a large number of files. It picks up a limited number of files on each pass, converts them, and then returns for the next group, continuing until every file in the collection has been processed.

The Watch Folder Service was originally designed for use with hot folders or drop folders, where files to be converted are dropped periodically into a folder. It was meant to handle small groups of files arriving occasionally in the input folder. When files are detected in the input folder, the Watch Folder Service tries to copy the entire contents of the folder to its staging location for processing. With a folder containing a large volume of files, this can cause long delays while the files are copied, as well as other problems such as running out of disk space for the copies.

To allow for processing folders containing a very large number of files, the settings Polling.MaxFilesToProcessAtATime and Polling.SynchronousFilePickup were added. These settings control how many files are picked up at each polling interval, and whether the current batch of files must complete before the next group is picked up.

In this scenario, you would also typically set UseTimeDateSubFoldersInCompletedFolder and UseTimeDateSubFoldersInFailedFolder to false so that the date-timestamp folders for each mini-batch of files are not created under the output and failed folders.

You may also want to add the setting <add Name="Save;Remove filename extension" Value="1"/> to make sure that the file extension from the original source file is not used to name the output file. With this setting, the output file from a file named Manual.docx would be Manual.tif. If this setting is not included, or is set to "0", the output file name would be Manual.docx.tif.
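As a small illustration of the naming choice described above, the fragment below places the two values side by side. It is only a sketch that reuses the file.ext naming from the comment in the code sample; only one of the two entries would appear in an actual Settings section.

```xml
<!-- Value "1": the source extension is dropped, so file.ext is saved as file.tif -->
<add Name="Save;Remove filename extension" Value="1"/>

<!-- Value "0", or the setting left out: the source extension is kept, -->
<!-- so file.ext is saved as file.ext.tif -->
<!-- <add Name="Save;Remove filename extension" Value="0"/> -->
```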

As an extra precaution, we recommend making a copy of the original source files, if possible, and processing the copies. This ensures you still have your original collection of files if anything unexpected happens during the conversion process.

 

 

Code Sample

 

<WatchFolders>

  <!-- This watch folder is set to allow for dropping a large number of files -->
  <!-- at once. The files are picked up in small batches of up to 10 files until -->
  <!-- all files have been completed. -->
  <WatchFolder Name="LargeBatchTIFF Watch Folder">
    <Settings>

      <!-- Folder options -->
      <add Name="Enabled" Value="True"/>
      <add Name="InputFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Input\"/>
      <add Name="SearchFilter" Value="*.*"/>
      <add Name="IncludeSubFolders" Value="True"/>
      <add Name="DeleteInputSubFolders" Value="True"/>
      <add Name="StagingFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Staging"/>
      <add Name="WorkingFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Working"/>
      <add Name="FailedFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Failed"/>
      <add Name="CompletedFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Completed"/>
      <add Name="OutputFolder"
           Value="C:\PEERNET\WatchFolders\LargeBatchTIFF\Output"/>
      <add Name="PollingInterval" Value="15000"/>
      <add Name="DCOMComputerName" Value=""/>
      <add Name="TestMode" Value="false"/>

      <!-- These settings control how many files in the batch -->
      <!-- are picked up each time; 0 means no limit. -->
      <add Name="Polling.MaxFilesToProcessAtATime" Value="10"/>
      <add Name="Polling.SynchronousFilePickup" Value="true"/>

      <add Name="UseTimeDateSubFoldersInCompletedFolder" Value="false"/>
      <add Name="UseTimeDateSubFoldersInFailedFolder" Value="false"/>

      ....

      <add Name="Devmode settings;Resolution" Value="300"/>
      <add Name="Save;Output File Format" Value="TIFF Multipaged"/>

      <!-- Replace the above with this to create serialized images. -->
      <!-- <add Name="Save;Output File Format" Value="TIFF Serialized" /> -->

      <add Name="Save;Append" Value="0"/>
      <add Name="Save;Color reduction" Value="Optimal"/>
      <add Name="Save;Dithering method" Value="Halftone"/>

      <!-- A value of 1 creates file.tif; change to 0 to create file.ext.tif -->
      <add Name="Save;Remove filename extension" Value="1"/>

      <add Name="TIFF File Format;BW compression" Value="Group4"/>
      <add Name="TIFF File Format;Color compression" Value="LZW RGB0"/>
      <add Name="TIFF File Format;Indexed compression" Value="LZW"/>
      <add Name="TIFF File Format;Greyscale compression" Value="LZW"/>
      <add Name="JPEG File Format;Color compression" Value="Medium Quality"/>
      <add Name="JPEG File Format;Greyscale compression" Value="High Quality"/>

    </Settings>
  </WatchFolder>

</WatchFolders>