Starting with Document Conversion Service 3.0.009, the Watch Folder Service can extract and convert any attachments in Outlook Message files (*.msg) as well as converting the Outlook message file itself.
In Document Conversion Service 3.0.029, support was added for Electronic Mail message (*.eml) files with attachments. EML is the standard file extension for email messages stored in the Internet Message Format, which is defined by the RFC 5322 industry standard. An EML file can be a single message, or it can include attachments.
When this option is enabled, the file is checked for attachments and, if any are found, the original message file and all of its attachments are converted. With the initial settings, the resulting files are placed in the OutputFolder under a subfolder with the same name as the original file. Any attachment whose file type is not supported by Document Conversion Service is not converted and is placed in the Failed folder.
The original message and all attached files and embedded images in the e-mail and signature are processed. This includes recursively processing attachments that are Outlook Messages or EML files that themselves have attachments. All message content and file attachments are extracted into a subfolder with the same name as the original file. If any name collisions are detected, the file names are made unique by adding a number in brackets at the end of the name. As an example, an email message with an attached PDF document named lorem.pdf and an attached message that also has an attached PDF document of the same name will create two files: lorem.pdf.tif and lorem(2).pdf.tif.
The sample message below, Test Email With Attachments.msg, contains a single attached PDF file, as well as 5 small images from the signature. When the MSG is processed, the original message file and all attachments are processed. The attached PDF file will retain its name, and the inline images that are part of the signature will be named image001 through to image005.
When the above message was processed, the option to keep the original filename's extension as part of the new filename was enabled. The extension can be removed instead by using the setting <add Name="Save;Remove filename extension" Value="1"/>. With the extension removed, the output file from a file named lorem.pdf would become lorem.tif instead of lorem.pdf.tif.
Several settings have been added to control email message file attachment processing. Each of the included pre-configured WatchFolders already has these new settings added, with attachment processing disabled. To enable attachment processing, uncomment the PreprocessArchiveFormatsFilter setting. To disable it, comment the setting out again or set it to an empty string.
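As a minimal sketch, the enabled setting might look like the following once it is uncommented. The filter value is based on the default sample later in this section; the leading wildcard on *.msg is an assumption made to match the *.eml pattern, so check the value in your own configuration file.

```xml
<!-- Enabled: message files matching the filter are checked for attachments. -->
<!-- The *.msg wildcard form is an assumption; confirm against your configuration. -->
<add Name="PreprocessArchiveFormatsFilter" Value="*.msg|*.eml" />

<!-- Disabled alternative: comment the setting out, or set it to an empty string. -->
<!-- <add Name="PreprocessArchiveFormatsFilter" Value="" /> -->
```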
PreprocessArchive.IncludeExtensionInFolderName
Allows you to control whether the .msg or .eml file extension is included in the name used to create the subfolder that will hold the message and attachments for processing. In the screenshot above, the .msg file extension was kept as part of the subfolder name. To minimize possible name collisions, we recommend leaving this option enabled.
PreprocessArchive.CreateAllOutputInSubfolder
Added in version 3.0.025, this option controls whether the MSG or EML file and its extracted attachments are stored in a subfolder or at the root of the output folder. To minimize possible name collisions, we recommend leaving this option enabled unless you are certain that filenames are unique, or you are using Unique File Naming and Flat Folder Structures together with the MSG and EML Unique File Naming Options below.
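The recommended combination of the two folder settings above can be sketched as follows. The true/false value format for PreprocessArchive.IncludeExtensionInFolderName is an assumption, modeled on how PreprocessArchive.CreateAllOutputInSubfolder appears in the default sample.

```xml
<!-- Keep the .msg or .eml extension in the subfolder name (value format assumed). -->
<add Name="PreprocessArchive.IncludeExtensionInFolderName" Value="true" />
<!-- Store the converted message and its attachments in their own subfolder. -->
<add Name="PreprocessArchive.CreateAllOutputInSubfolder" Value="true" />
```

With both enabled, a message named Test Email With Attachments.msg would be expected to produce a subfolder that carries the full file name, extension included.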
The next three settings filter which message file attachments are actually converted. They apply to both MSG and EML files.
The first setting determines whether inline attachments are converted, and the other two settings allow further filtering of which e-mail attachments are processed. These filtering options are applied in the following order: inline attachments, then the include filter, and finally the exclude filter.
Most often only one of the include and exclude filters is used at a time, depending on how you need to filter. It is easier to exclude only "*.jpg" attachments, or to include only "*.pdf" attachments, than to write long, specific lists of all of the file types.
PreprocessArchive.MSG.IncludeInlineAttachments
Message attachments can be inline (pasted into the email body) or attached as separate files. Images used in signatures are often inline attachments, while a PDF file attached to the email would not be. You can disable the processing of all inline attachments by setting this value to false. Because some inline attachments can actually be documents, setting this to false is not recommended. This setting is always checked first, before the message attachment filtering settings below.
PreprocessArchive.MSG.AttachmentsIncludeFilter
Allows for filtering of what attachments will be processed. When set to an empty string, all attachments are processed. To filter for specific file types, enter in the extensions for each type separated by the pipe (|) character. For example, to only convert any attached Word and PDF documents, you could set this as <add Name="PreprocessArchive.MSG.AttachmentsIncludeFilter" Value="*.doc|*.docx|*.pdf" />. This setting is always applied after the inline attachment check above and before the exclude filter check below.
PreprocessArchive.MSG.AttachmentsExcludeFilter
The last filtering setting, and also the setting applied last, is the exclude filter, which determines what files (by extension) to not extract from the MSG. As with the include filter above, enter in the extensions for each file type you do not want to be converted, separated by the pipe (|) character. When left as an empty string, no files are excluded.
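The three filtering settings might be combined as in the sketch below, which processes everything except JPG attachments. The setting names and the pipe-separated filter syntax come from the descriptions above; the true/false value format for the inline setting is an assumption.

```xml
<!-- Checked first: process inline attachments such as signature images. -->
<add Name="PreprocessArchive.MSG.IncludeInlineAttachments" Value="true" />
<!-- Checked second: an empty include filter matches all attachments. -->
<add Name="PreprocessArchive.MSG.AttachmentsIncludeFilter" Value="" />
<!-- Checked last: skip JPG attachments; separate further types with the pipe character. -->
<add Name="PreprocessArchive.MSG.AttachmentsExcludeFilter" Value="*.jpg" />
```

Under these settings, an attached lorem.pdf would be converted while an attached JPG photograph would be skipped.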
Image attachments are converted to the new format using the source image's resolution and not the requested output format resolution. This applies when converting to image as well as PDF files. Up-scaling images to a higher resolution can result in images that are many times larger than the actual source image. Other factors such as going from JPG to a lossless format like TIFF can also cause an increase in file size.
PreprocessArchive.MSG.ImageAttachmentsKeepSourceResolution
This defaults to true. Conversion options such as fax mode and other image option actions can override this. This setting overrides the ConverterPlugIn.PNImageConverter.KeepSourceImageResolution setting and applies only to images extracted from an MSG or EML. Setting this to false may cause images to be very large.
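As a hedged illustration of the override described above, the sketch below turns off the general image converter setting while keeping source resolution for images extracted from messages. The value casing is an assumption, and whether this combination suits your output depends on the other conversion options in use.

```xml
<!-- General setting: applies to images that are not extracted from an MSG or EML. -->
<add Name="ConverterPlugIn.PNImageConverter.KeepSourceImageResolution" Value="false" />
<!-- Overrides the general setting for images extracted from an MSG or EML. -->
<add Name="PreprocessArchive.MSG.ImageAttachmentsKeepSourceResolution" Value="true" />
```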
Controlling Image Size With Compression
Another way to control the size of extracted and converted images is by setting the compression option for the output format. As an example, converting a JPG image such as a photograph from a camera to a TIFF image using the default settings of LZW compression will create a very large file.
To create a comparable TIFF image, we need to change the compression to one of the JPEG compression options for TIFF.
Code Sample - Controlling Image Size with Compression
<WatchFolders> ...
Added in version 3.0.025, these options apply unique names to all MSG and EML files and their attachments as they are processed when using the existing Unique File Naming and Flat Folder Structures settings. If either or both of OutputFolder.PrependUniqueGUIDToFilename and OutputFolder.AppendUniqueGUIDToFilename are true, a Globally Unique ID, or GUID, is added to the output filename for the message and any extracted attachments.
The default behavior is to use the same GUID in both the MSG subfolder (if one is used) and in all extracted and converted attachments.
PreprocessArchive.MSG.UseUniqueGUIDInMSGFolderName
Controls if a GUID is used in the MSG or EML folder name created to store the converted MSG or EML and attachments. Applies when PreprocessArchive.CreateAllOutputInSubfolder is true.
PreprocessArchive.MSG.UseSameGUIDForAllFiles
The default behavior is to use the same GUID in the folder and for all files extracted and converted into that folder. To use a random, unique GUID for each file, set this to false.
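A minimal sketch of a non-default configuration, in which every extracted file receives its own GUID, might look like this. It assumes that a GUID is prepended to filenames through the existing OutputFolder.PrependUniqueGUIDToFilename setting.

```xml
<!-- Add a GUID to the start of each output filename. -->
<add Name="OutputFolder.PrependUniqueGUIDToFilename" Value="true" />
<!-- Also use a GUID in the name of the message subfolder. -->
<add Name="PreprocessArchive.MSG.UseUniqueGUIDInMSGFolderName" Value="true" />
<!-- false: each extracted file gets its own GUID instead of sharing one. -->
<add Name="PreprocessArchive.MSG.UseSameGUIDForAllFiles" Value="false" />
```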
PreprocessArchive.MSG.UseMessageIDForPrependGUID
PreprocessArchive.MSG.UseMessageIDForAppendGUID
When enabled, any Message ID included in the source email header information is used in place of a randomly generated GUID when building the output folder name and the email and attachment filenames. If the option is enabled and there is no Message ID in the email, a randomly generated GUID is used instead.
PreprocessArchive.MSG.UseMessageIDWithoutFQDN
An email Message ID consists of a string of characters that uniquely identifies the message on the mail server, followed by an at sign (@) and the Fully Qualified Domain Name (FQDN) of the mail server that sent the message. By default, this option is set to true so that only the first part of the string, before the @, is used.
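The Message ID options can be sketched together as follows. The Message ID in the comment is hypothetical, and the exact way the ID is formatted in the final filename is not specified here, so treat the result as illustrative only.

```xml
<!-- GUID prepending must be on for the Message ID option below to have an effect. -->
<add Name="OutputFolder.PrependUniqueGUIDToFilename" Value="true" />
<!-- Use the Message ID from the email header in place of a random GUID. -->
<add Name="PreprocessArchive.MSG.UseMessageIDForPrependGUID" Value="true" />
<!-- Use only the part of the Message ID before the @ sign. -->
<!-- Hypothetical ID: 1234abcd@mail.example.com would contribute 1234abcd. -->
<add Name="PreprocessArchive.MSG.UseMessageIDWithoutFQDN" Value="true" />
```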
The last set of highlighted settings shown below is not included in the pre-configured WatchFolder settings. These are some recommended settings to help control the size of the final output files when dealing with Outlook Messages with attached images and logos in the signatures.
Code Sample - Default Outlook Message Processing
<WatchFolders>
  ...
  <add Name="PreprocessArchiveFormatsFilter" Value="*.msg|*.eml" />
  <!-- the chance of archive and folder name collision. -->
  <!-- and folder name collision. Can be used with OutputFolder.PrependUniqueGUIDToFilename or -->
  <!-- OutputFolder.AppendUniqueGUIDToFilename to flatten the structure and create unique names. -->
  <add Name="PreprocessArchive.CreateAllOutputInSubfolder" Value="true" />
  <!-- processing message attachments. Pass empty string for match all. Runs after -->
  <!-- inline attachment check above, precedes exclusion check below.-->
  <!-- processing message attachments. Pass empty string to exclude none. -->
  <!-- When converting image attachments to images, keep the new image's resolution the -->
  <!-- same as source image. Fax mode and other image option actions can override this. -->
  <!-- This setting overrides ConverterPlugIn.PNImageConverter.KeepSourceImageResolution -->
  <!-- The following settings are only used if OutputFolder.PrependUniqueGUIDToFilename -->
  <!-- or OutputFolder.AppendUniqueGUIDToFilename are set to true. -->
  <!-- When below is set to true, all files extracted from an MSG will have the same GUID. -->
  <add Name="PreprocessArchive.MSG.UseSameGUIDForAllFiles" Value="true" />
  <!-- When set to true, the folder used to store the msg and its attachments will -->
  <!-- be formatted with the pre-post GUID strings as set. -->
  <!-- If PreprocessArchive.MSG.UseSameGUIDForAllFiles is also true, the GUID -->
  <!-- in the folder name will match the files underneath. -->
  <add Name="PreprocessArchive.MSG.UseUniqueGUIDInMSGFolderName" Value="true" />
  <!-- When set to true, all files extracted from an MSG will use the ID, including -->
  <!-- the Fully Qualified Domain Name (FQDN) -->
  <add Name="PreprocessArchive.MSG.UseMessageIDForPrependGUID" Value="true" />
  <add Name="PreprocessArchive.MSG.UseMessageIDForAppendGUID" Value="true" />
  <!-- Set this to true to only use the first part of the Message ID, -->
  <!-- dropping the @FQDN part. -->
  <add Name="PreprocessArchive.MSG.UseMessageIDWithoutFQDN" Value="true" />
  <!-- Keep image resolution the same as source. Applies to all images -->
  <add Name="ConverterPlugIn.PNImageConverter.KeepSourceImageResolution" Value="True"/>
  <!-- Output file options -->