TestFilter

The TestFilter code demonstrates most of the Filtering methods available in the .NET API. The command-line options are listed in Options for FilterTestDotNet -ft1.

To run TestFilter, type the following at the command line:

FilterTestDotNet filtermode [options] input_file output_file

where:

filtermode is one of the modes listed in Filter modes.

options is one or more of the options listed in Options for FilterTestDotNet -ft1. Options are available for the -ft1 filter mode only.

input_file is the path and file name of the source file.

output_file is the path and file name of the generated file. If you do not specify a path, the file is output to the current directory.
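
The syntax above can be sketched as a small Python helper that assembles the arguments in the documented order. This is an illustration only: the helper name build_filter_command and the file names are hypothetical, and running the resulting command assumes FilterTestDotNet is on the PATH.

```python
import shlex

FILTER_MODES = ("-ft1", "-ft2", "-ft3", "-ft4")


def build_filter_command(mode, input_file, output_file, options=()):
    """Return the argument list: filtermode [options] input_file output_file."""
    if mode not in FILTER_MODES:
        raise ValueError(f"unknown filter mode: {mode}")
    options = list(options)
    # The documentation states that options apply to the -ft1 mode only.
    if options and mode != "-ft1":
        raise ValueError("options are available for the -ft1 filter mode only")
    return ["FilterTestDotNet", mode, *options, input_file, output_file]


# Hypothetical file names; -h and -ht are taken from the options table.
cmd = build_filter_command("-ft1", "report.doc", "report.txt", ["-h", "-ht"])
print(shlex.join(cmd))

# To execute the command (assumes FilterTestDotNet is on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces.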

Filter modes

Mode Description
-ft1 Filters an input file to an output file.
-ft2 Filters an input stream to an output file.
-ft3 Filters an input file to an output stream.
-ft4 Filters an input stream to an output stream.
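
The four modes above cover every combination of file and stream for input and output. As a rough illustration, the choice can be written as a lookup. The MODES table and the mode_for helper are hypothetical names used only for this sketch; they are not part of the KeyView API.

```python
# (input kind, output kind) -> filter mode, as listed in the table above.
MODES = {
    ("file", "file"): "-ft1",
    ("stream", "file"): "-ft2",
    ("file", "stream"): "-ft3",
    ("stream", "stream"): "-ft4",
}


def mode_for(input_kind, output_kind):
    """Return the filter mode flag for the given input and output kinds."""
    try:
        return MODES[(input_kind, output_kind)]
    except KeyError:
        raise ValueError("kinds must each be 'file' or 'stream'") from None


print(mode_for("stream", "file"))
```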

Options for FilterTestDotNet -ft1

Option Description
-co ooperrorlog Enable error logging. See Enable or Disable Error Logging. Error logs are not generated when in-process filtering is enabled.
-cs charset Set the character set of the source file. charset is a character set defined in the Filter class. See Coded Character Sets.

-ct tempfile Specify a temporary directory where temporary files generated by the filtering process are stored. The default is the current working directory. On Windows systems, there is a 64 K size limit to the temporary directory. When the limit is reached, you must either create a new directory or delete the contents of the existing directory; otherwise, you might receive an error message.

-cx xmlconfigfile Filter an XML file by using customized extraction settings defined in the kvxconfig.ini file. If you do not enter the full path to the INI file, the program looks for the file in the current working directory. See Filter XML Files.
-d Extract the file format information.
-dr binDir Specify the filter working directory where KeyView binaries are stored. Typically, this is the bin directory.
-fto timeout Specify a Filter timeout value, in seconds.
-h Extract headers and footers, as well as the body text.
-ht Put tags around header and footer data.
-i filename Extract the metadata (summary information) and write it to a file. filename is the name of the file to which the metadata is written. See Extract Metadata.

-ia summaryfile Extract the document summary information and write it to a summary file, including all metadata for the pdfsr reader.
-im Extract text that was deleted from a document with revision tracking enabled, and include it in the filtered output. See Extract Deleted Text Marked by Tracked Changes.
-ip Run Filter in the same process as the calling application (in process). See Run Filter In Process.
-lo Specify that PowerPoint PPT97 and PPTX file text data is output in a logical reading order.
-ne Exclude embedded objects in Microsoft Word files.
-pdfauto The PDF filter determines the paragraph direction (left-to-right or right-to-left) for each PDF page, and then sets the direction accordingly. See Filter PDF Files.

-pdfltr Specify that PDF files are output in a logical reading order in left-to-right paragraph direction.
-pdfrtl Specify that PDF files are output in a logical reading order in right-to-left paragraph direction.
-rc character Set a replacement character for characters that cannot be mapped. The default is a question mark (?).
-tc charset Set the character set of the output file. Use the -getTargetCS option to determine whether the target character set specified is used in the output file. charset is a character set defined in the Filter class. See Coded Character Sets.

-um Use MSBLSB (Most Significant Byte, Least Significant Byte) byte order, which is the byte order for big-endian systems (Unicode text only).
-ul Use LSBMSB (Least Significant Byte, Most Significant Byte) byte order, which is the byte order for little-endian systems (Unicode text only).
-ulb Generate LSBMSB output with byte order marker (Unicode text only).
-umb Generate MSBLSB output with byte order marker (Unicode text only).
-embeddedfont Do not filter text that contains embedded fonts from PDF documents. See Filter PDF Files.
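
The four byte-order options in the table above (-um, -ul, -umb, and -ulb) differ only in byte order and in whether a byte order marker is written. The sketch below illustrates the difference with Python's standard codecs rather than with FilterTestDotNet itself. It assumes that the Unicode output is UTF-16, which this page does not state explicitly.

```python
import codecs

text = "A"  # U+0041

# Byte layout for each option, under the assumption of UTF-16 output.
layouts = {
    "-um": text.encode("utf-16-be"),                         # MSBLSB, no marker
    "-ul": text.encode("utf-16-le"),                         # LSBMSB, no marker
    "-umb": codecs.BOM_UTF16_BE + text.encode("utf-16-be"),  # MSBLSB with marker
    "-ulb": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),  # LSBMSB with marker
}

for option, data in layouts.items():
    # Print each layout as space-separated hex bytes.
    print(option, data.hex(" "))
```

A reader of the output file can use the marker, when present, to detect the byte order; without it, the byte order must be known in advance.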
