DetectEncodingAndLanguage
 
Description

Automatically detects the encoding and language of a document from the content of the field specified in ImportFieldOpParam<N>. The detected language and encoding are written, as a single combined value, to the field specified in ImportFieldOpApplyTo<N>.

Example

ImportFieldOpApplyTo0=LangType
ImportFieldOp0=DetectEncodingAndLanguage
ImportFieldOpParam0=DRECONTENT

In this example, the Import Module uses the contents of the document’s DRECONTENT field to automatically detect the document’s language and encoding. In the output, the result appears in the LangType field as a single value, for example ENGLISHUTF8.

In addition, the detected encoding and language are written separately to the ImportDetectedEncoding and ImportDetectedLanguage fields, respectively.

In an XML output file, the results appear as follows:

<LangType>ENGLISHUTF8</LangType>
<ImportDetectedEncoding>UTF8</ImportDetectedEncoding>
<ImportDetectedLanguage>ENGLISH</ImportDetectedLanguage>
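
As a rough illustration of how a downstream script might read these fields, the following Python sketch parses a fragment like the one above. The wrapping DOCUMENT element, the SAMPLE constant, and the read_detection_fields function are assumptions made for illustration only; they are not part of the Import Module or its output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element: the fragment shown above has no single
# root, so one is added here purely to make it parseable as XML.
SAMPLE = """<DOCUMENT>
<LangType>ENGLISHUTF8</LangType>
<ImportDetectedEncoding>UTF8</ImportDetectedEncoding>
<ImportDetectedLanguage>ENGLISH</ImportDetectedLanguage>
</DOCUMENT>"""


def read_detection_fields(xml_text, apply_to_field="LangType"):
    """Return (combined, language, encoding) read from the output fields.

    apply_to_field is the field name given in ImportFieldOpApplyTo<N>
    (LangType in the example configuration).
    """
    root = ET.fromstring(xml_text)
    combined = root.findtext(apply_to_field)
    language = root.findtext("ImportDetectedLanguage")
    encoding = root.findtext("ImportDetectedEncoding")
    return combined, language, encoding


combined, language, encoding = read_detection_fields(SAMPLE)
print(combined, language, encoding)
```

Reading ImportDetectedLanguage and ImportDetectedEncoding directly avoids having to split the combined LangType value, whose exact composition is only shown here for the single ENGLISHUTF8 case.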