If your HPE IDOL Server license includes automatic language detection, HPE IDOL Server can automatically identify the language and encoding of a document when it is indexed. HPE IDOL Server analyzes a certain amount of text in the document content fields (fields for which SourceType is set to True in the HPE IDOL Server configuration file).
Open the HPE IDOL Server configuration file in a text editor.
Find the [Server] section and add this setting:
True if you do not want to index documents whose language HPE IDOL Server cannot recognize. For example, the document might contain no natural-language text, or too little text for HPE IDOL Server to determine the language.
By default HPE IDOL Server indexes the document using the default language type. It also logs a warning message in the index log, so that you can add an appropriate language type.
You can change the amount of text that HPE IDOL Server analyzes to detect the language of a document. By default, HPE IDOL Server uses only a few sentences. In some situations, increasing the amount of text to analyze can give more accurate results, such as when significant amounts of a minor second language are present.
By default, HPE IDOL Server detects any 7-bit ASCII characters as UTF-8. If you instead want to group these documents with documents using 8-bit ASCII, disable the LangDetectUTF8 parameter by setting it to False.
Ensure that the encoding option you want is present in the language type configuration (see Define Language Types). If there are no compatible encodings configured for the detected language, HPE IDOL Server assigns the default language type.
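The following Python sketch illustrates why text made up only of 7-bit ASCII characters can be grouped with either UTF-8 or an 8-bit encoding: its bytes decode to the same string both ways. It is not HPE IDOL Server's detection logic. ISO-8859-1 (`latin-1`) stands in for an 8-bit encoding, and the helper names `is_seven_bit` and `decodes_as_utf8` are hypothetical.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample inputs (illustrative only).
plain = b"Language detection"                   # 7-bit ASCII only
accented = "d\u00e9tection".encode("latin-1")   # contains the byte 0xE9

# Pure 7-bit text is byte-for-byte identical in UTF-8 and ISO-8859-1,
# so the choice of encoding for it is a matter of policy, not content.
same_text = plain.decode("utf-8") == plain.decode("latin-1")
```

Text that contains bytes above 0x7F, such as `accented`, does not have this ambiguity: here the ISO-8859-1 bytes are not valid UTF-8, so the two encodings can be told apart.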
Save and close the configuration file. Restart HPE IDOL Server for your changes to take effect.
If you enable automatic language detection and set up a field process that reads the language of a document from one of its fields, HPE IDOL Server uses the field process rather than autodetection to determine the document language and encoding.
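The order of precedence described in this topic can be summarized in a short Python sketch. This is only a model of the documented behavior, not HPE IDOL Server code. The function name, its arguments, the `discard_unknown` flag, and the language type names (such as `englishUTF8`) are assumptions made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def resolve_language_type(
    field_language_type: Optional[str],
    detected: Optional[Tuple[str, str]],
    configured: Dict[str, Set[str]],
    default_language_type: str,
    discard_unknown: bool = False,
) -> Optional[str]:
    """Model how a document's language type is chosen.

    Returns the language type name, or None if the document is not indexed.
    """
    # 1. A field process that reads the language from a document field
    #    takes precedence over automatic detection.
    if field_language_type is not None:
        return field_language_type

    # 2. The language could not be recognized: either skip the document
    #    or fall back to the default language type.
    if detected is None:
        return None if discard_unknown else default_language_type

    # 3. Use the detected language only if a compatible encoding is
    #    configured for it; otherwise use the default language type.
    language, encoding = detected
    if encoding in configured.get(language, set()):
        return language + encoding
    return default_language_type


# Hypothetical language type configuration: language -> configured encodings.
configured = {"english": {"UTF8", "ASCII"}, "french": {"UTF8"}}

from_field = resolve_language_type("germanUTF8", ("english", "UTF8"), configured, "englishUTF8")
from_detection = resolve_language_type(None, ("french", "UTF8"), configured, "englishUTF8")
no_encoding = resolve_language_type(None, ("french", "ASCII"), configured, "englishUTF8")
```

In this model, `from_field` keeps the value supplied by the field process, `from_detection` uses the detected language and encoding, and `no_encoding` falls back to the default language type because no compatible encoding is configured for the detected language.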