Enable Automatic Language Detection
If your IDOL server license includes Automatic Language Detection, IDOL server can automatically identify the language and encoding of a document when it is indexed. IDOL server analyzes a certain amount of text in the document content fields (fields for which SourceType=true in the IDOL server configuration file).
To enable Automatic Language Detection
1. Open the IDOL server configuration file in a text editor.
2. Find the [Server] section and add this setting:
AutoDetectLanguagesAtIndex=true
3. Set DiscardUnconfiguredLanguagesAtIndex to true if you do not want to index documents whose detected language type is not configured.
Set DiscardUnknownLanguagesAtIndex to true if you do not want to index documents whose language IDOL server cannot recognize. For example, the document might contain no natural-language text, or too little text for IDOL server to determine the language.
By default, IDOL server indexes such a document using the default language type. It also logs a warning message in the index log, so that you can add an appropriate language type.
4. You can change the amount of text that IDOL server analyzes to detect the language of a document. By default, IDOL server uses only a few sentences. In some situations, analyzing more text gives more accurate results, for example when a document contains a significant amount of a minor second language.
Add the MaxLanguageDetectTerms setting to the [Server] section, specifying the number of terms (words) that IDOL server uses for detection. For example:
MaxLanguageDetectTerms=1000
5. Set the LangDetectUTF8 parameter to true if you want files that contain only 7-bit ASCII to be detected as UTF-8, rather than ASCII.
Automatic Language Detection uses the Index fields to detect languages by default. If these fields contain only 7-bit ASCII, it detects the document as ASCII. If additional fields in the document contain UTF-8, IDOL may convert them incorrectly. If you know that documents are generally in UTF-8, then set the LangDetectUTF8 parameter to true in the [Server] section.
6. Save and close the configuration file.
7. Restart IDOL server for your changes to take effect.
NOTE: If you enable Automatic Language Detection and set up a field process that reads a document's language from one of its fields, IDOL server uses the field process rather than autodetection to determine the document language and encoding.
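The settings described in the procedure can be combined into a single [Server] section. The following fragment is a minimal sketch rather than a recommended configuration. It assumes that the two Discard settings also belong in the [Server] section (the procedure does not state their section), and it turns on every optional setting purely for illustration. The value 1000 is the example value from the procedure, not a tuned recommendation.

```ini
[Server]
// Detect the language and encoding of each document at index time.
AutoDetectLanguagesAtIndex=true

// Optional. Assumed here to belong in [Server].
// Do not index documents whose language type is not configured.
DiscardUnconfiguredLanguagesAtIndex=true
// Do not index documents whose language cannot be recognized.
DiscardUnknownLanguagesAtIndex=true

// Optional. Number of terms (words) analyzed to detect the language.
MaxLanguageDetectTerms=1000

// Optional. Detect 7-bit ASCII content as UTF-8 rather than ASCII.
LangDetectUTF8=true
```

If you would rather index unrecognized or unconfigured documents under the default language type and review the warnings in the index log, leave out the two Discard settings.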