You can override the default stemming rules for certain words in a particular language by creating a language-specific stemming file. This file is a list of words and their stems. If a stemming file exists, the IDOL Content component uses it to stem the terms that it contains. Terms that are not in the file stem according to the default stemming rules.
HPE recommends that you use a stemming file only for unusual or specialized terms where the default rules do not generate a suitable stem. A stemming file is not intended to be a complete replacement for the IDOL stemming algorithms.
Create a text file.
Format the file as a stop word list. The first line is an encoding designation. Each subsequent line contains one word pair: a term followed by its stem. For example:
[UTF8]
mice mouse
mouse mouse
children child
The terms and stems can contain only alphanumeric characters.
To ensure that two words stem to the same value, you must add both words to the stemming file, with the appropriate stem.
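The format described above can be sketched in Python. This is only an illustration: the helper name `write_stemming_file`, the temporary output location, and the mapping of the `[UTF8]` designation to Python's `utf-8` codec are assumptions made for the example, not part of IDOL.

```python
import tempfile
from pathlib import Path


def write_stemming_file(path, pairs, encoding_tag="UTF8"):
    """Write an encoding line followed by one 'term stem' pair per line.

    Hypothetical helper for illustration; it is not an IDOL API.
    """
    lines = [f"[{encoding_tag}]"]
    for term, stem in pairs:
        # Terms and stems can contain only alphanumeric characters.
        if not (term.isalnum() and stem.isalnum()):
            raise ValueError(f"non-alphanumeric entry: {term!r} {stem!r}")
        lines.append(f"{term} {stem}")
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


pairs = [
    ("mice", "mouse"),
    ("mouse", "mouse"),  # list the stem itself so that both words share a stem
    ("children", "child"),
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "english_stem.dat"
    written = write_stemming_file(target, pairs)
    file_text = target.read_text(encoding="utf-8")
```

Rejecting non-alphanumeric entries before writing catches formatting mistakes early, rather than leaving them to be discovered when the file is loaded.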
Save the file with a name of your choice (for example, english_stem.dat) in the directory
Open the IDOL Content component configuration file in a text editor.
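In the [MyLanguage] section for the language, set StemmingFile to the stemming file that you created. For example:
[english]
Encodings=UTF8:englishUTF8
Stoplist=english.dat
StemmingFile=english_stem.dat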
Ensure that this [MyLanguage] section does not set Stemming to False. The default value for Stemming in a language is True.
If you disable stemming for a language, but provide a stemming file, Content stems terms in the file, but does not stem other terms.
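The lookup behavior described in this topic can be modeled with a short Python sketch. It is not how Content is implemented: the function names are hypothetical, `toy_stemmer` is a placeholder for the default IDOL stemming rules, and any case normalization that Content performs is ignored.

```python
def load_stemming_file(text):
    """Parse stemming-file text into a {term: stem} dict.

    The first line (the encoding designation) is skipped.
    """
    overrides = {}
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            term, stem = parts
            overrides[term] = stem
    return overrides


def stem_term(term, overrides, default_stemmer, stemming_enabled=True):
    """Return the stem for a term, giving the stemming file priority."""
    if term in overrides:
        # Terms in the file are stemmed even when stemming is disabled.
        return overrides[term]
    if stemming_enabled:
        return default_stemmer(term)
    return term  # stemming disabled and no entry in the file


def toy_stemmer(term):
    # Placeholder only: strips a trailing "s". Not the IDOL algorithm.
    return term[:-1] if term.endswith("s") else term


overrides = load_stemming_file(
    "[UTF8]\nmice mouse\nmouse mouse\nchildren child\n"
)
from_file = stem_term("mice", overrides, toy_stemmer)
from_default = stem_term("cats", overrides, toy_stemmer)
disabled_other = stem_term("cats", overrides, toy_stemmer, stemming_enabled=False)
disabled_in_file = stem_term("children", overrides, toy_stemmer, stemming_enabled=False)
```

In this model, the stemming file is consulted first, so disabling stemming affects only terms that the file does not list.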