Create a Custom Stem File for a Language
You can override the default stemming rules for certain words in a given language by creating a language-specific stemming file. This file is a list of words and their stems. If a stemming file exists, IDOL server stems the terms that it contains according to the file. Terms that are not in the file are stemmed according to the default stemming rules.
Autonomy recommends use of a stemming file only for unusual or specialized terms where the default rules do not generate a stem. A stemming file is not intended to be a complete replacement for the IDOL stemming algorithms.
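The override behavior described above can be pictured with a short Python sketch. This is an illustration only, not IDOL server's implementation: toy_default_stem is a hypothetical stand-in for the default stemming rules, and the overrides dictionary stands in for the contents of a stemming file.

```python
from typing import Callable, Dict


def toy_default_stem(term: str) -> str:
    """Hypothetical stand-in for default stemming: strip one trailing 's'."""
    if len(term) > 3 and term.endswith("s"):
        return term[:-1]
    return term


def stem(
    term: str,
    overrides: Dict[str, str],
    default_stem: Callable[[str], str] = toy_default_stem,
) -> str:
    """Use the stem from the override table if the term is listed there;
    otherwise fall back to the default stemming rules."""
    if term in overrides:
        return overrides[term]
    return default_stem(term)


# Assumed contents of a stemming file, expressed as term -> stem.
overrides = {"mice": "mouse", "mouse": "mouse", "children": "child"}

for word in ("mice", "children", "cats"):
    print(word, "->", stem(word, overrides))
```

In this sketch, "mice" and "children" take their stems from the override table, while "cats" is not listed and falls through to the toy default rule.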
To set up a stemming file
1.
2. Format the file as a stop word list. The first line is an encoding designation. Each subsequent line contains one word pair: a term followed by its stem. For example:
[UTF8]
mice mouse
mouse mouse
children child
 
NOTE: To ensure that two words stem to the same value, you must add both words to the stemming file, each with the appropriate stem.
3. Save the file with a name of your choice (for example, english_stem.dat) in the directory installDir/idol/IDOLServer/serviceAlias/langfiles.
4. Open the IDOL server configuration file. In the [MyLanguage] section for the language that the stemming file applies to, set the StemmingFile configuration parameter to the name of your stemming file. For example:
[english]
Encodings=ASCII:englishASCII,UTF8:englishUTF8
Stoplist=english.dat
StemmingFile=english_stem.dat
5. Ensure that this [MyLanguage] section does not set Stemming to false. The default value for Stemming in a language is true.
If you disable stemming for a language but provide a stemming file, IDOL server stems the terms in the file but does not stem any other terms.
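If you maintain the word pairs elsewhere, a small script can generate the stemming file in the format described in the procedure above. The following Python sketch is one possible approach, not a tool supplied with IDOL server. The function names build_stem_lines and write_stem_file are hypothetical, and the choice to add a "stem stem" line automatically for every stem (so that a word and its stem resolve to the same value, as in the mice/mouse example) is an assumption of this sketch.

```python
from pathlib import Path
from typing import Dict, List


def build_stem_lines(pairs: Dict[str, str]) -> List[str]:
    """Return the lines of a stemming file: an encoding designation,
    then one 'term stem' pair per line."""
    entries = dict(pairs)
    # Assumption of this sketch: also list each stem as a term that maps
    # to itself, unless the caller already supplied an entry for it.
    for stem in pairs.values():
        entries.setdefault(stem, stem)

    lines = ["[UTF8]"]
    for term, stem in entries.items():
        for word in (term, stem):
            # Each line holds exactly two whitespace-separated words.
            if not word or any(ch.isspace() for ch in word):
                raise ValueError(f"invalid word in stemming pair: {word!r}")
        lines.append(f"{term} {stem}")
    return lines


def write_stem_file(path: str, pairs: Dict[str, str]) -> None:
    """Write the stemming file as UTF-8 text, matching the [UTF8] header."""
    Path(path).write_text("\n".join(build_stem_lines(pairs)) + "\n", encoding="utf-8")


# Preview the file contents without writing anything to disk.
print("\n".join(build_stem_lines({"mice": "mouse", "children": "child"})))
```

To create the file itself, call write_stem_file with the target path, for example write_stem_file("english_stem.dat", {"mice": "mouse", "children": "child"}), and then copy the result to the langfiles directory named in the procedure.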