NGramMultiByteOnly

Specifies whether to tokenize all strings into N-grams or only multi-byte strings. Set this parameter to True to restrict N-gram tokenization to multi-byte strings. Use it in combination with the NGram parameter, which determines the size of the character N-grams.

For example, if you set NGramMultiByteOnly to True and a document contains both English and Asian text, HPE Content Component tokenizes the Asian text into N-grams according to the NGram setting. It does not tokenize the English text into N-grams.
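
The behavior described above can be illustrated with a minimal Python sketch. This is not HPE Content Component's tokenizer. The function names (is_multibyte, char_ngrams, tokenize) are hypothetical, and the sketch assumes that "multi-byte" means a character that takes more than one byte in UTF-8, that N-grams overlap, and that text is first split on whitespace.

```python
from itertools import groupby


def is_multibyte(ch: str) -> bool:
    # Assumption: "multi-byte" means more than one byte when encoded as UTF-8.
    return len(ch.encode("utf-8")) > 1


def char_ngrams(s: str, n: int) -> list[str]:
    # Overlapping character N-grams; strings no longer than n are kept whole.
    if len(s) <= n:
        return [s] if s else []
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def tokenize(text: str, ngram: int, multibyte_only: bool) -> list[str]:
    tokens = []
    for word in text.split():
        if not multibyte_only:
            # Every string is split into N-grams.
            tokens.extend(char_ngrams(word, ngram))
            continue
        # Only runs of multi-byte characters are split into N-grams;
        # single-byte runs are kept as whole terms.
        for multibyte, run in groupby(word, key=is_multibyte):
            chunk = "".join(run)
            if multibyte:
                tokens.extend(char_ngrams(chunk, ngram))
            else:
                tokens.append(chunk)
    return tokens


print(tokenize("search 東京都", 2, True))   # ['search', '東京', '京都']
print(tokenize("search 東京都", 2, False))  # ['se', 'ea', 'ar', 'rc', 'ch', '東京', '京都']
```

With multibyte_only set to True, the English word stays a single term and only the multi-byte text is split, which mirrors the combination of NGram=2 and NGramMultiByteOnly=True.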

Type: Boolean
Default: False
Required: No
Configuration Section: LanguageTypes or MyLanguage
Example:
NGram=2
NGramMultiByteOnly=True
See Also: NGram
