Term Separators

The IDOL Content component automatically generates separators for each language to determine where one term ends and another begins. These include characters such as spaces, tabs, carriage returns, and line feeds.

To ensure that Content uses a character as a separator, specify it in the AugmentSeparators configuration parameter. Content replaces all separator characters with a space.

For example, the following table describes query matching when AugmentSeparators is set to ,- (a comma and a hyphen).

Indexed string        Query terms matched
second-hand guitar    second, hand, guitar

NOTE:

The hyphen is a separator only if it is not listed in HyphenChars, because HyphenChars takes precedence over separators.
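
The separator replacement described above can be sketched in a few lines of Python. This is only an illustrative model of the documented behavior, not Content's implementation: the function name tokenize_with_augmented_separators is hypothetical, and the sketch assumes the hyphen is not listed in HyphenChars.

```python
def tokenize_with_augmented_separators(text: str, augment_separators: str) -> list[str]:
    """Illustrative model only (hypothetical helper, not an IDOL API).

    Replaces every character listed in augment_separators with a space,
    then splits the result on whitespace to obtain terms.
    """
    # Map each configured separator character to a space.
    table = str.maketrans({ch: " " for ch in augment_separators})
    return text.translate(table).split()


# Models AugmentSeparators=,- applied to the indexed string in the table.
print(tokenize_with_augmented_separators("second-hand guitar", ",-"))
# ['second', 'hand', 'guitar']
```

Under these assumptions, a query for second, hand, or guitar would each find the indexed string, which agrees with the table.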

To ensure that Content does not use a character as a separator, specify it in the DiminishSeparators configuration parameter. Content removes these non-separator characters at index time, so the text on either side of the character is joined into a single term.

For example, the following table describes query matching when DiminishSeparators is set to _% (an underscore and a percent sign).

Indexed string        Query terms matched
file_name             filename
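
As a rough illustration of the removal step, the following Python sketch deletes the configured characters before splitting on whitespace. The function name tokenize_with_diminished_separators is hypothetical, and the sketch models only the behavior shown in the table, not how Content works internally.

```python
def tokenize_with_diminished_separators(text: str, diminish_separators: str) -> list[str]:
    """Illustrative model only (hypothetical helper, not an IDOL API).

    Deletes every character listed in diminish_separators, so the text on
    either side is joined, then splits the result on whitespace.
    """
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", diminish_separators)
    return text.translate(table).split()


# Models DiminishSeparators=_% applied to the indexed string in the table.
print(tokenize_with_diminished_separators("file_name", "_%"))
# ['filename']
```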

To ensure that Content indexes a character as its own token, specify it in the SoftSeparators configuration parameter.

For example, the following table describes query matching when SoftSeparators is set to 1234567890 (all ten digits).

Indexed string        Query terms matched
459                   4, 5, 9, 45, 59, 459

In this example, Content tokenizes all numbers as single digits, so that 459 is indexed as 4 5 9.
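
The following Python sketch models this example. It assumes that a query is tokenized in the same way as the indexed text, and that a query matches when its tokens appear as a contiguous run in the indexed tokens. That matching rule is an inference from the table (it lists 45 and 59 but not 49), not a documented description of Content's query engine, and both function names are hypothetical.

```python
def tokenize_with_soft_separators(text: str, soft_separators: str) -> list[str]:
    """Illustrative model only (hypothetical helper, not an IDOL API).

    Emits each soft separator character as its own token and splits the
    remaining text on whitespace.
    """
    tokens: list[str] = []
    current: list[str] = []
    for ch in text:
        if ch in soft_separators or ch.isspace():
            # Close the term collected so far, if any.
            if current:
                tokens.append("".join(current))
                current = []
            # A soft separator becomes a token; whitespace is dropped.
            if not ch.isspace():
                tokens.append(ch)
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def query_matches(indexed: str, query: str, soft_separators: str) -> bool:
    """Assumed rule: the query tokens must occur contiguously in the indexed tokens."""
    doc = tokenize_with_soft_separators(indexed, soft_separators)
    q = tokenize_with_soft_separators(query, soft_separators)
    n = len(q)
    return n > 0 and any(doc[i:i + n] == q for i in range(len(doc) - n + 1))


digits = "1234567890"
print(tokenize_with_soft_separators("459", digits))  # ['4', '5', '9']
print(query_matches("459", "45", digits))            # True
print(query_matches("459", "49", digits))            # False
```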

