ProperNames

The method to use to handle proper names when indexing.

In IDOL, proper name terms are pairs of adjacent words that each begin with a capital letter followed by lowercase letters, such as Alex Smith or Dragon Restaurant. Depending on the setting that you use, you can also include adjacent pairs of terms that include stop words, such as The Who or The Queen.

By default, IDOL Content Component stems individual terms and discards stop word terms, regardless of capitalization. Use the ProperNames parameter if you want to store additional terms where capitalized terms occur together. In these cases, IDOL Content Component compounds the two adjacent terms and indexes the combined unit as a term. For example, Alex Smith is compounded to ALEXSMITH. For longer proper name strings, each pair of terms is compounded separately.
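
The pairwise compounding described above can be sketched as follows. This is an illustration only, not IDOL Content Component code: the function name compound_pairs and the name Mary Jane Smith are hypothetical, and the sketch skips stemming, stop word handling, and punctuation between words.

```python
import re

# Assumption: a proper name word is one capital letter followed by
# one or more lowercase letters, as described in this topic.
PROPER = re.compile(r"[A-Z][a-z]+")

def compound_pairs(text):
    """Return the unstemmed, uppercase compound for each adjacent pair of proper name words."""
    words = re.findall(r"[A-Za-z]+", text)
    pairs = []
    # Walk over every adjacent pair, so a three-word name yields two compounds.
    for left, right in zip(words, words[1:]):
        if PROPER.fullmatch(left) and PROPER.fullmatch(right):
            pairs.append((left + right).upper())
    return pairs

print(compound_pairs("Alex Smith"))       # ['ALEXSMITH']
print(compound_pairs("Mary Jane Smith"))  # ['MARYJANE', 'JANESMITH']
```

The second call shows how a longer proper name string produces one compound per adjacent pair rather than a single three-word term.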

This option can improve the relevance of results when you search for proper name terms. For example, a search for George Washington might return documents that contain the phrase George Bush in Washington, D.C. Indexing proper name terms might improve relevance for documents that include the exact name George Washington. For more information about when to use the ProperNames parameter, refer to IDOL Expert.

This parameter accepts the following values:

0 Do not store proper name terms.
1 Adjacent capitalized terms are compounded, then stemmed and indexed as a unit. For example, Sam James is indexed as SAMJAM.
2

Adjacent terms are compounded (regardless of capitalization), then stemmed and indexed as a unit. For example, bottlenose dolphins is indexed as BOTTLENOSEDOLPHIN.

NOTE:

This setting considerably increases the number of terms in the IDOL Content Component index, which can slow down its performance.

The following ProperNames options allow you to query for proper name terms that include stop words (for example, The Who, or The Queen).

NOTE:

The following settings affect only stop words that start with a capital letter. In all cases, the indexing of individual stop word terms is controlled by the StopWordIndex configuration parameter.

3

Adjacent capitalized stop words are compounded, then stemmed and indexed as a unit. For example, And His is indexed as ANDHI.

Adjacent capitalized terms are compounded, then stemmed and indexed as a unit. For example, Sam James is indexed as SAMJAM.

Capitalized stop words adjacent to capitalized terms are treated as individual terms. For example, The Queen is treated as THE and QUEEN, according to your stop word rules.

4

Capitalized stop words are compounded with adjacent capitalized terms, then stemmed and indexed as a unit. For example, The Bells is indexed as THEBEL, and Calling Will is indexed as CALLINGWIL.

Adjacent capitalized stop words are compounded, then stemmed and indexed as a unit.

Adjacent capitalized terms are compounded, then stemmed and indexed as a unit.

5

Adjacent capitalized stop words are compounded and indexed unstemmed as a unit. For example, And His is indexed as ANDHIS.

Adjacent capitalized terms are compounded and indexed unstemmed as a unit. For example, Sam James is indexed as SAMJAMES.

Capitalized stop words adjacent to capitalized terms are treated as individual terms.

6

Capitalized stop words are compounded with adjacent capitalized terms, and indexed unstemmed as a unit. For example, The Bells is indexed as THEBELLS, and Calling Will is indexed as CALLINGWILL.

Adjacent capitalized stop words are compounded and indexed unstemmed as a unit.

Adjacent capitalized terms are compounded and indexed unstemmed as a unit.

7

Capitalized stop words are compounded with adjacent capitalized terms, and indexed unstemmed as a unit.

Adjacent capitalized stop words are compounded and indexed unstemmed as a unit.

Adjacent capitalized terms are treated as individual terms. For example, Sam James is indexed as SAM and JAME.

The following table shows the terms that IDOL Content Component stores for each ProperNames setting for the sentence Tom Jones And His greatest hits:

Original  Tom Jones And His greatest hits
0         TOM JONE GREAT HIT
1         TOM TOMJON JONE GREAT HIT
2         TOM TOMJON JONE GREAT GREATESTHIT HIT
3         TOM TOMJON JONE ANDHI GREAT HIT
4         TOM TOMJON JONE JONESAND ANDHI GREAT HIT
5         TOM TOMJONES JONE ANDHIS GREAT HIT
6         TOM TOMJONES JONE JONESAND ANDHIS GREAT HIT
7         TOM JONE JONESAND ANDHIS GREAT HIT

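
The rules for settings 0 through 7 can be checked against the table with a short sketch. It is an approximation under stated assumptions, not the IDOL implementation: STOP_WORDS holds only the two stop words in this sentence, STEMS is a toy lookup whose values are copied from the table rather than produced by a real stemmer, individual stop word terms are assumed to be discarded, and the names pair_term and stored_terms are hypothetical.

```python
import re

STOP_WORDS = {"AND", "HIS"}  # assumption: stop words for this sentence only
# Toy stemmer: values copied from the table, not IDOL's stemming algorithm.
STEMS = {"JONES": "JONE", "GREATEST": "GREAT", "HITS": "HIT",
         "TOMJONES": "TOMJON", "ANDHIS": "ANDHI", "GREATESTHITS": "GREATESTHIT"}
CAPITALIZED = re.compile(r"[A-Z][a-z]+")

def stem(term):
    return STEMS.get(term, term)

def pair_term(left, right, setting):
    """Return the compound term for an adjacent pair, or None if the setting stores none."""
    stops = sum(w.upper() in STOP_WORDS for w in (left, right))
    both_cap = bool(CAPITALIZED.fullmatch(left) and CAPITALIZED.fullmatch(right))
    compound = (left + right).upper()
    if stops == 0:  # two ordinary terms
        if setting == 2 or (both_cap and setting in (1, 3, 4)):
            return stem(compound)
        if both_cap and setting in (5, 6):
            return compound
    elif both_cap and stops == 2:  # two capitalized stop words
        if setting in (3, 4):
            return stem(compound)
        if setting in (5, 6, 7):
            return compound
    elif both_cap and stops == 1:  # capitalized stop word next to a capitalized term
        if setting == 4:
            return stem(compound)
        if setting in (6, 7):
            return compound
    return None

def stored_terms(sentence, setting):
    """List the single and compound terms stored for a sentence, in word order."""
    words = sentence.split()
    terms = []
    for i, word in enumerate(words):
        if word.upper() not in STOP_WORDS:  # assumed: single stop words are discarded
            terms.append(stem(word.upper()))
        if i + 1 < len(words):
            pair = pair_term(word, words[i + 1], setting)
            if pair:
                terms.append(pair)
    return terms

for setting in range(8):
    print(setting, " ".join(stored_terms("Tom Jones And His greatest hits", setting)))
```

Each printed line lists the same terms as the corresponding table row. For example, setting 4 prints TOM TOMJON JONE JONESAND ANDHI GREAT HIT.
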
Type: Long
Default: 0
Required: No
Configuration Section: LanguageTypes or MyLanguage
Example: ProperNames=1
See Also: AdvancedSearch, StopWordIndex
NOTE:

If you change this setting after you have indexed content into IDOL Server, the new setting applies only to new content, and the server logs a warning. To clear the warning and ensure that your change applies to all your content, you must initialize your index and reindex the content.

