Reference-Based Indexing

You can use reference-based indexing to distribute the indexing load evenly between IDOL Servers. In this mode, the document reference determines which server indexes each document. In simple non-mirror mode, DIH sends all index actions to all child servers, and instructs each child server to determine which documents to index. With reference-based indexing, DIH performs this calculation itself, which reduces network traffic and the load on the child servers.

When enabled, reference-based indexing applies to the DREADD, DREADDDATA, and DREREPLACE index actions. You must configure the DREREFERENCE field by using standard field processing settings.

When you use reference-based indexing, you cannot alter the number of child servers.
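
This section does not state why the number of child servers is fixed. The following minimal sketch illustrates one likely consequence, assuming that the target server is chosen as a checksum of the reference modulo the number of child servers. The `route` function, the use of CRC32, and the sample references are illustrative assumptions, not the actual DIH algorithm. Under that assumption, adding a fourth child server to a three-server setup changes the target server for most existing references, so documents that are already indexed would no longer be where DIH expects them.

```python
import zlib


def route(reference, num_children):
    """Hypothetical routing rule: CRC32 of the reference, modulo the child count."""
    return zlib.crc32(reference.encode("utf-8")) % num_children


# Hypothetical document references, for illustration only.
references = ["http://example.com/doc/%d" % i for i in range(10000)]

# Count references whose target server changes when going from 3 to 4 children.
moved = sum(1 for ref in references if route(ref, 3) != route(ref, 4))
fraction_moved = moved / len(references)

print("Fraction of references routed to a different server:", fraction_moved)
```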

In a chained DIH setup, the DIHs might distribute documents unevenly if more than one level of the chain uses reference-based indexing. To prevent uneven distribution, the numbers of child servers at the different levels must be coprime (that is, they must have no common factor other than 1).

For example, if you have a parent DIH with two DIH child servers, each of which has four IDOL Server children, documents are not distributed evenly in distribute-by-reference mode. The parent server splits the data into two using a checksum hash of the document reference. The first child server uses the same algorithm to distribute data to its four child servers. Because it receives only the first half of the data, the algorithm selects only two of its four child servers, and the other two receive no data.

However, if you have a parent DIH with two child servers, each of which has three IDOL Server children, data is distributed evenly in distribute-by-reference mode. The parent server splits the data into two. The first child server then splits the data into three, so that all child servers receive data.
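
The two examples above can be reproduced with a small simulation. This is a sketch under stated assumptions: it models the "checksum hash of the document reference" as CRC32 modulo the number of child servers, which might differ from the real DIH algorithm. The names `route` and `leaf_counts` and the sample references are hypothetical.

```python
import zlib
from collections import Counter


def route(reference, num_children):
    """Hypothetical routing rule: CRC32 of the reference, modulo the child count."""
    return zlib.crc32(reference.encode("utf-8")) % num_children


def leaf_counts(references, top_children, leaf_children):
    """Count documents per (child DIH, IDOL Server) pair in a two-level chain.

    Both levels apply the same rule to the same reference, as in the text.
    """
    counts = Counter()
    for ref in references:
        mid = route(ref, top_children)    # which child DIH receives the document
        leaf = route(ref, leaf_children)  # which IDOL Server under that DIH
        counts[(mid, leaf)] += 1
    return counts


# Hypothetical document references, for illustration only.
references = ["http://example.com/doc/%d" % i for i in range(10000)]

two_by_four = leaf_counts(references, 2, 4)   # 2 and 4 share the factor 2
two_by_three = leaf_counts(references, 2, 3)  # 2 and 3 are coprime

print(len(two_by_four), "of 8 IDOL Servers receive data")
print(len(two_by_three), "of 6 IDOL Servers receive data")
```

In the 2-by-4 case, a reference with an even checksum always goes to the first child DIH, and an even number modulo 4 can only be 0 or 2, so two of that DIH's four servers stay empty. With 2 and 3, the two modulo results are independent of each other, so every server can receive data.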

NOTE:

Reference-based indexing might prevent deduplication of documents with different references. You can use reference-based indexing only with a KillDuplicates=REFERENCE or KillDuplicates=NONE setting in the [Server] section of the IDOL Server configuration file.

To enable reference-based indexing, set DistributeByReference to True in the [Server] section of the DIH configuration file.

For example:

[Server]
Port=16000
DIHPort=16001
MirrorMode=False
DistributeByReference=True

NOTE:

You can use DistributeByReference only when MirrorMode is set to False. DIH will not start if DistributeByReference and MirrorMode are both set to True.
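
As an illustration of this constraint, the following sketch checks a DIH configuration for the conflicting combination before you deploy it. It is not part of DIH. The helper name `check_dih_config` is hypothetical, and the check reports a problem only when both settings are explicitly set to True, because it makes no assumption about the default values.

```python
import configparser


def check_dih_config(text):
    """Return a list of problems found in the [Server] section of a DIH config."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    server = parser["Server"]

    # Absent keys are treated as False for this check only; this says nothing
    # about the real DIH defaults.
    by_reference = server.getboolean("DistributeByReference", fallback=False)
    mirror_mode = server.getboolean("MirrorMode", fallback=False)

    problems = []
    if by_reference and mirror_mode:
        problems.append(
            "DistributeByReference and MirrorMode are both True; DIH will not start."
        )
    return problems


sample = """
[Server]
Port=16000
DIHPort=16001
MirrorMode=False
DistributeByReference=True
"""

print(check_dih_config(sample))
```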

You must also configure standard field processing options to specify the reference field to use to distribute documents. For more information, refer to the IDOL Server Administration Guide.

