You can use reference-based indexing to distribute the indexing load evenly between IDOL Servers and achieve efficient data distribution. Data indexing depends on the reference of the documents. In simple non-mirror mode, DIH sends all actions to all servers, and instructs the child servers to determine which documents to index. With reference-based indexing, DIH performs these calculations, which reduces network traffic and load on the child servers.
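The routing decision described above can be sketched as follows. This is only an illustration: the `checksum` helper and its use of SHA-256 are assumptions standing in for DIH's internal hash, which is not specified here, and `child_for_reference` is a hypothetical name.

```python
import hashlib

def checksum(reference: str) -> int:
    # Stand-in hash (assumption): not the algorithm DIH actually uses.
    digest = hashlib.sha256(reference.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def child_for_reference(reference: str, num_children: int) -> int:
    """Return the index of the child server that should index this reference."""
    return checksum(reference) % num_children

# Route 1000 hypothetical document references across three child servers.
references = [f"http://example.com/doc{i}" for i in range(1000)]
counts = [0, 0, 0]
for ref in references:
    counts[child_for_reference(ref, 3)] += 1
print(counts)
```

Because the target depends only on the reference and the number of children, the distributor can send each document to a single child instead of broadcasting every action to all of them.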
When enabled, reference-based indexing applies to the DREADD and DREREPLACE index actions. You must configure the DREREFERENCE field by using standard field processing settings.
When you use reference-based indexing, you cannot alter the number of child servers.
In a chained DIH setup, the DIHs might distribute documents unevenly if more than one level of the chain uses reference-based indexing. To prevent uneven distribution, the numbers of child servers at the different levels must be coprime (that is, they share no common factor other than 1).
For example, if you have a parent DIH with two DIH child servers, each of which has four IDOL Server children, documents are not distributed evenly in distribute-by-reference mode. The parent server splits the data into two using a checksum hash of the document reference. The first child server uses the same algorithm to distribute data to its four child servers. Because it has only the first half of the data, only two child servers receive data.
However, if you have a parent DIH with two child servers, each of which has three IDOL Server children, data is distributed evenly in distribute-by-reference mode. The parent server splits the data into two. The first child server then splits the data into three, so that all child servers receive data.
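The two layouts above can be compared with a small simulation. It is a sketch under stated assumptions: every level routes with the same checksum modulo its own child count, and `checksum` (built on SHA-256) and `leaf_counts` are hypothetical stand-ins, not DIH's actual implementation.

```python
import hashlib
from collections import Counter

def checksum(reference: str) -> int:
    # Stand-in hash (assumption): not the algorithm DIH actually uses.
    digest = hashlib.sha256(reference.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def leaf_counts(references, top_children: int, leaf_children: int) -> Counter:
    """Count documents per (child DIH, IDOL Server) pair when both levels
    route with the same checksum modulo their own number of children."""
    counts = Counter()
    for ref in references:
        h = checksum(ref)
        counts[(h % top_children, h % leaf_children)] += 1
    return counts

refs = [f"doc-{i}" for i in range(6000)]
uneven = leaf_counts(refs, 2, 4)  # 2 and 4 share the factor 2
even = leaf_counts(refs, 2, 3)    # 2 and 3 are coprime

print(len(uneven), "of 8 IDOL Servers receive data in the 2 x 4 layout")
print(len(even), "of 6 IDOL Servers receive data in the 2 x 3 layout")
```

In the 2 x 4 layout, a checksum that is even can only be 0 or 2 modulo 4, so the first child DIH can reach only two of its four servers. In the 2 x 3 layout, knowing the checksum modulo 2 says nothing about its value modulo 3, so every server can receive documents.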
Reference-based indexing might prevent deduplication of documents with different references. You can use reference-based indexing only with a KillDuplicates=NONE setting in the [Server] section of the IDOL Server configuration file.
To enable reference-based indexing, set DistributeByReference to True in the [Server] section of the DIH configuration file.
[Server]
Port=16000
DIHPort=16001
MirrorMode=False
DistributeByReference=True
You can use DistributeByReference only when MirrorMode is set to False. DIH does not start if DistributeByReference and MirrorMode are both set to True.
You must also configure standard field processing options to specify the reference field to use to distribute documents. For more information, refer to the IDOL Server Administration Guide.
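The constraint between MirrorMode and DistributeByReference can be checked before you deploy a configuration file. The following is a minimal sketch, assuming the file is simple enough for Python's `configparser` to read (real DIH configuration files might use syntax that it rejects). `has_mode_conflict` is a hypothetical helper, and a missing key is treated as not set, which is an assumption rather than DIH's documented default.

```python
import configparser

def has_mode_conflict(config_text: str) -> bool:
    """Return True if MirrorMode and DistributeByReference are both True."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    server = parser["Server"]
    # Assumption: a key that is absent is treated as False in this check.
    mirror = server.getboolean("MirrorMode", fallback=False)
    by_reference = server.getboolean("DistributeByReference", fallback=False)
    return mirror and by_reference

valid = """
[Server]
Port=16000
DIHPort=16001
MirrorMode=False
DistributeByReference=True
"""

invalid = valid.replace("MirrorMode=False", "MirrorMode=True")

print(has_mode_conflict(valid))
print(has_mode_conflict(invalid))
```

A check like this only catches the one conflict described here. It does not verify the reference field processing or the KillDuplicates setting on the IDOL Servers.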