Locate Duplicate Documents
You can locate duplicate documents in the data index after indexing has taken place by using the DREDUPLICATE index action. This action locates duplicates in a specified subset of the content, and then deletes the duplicate documents, tags a field in them, or moves them to another database.
The action uses the following settings:
The port is the IDOL server index port (specified as IndexPort in the [Server] section of the IDOL server configuration file).
The match field is a ReferenceType field used as the initial determination of whether two documents are a match.
The duplicate action is the action to perform on a duplicate. The following options are available:
Delete. Deletes all duplicate documents.
Database. Moves all duplicate documents to a database. If you select the Database action, you must specify the database in the Database parameter.
Tag. Tags a specified field in the duplicate document. You must specify the field in the TagField index parameter. You can also specify a value to tag the field with by using the TagValue parameter. If you do not specify a TagValue, the field is tagged with the value 1.
Refer to the IDOL server Online Help for details on other parameters that are available for the DREDUPLICATE action.
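The rules for the three duplicate actions can be sketched as a small check that runs before an action is sent. This is a minimal illustration and not part of IDOL server: the function name and its arguments are hypothetical, and only the option names (Delete, Database, Tag) and the Database, TagField, and TagValue parameters come from the description above.

```python
def duplicate_action_params(action, database=None, tag_field=None, tag_value=None):
    """Return the extra parameters that the chosen duplicate action needs.

    Hypothetical helper for illustration only; it is not an IDOL API.
    """
    if action == "Delete":
        # Delete needs no further parameters.
        return {}
    if action == "Database":
        # Moving duplicates requires the target database.
        if not database:
            raise ValueError("the Database action requires the Database parameter")
        return {"Database": database}
    if action == "Tag":
        # Tagging requires the field to tag.
        if not tag_field:
            raise ValueError("the Tag action requires the TagField parameter")
        params = {"TagField": tag_field}
        # TagValue is optional; when it is omitted, the server tags with the value 1.
        if tag_value is not None:
            params["TagValue"] = tag_value
        return params
    raise ValueError("unknown duplicate action: " + str(action))
```

Leaving TagValue out of the returned parameters, rather than filling in 1 on the client side, keeps the default under the control of the server.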
For example, an action sent to port 20001 of the IDOL server that is located on the machine with the host name MyHost can identify duplicates using their DREREFERENCE and move them to the Duplicates database.
In a second example, the duplicates are again initially identified using the DREREFERENCE field, and then the DRETITLE field of each duplicate is changed to the value Duplicate.
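The two examples described above can be sketched as request strings. The request strings themselves are not reproduced in this section, so the following is an approximation under stated assumptions: that the action is sent as an HTTP request to the index port, and that the match field and the action are passed in parameters named MatchField and DuplicateAction. Check both against the IDOL server Online Help before use. The helper function is hypothetical; the host name, port, field names, and the Database, TagField, and TagValue parameters are taken from the text.

```python
from urllib.parse import urlencode


def build_dreduplicate_url(host, index_port, match_field, action, **extra):
    """Build a DREDUPLICATE request string (hypothetical helper, not an IDOL API)."""
    # ASSUMPTION: the parameter names MatchField and DuplicateAction are not
    # given in this section; confirm them in the IDOL server Online Help.
    params = {"MatchField": match_field, "DuplicateAction": action}
    # Extra parameters such as Database, TagField, and TagValue.
    params.update(extra)
    return "http://{}:{}/DREDUPLICATE?{}".format(host, index_port, urlencode(params))


# First example: move duplicates to the Duplicates database.
move_url = build_dreduplicate_url(
    "MyHost", 20001, "DREREFERENCE", "Database", Database="Duplicates"
)

# Second example: set DRETITLE to the value Duplicate in each duplicate.
tag_url = build_dreduplicate_url(
    "MyHost", 20001, "DREREFERENCE", "Tag", TagField="DRETITLE", TagValue="Duplicate"
)

print(move_url)
print(tag_url)
```

The sketch only builds the strings and does not send them, so it can be run without access to an IDOL server.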
To prevent IDOL server from indexing duplicate documents, use the KillDuplicates parameter with the DREADD and DREADDDATA actions.