DREDUPLICATE

Removes, moves, or tags duplicate documents after indexing.

This index action runs on a specified subset of the content and locates duplicates using a variety of methods. Duplicates can then be deleted, moved to a different database, or tagged in a specified field, depending on the value of the DuplicateAction parameter.

Example

http://12.3.4.56:20001/DREDUPLICATE?DuplicateAction=Delete&ReferenceField=*/DREREFERENCE

In this example, duplicates are identified using the DREREFERENCE field, and any duplicates found are deleted.
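
The same request can be assembled programmatically. The following is a minimal Python sketch that only builds the URL shown above; the helper name build_dreduplicate_url is hypothetical and not part of the product, and sending the request is left to the caller.

```python
from urllib.parse import urlencode


def build_dreduplicate_url(host, port, params):
    """Build a DREDUPLICATE index action URL (hypothetical helper)."""
    # safe="*/" leaves the field path "*/DREREFERENCE" unescaped,
    # so the result matches the form used in the example above.
    query = urlencode(params, safe="*/")
    return f"http://{host}:{port}/DREDUPLICATE?{query}"


example_url = build_dreduplicate_url(
    "12.3.4.56",
    20001,
    {"DuplicateAction": "Delete", "ReferenceField": "*/DREREFERENCE"},
)
print(example_url)
```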

Parameters

Parameter        Description                                                                                    Required
ChecksumField    A reference field used to determine whether a match is exact.
Database         The database to move duplicates to.                                                            See Comments
DatabaseMatch    A list of databases to search for duplicates in.
DuplicateAction  The action to perform on duplicates.                                                           Yes
MaxID            The last DocID to find duplicates of.
MinID            The first DocID to find duplicates of.
ReferenceField   A reference field to use as the initial determination of whether two documents are a match.    Yes
TagField         The field to tag duplicates with.                                                              See Comments
TagValue         The static value to tag duplicates with in the TagField.
ThreadHashField  The field containing the thread hash values used to determine whether a match is a duplicate.
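
Because DuplicateAction and ReferenceField are marked as required, a client can check a parameter set before sending the action. The following Python sketch illustrates one way to do this; the function name check_dreduplicate_params is hypothetical, and the exact, case-sensitive name comparison is a simplifying assumption of this sketch, not documented server behavior.

```python
# Parameter names as spelled in this reference.
REQUIRED_PARAMS = ("DuplicateAction", "ReferenceField")
OPTIONAL_PARAMS = (
    "ChecksumField", "Database", "DatabaseMatch", "MaxID", "MinID",
    "TagField", "TagValue", "ThreadHashField",
)
# Standard index action parameters that this action also accepts.
STANDARD_PARAMS = ("IgnoreMaxPendingItems", "IndexUID", "NoArchive", "Priority")


def check_dreduplicate_params(params):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Required parameters must be present and non-empty.
    for name in REQUIRED_PARAMS:
        if not params.get(name):
            problems.append(f"missing required parameter: {name}")
    # Flag names that do not appear in this reference (exact match only).
    known = set(REQUIRED_PARAMS) | set(OPTIONAL_PARAMS) | set(STANDARD_PARAMS)
    for name in params:
        if name not in known:
            problems.append(f"unrecognized parameter: {name}")
    return problems


print(check_dreduplicate_params({"DuplicateAction": "Delete"}))
```

The sketch does not check the conditional requirements for Database and TagField, because those conditions are not stated in the table.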

This index action accepts the following standard index action parameters.

Parameter              Description
IgnoreMaxPendingItems  Whether to ignore the IndexQueueMaxPendingItems limit for this index action.
IndexUID               An identification code for any document tracking events.
NoArchive              Turn off configured archiving for the index action.
Priority               The priority for the index job.

Comments

