Prevent Duplicate Documents

You can configure the IDOL Content component to deduplicate documents at index time. Deduplication prevents Content from storing the same document, or the same document content, more than once. If Content determines that the document to index matches an existing document, it replaces the existing document with the new document.
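The replace-on-match behavior can be illustrated with a small in-memory model. This is a minimal sketch, not the Content implementation or any IDOL API: the names `Document` and `ToyIndex` are hypothetical, and matching on the document reference alone is an assumption made for illustration. In Content, the matching rule depends on the deduplication options that you configure.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for an indexed document."""
    reference: str
    content: str


class ToyIndex:
    """Toy index that keeps one document per reference (assumed match rule)."""

    def __init__(self):
        self._docs = {}  # maps reference -> Document

    def add(self, doc):
        """Store doc; return True if it replaced a matching existing document."""
        replaced = doc.reference in self._docs
        self._docs[doc.reference] = doc  # the new document replaces the old one
        return replaced

    def get(self, reference):
        return self._docs[reference]

    def __len__(self):
        return len(self._docs)


index = ToyIndex()
index.add(Document("http://example.com/a", "first version"))
index.add(Document("http://example.com/a", "second version"))  # matches, so it replaces
```

After both calls, the toy index holds a single document for that reference, and its content is the most recently indexed version.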

The IDOL Content component uses deduplication options to determine whether documents match. See Deduplication Options—KillDuplicates.

You can enable deduplication in one of three ways:

Some other IDOL Content component parameters affect the behavior of the deduplication settings. See Deduplication Constraints.

You can also deduplicate documents that are already indexed by using the DREDUPLICATE index action. See Locate Duplicate Documents.