Generate Snapshots

To cluster information, take a snapshot of data in your IDOL data index. You can then automatically cluster the data in this snapshot (you do not need to set up an initial taxonomy).

The IDOL Category component takes a snapshot of the data in your IDOL Content component and, based on this snapshot, clusters related information together. Each cluster represents a concept area that contains a set of items that share common properties.

The ClusterSnapshot action takes a snapshot of the data stored in the IDOL Content component data index. By default, this includes data for the Content databases News and Archive. A snapshot represents the content of the data index at a particular time, and enables you to generate cluster information and spectrographs at a later point, even if the data index has changed. To save processing time, you can use a single snapshot to generate both cluster information and spectrograph data.

The action adds a timestamp to each snapshot (in AUTNDATE format) and stores the snapshot in binary .cls format in the Snapshots subdirectory of the Cluster directory in your IDOL Category component installation directory. Because each snapshot is timestamped, you can have several snapshots with the same name (for example, of one particular IDOL Content component) as well as snapshots with different names (for example, of different data sets).

The results of ClusterSnapshot are saved as a named snapshot job. You can specify that job name when taking other actions on the snapshot data. You can also set up a schedule that runs the ClusterSnapshot action at regular intervals.
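
As an illustration of how a named snapshot job might be requested, the following Python sketch builds an action URL for ClusterSnapshot. Only the action name comes from this section. The host, the port 9020, the job name MySnapshot, and the parameter names TargetJobName and DatabaseMatch are assumptions made for the example; check the IDOL Category component reference for the parameters that your version supports. The sketch only constructs the URL and does not send it.

```python
from urllib.parse import urlencode


def build_cluster_snapshot_url(host, port, job_name, databases=None):
    """Build a ClusterSnapshot action URL for an IDOL Category component.

    TargetJobName and DatabaseMatch are assumed parameter names; verify
    them against the reference for your IDOL version before use.
    """
    params = {
        "action": "ClusterSnapshot",
        "TargetJobName": job_name,  # assumed: name of the snapshot job
    }
    if databases:
        # assumed: restricts the snapshot to the listed Content databases
        params["DatabaseMatch"] = ",".join(databases)
    # Keep commas unescaped so the database list stays readable.
    return f"http://{host}:{port}/{urlencode(params, safe=',')}"


# Hypothetical host, port, and job name, for illustration only.
url = build_cluster_snapshot_url(
    "localhost", 9020, "MySnapshot", ["News", "Archive"]
)
print(url)
```

You could then reuse the same job name (MySnapshot in this sketch) when you run other actions against the snapshot data.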


The Content data index that you take a snapshot of should ideally contain at least several thousand documents with good quality content (that is, relevant text for various topics).