What is a Topic?

A topic is a stored query expression written in the Verity Query Language (VQL). A topic models a concept of interest, which is used as the definition for a category. When a topic is evaluated against a set of documents, the Verity search engine identifies the subset of the documents that match the concept that the topic represents.

Consider the following scenario:

You can use the VQL expression GM <OR> Ford <OR> Chrysler to model the concept “North American car manufacturers.” When this expression is evaluated on a set of newspaper articles, the Verity search engine selects all articles that mention GM, Ford, or Chrysler as matching the concept “North American car manufacturers.”
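The scenario above can be imitated outside the Verity engine. The following is a minimal Python sketch, not VQL and not the Verity API. The helper names mentions and any_of and the sample articles are hypothetical, and whole-word, case-insensitive matching is an assumption made only for this illustration.

```python
import re

def mentions(term, text):
    """Return True if `term` occurs in `text` as a whole word (case-insensitive).
    This toy matching rule is an assumption, not Verity's actual behavior."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None

def any_of(*terms):
    """Build a topic that matches when at least one term is mentioned
    (a stand-in for joining the terms with <OR>)."""
    return lambda text: any(mentions(t, text) for t in terms)

# Hypothetical topic modeling "North American car manufacturers".
north_american = any_of("GM", "Ford", "Chrysler")

# Hypothetical document set.
articles = [
    "Ford reports record truck sales",
    "City council approves new budget",
    "Chrysler and GM announce recalls",
]

# Evaluating the topic selects the subset of documents that match the concept.
matching = [a for a in articles if north_american(a)]
print(matching)
```

In this sketch the topic is an ordinary function from a document to True or False, which is the Boolean view of evaluating a stored query against a document set.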


Topics can be combined using VQL operators to create more complex topic definitions. For example, you might combine the concept “North American car manufacturers” with the concept “European car manufacturers” (another VQL expression). By joining these two topics and applying <NOT> to the result, you could create a new topic definition that approximates the concept “Asian car manufacturers.” (This definition assumes that there are no South American or Australian car manufacturers.)
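As a rough illustration of combining topics, the following Python sketch treats each topic as a predicate and builds the “Asian car manufacturers” topic by negating the union of the other two. It is not VQL. The helpers term, either, and negate, the European manufacturer names, and the sample articles are all assumptions chosen for the example, and the result is meaningful only when every document in the set is about a car manufacturer.

```python
def term(word):
    """Topic that matches any text containing `word` (case-sensitive substring; toy matching)."""
    return lambda text: word in text

def either(*topics):
    """Topic that matches when any of the given topics matches (stand-in for <OR>)."""
    return lambda text: any(t(text) for t in topics)

def negate(topic):
    """Topic that matches exactly when the given topic does not (stand-in for <NOT>)."""
    return lambda text: not topic(text)

north_american = either(term("GM"), term("Ford"), term("Chrysler"))
# Illustrative choice of European manufacturers; not taken from the original text.
european = either(term("Volkswagen"), term("Renault"), term("Fiat"))

# "Asian car manufacturers" approximated as: neither North American nor European.
asian = negate(either(north_american, european))

# Hypothetical document set in which every article is about a car manufacturer.
articles = [
    "Ford expands its plant in Michigan",
    "Renault unveils a new electric model",
    "Toyota tops quarterly sales",
]

selected = [a for a in articles if asian(a)]
print(selected)
```

The negated topic also matches any document that mentions no manufacturer at all, which is why a definition of this kind depends on assumptions about the document set.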


In addition to Boolean operators such as <OR> and <NOT>, VQL provides more sophisticated non-Boolean operators.



Note   In a realistic topic, you would use more powerful VQL operators, such as the <ACCRUE> operator instead of <OR>. VQL also has a role in advanced searching; however, a discussion of VQL is beyond the scope of this book. For information about VQL, see the Verity Query Language and Topic Guide.
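To give a feel for why a non-Boolean operator can be more useful than <OR>, the following Python sketch ranks documents by how many of a topic's terms they contain, rather than returning only match or no match. This is a toy score written for this illustration. It is not the formula that <ACCRUE> uses, and the function name accrue_like_score and the sample articles are hypothetical.

```python
def accrue_like_score(terms, text):
    """Return the fraction of `terms` found in `text` (case-sensitive substring match).
    A toy stand-in for a non-Boolean operator; not Verity's actual scoring."""
    found = sum(1 for t in terms if t in text)
    return found / len(terms)

terms = ["GM", "Ford", "Chrysler"]

# Hypothetical document set.
articles = [
    "Ford posts quarterly results",
    "GM, Ford and Chrysler meet regulators",
    "Local weather update",
]

# Documents that mention more of the terms rank higher.
ranked = sorted(articles, key=lambda a: accrue_like_score(terms, a), reverse=True)
for a in ranked:
    print(round(accrue_like_score(terms, a), 2), a)
```

With a plain Boolean OR, the first two articles would be indistinguishable; a graded score lets the article that mentions all three manufacturers rank first.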


By using individual topics or by combining topics, you can create category definition rules, which determine whether a document belongs to a category. Techniques for constructing topics range from applying domain expertise by hand to using automated machine learning. Topics can be combined regardless of how they were created. One advantage of combining topics is that definitions can be built up gradually, so that basic topics are shared among multiple higher-level topics.
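The idea of sharing basic topics among higher-level topics can be sketched in Python as follows. This is an illustration only, not VQL and not how Verity stores category rules. The helpers term, either, both, and categorize, the category names, and the European manufacturer names are assumptions made for the example.

```python
def term(word):
    """Topic that matches any text containing `word` (case-sensitive substring; toy matching)."""
    return lambda text: word in text

def either(*topics):
    """Topic that matches when any of the given topics matches."""
    return lambda text: any(t(text) for t in topics)

def both(*topics):
    """Topic that matches only when all of the given topics match."""
    return lambda text: all(t(text) for t in topics)

# Basic topics, defined once.
north_american = either(term("GM"), term("Ford"), term("Chrysler"))
european = either(term("Volkswagen"), term("Renault"), term("Fiat"))

# Two higher-level topics that share the basic topic `north_american`.
car_makers = either(north_american, european)
na_recalls = both(north_american, term("recall"))

# Hypothetical category definition rules: category name -> topic.
rules = {
    "Car manufacturers": car_makers,
    "North American recalls": na_recalls,
}

def categorize(text, rules):
    """Return the sorted names of all categories whose rule matches `text`."""
    return sorted(name for name, rule in rules.items() if rule(text))

print(categorize("Chrysler issues a recall for minivans", rules))
print(categorize("Renault issues a recall in France", rules))
```

Because north_american is defined once and reused, a change to that basic topic (for example, adding another manufacturer) takes effect in every category rule built on it.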