Set TermAnalysis to True to return a summary of the counts of terms in different classes. The summary contains the following information:
Terms. The total number of terms.
Numeric. The number of purely numeric terms.
Alphanumeric. The number of alphanumeric terms (this count excludes purely numeric terms).
Multibyte. The number of terms that include at least one multibyte character.
Dococcs logn. The number of terms that have the associated number of document occurrences.
Length len. The number of terms of each length.
DistinctTermsPerDoc logn. The number of documents that contain the associated number of distinct terms.
TermsPerDoc logn. The number of documents that contain the associated number of terms.
The logn value is the base-2 logarithm of the count, rounded up to the next integer. For example:
Logn=0 means items that have 1 (2^{0}) of this property (for example, documents with only 1 distinct term).
Logn=1 means items that have 2 (2^{1}) of this property.
Logn=2 means items that have 3 to 4 (more than 2^{1}, up to 2^{2}) of this property.
Logn=3 means items that have 5 to 8 (more than 2^{2}, up to 2^{3}) of this property.
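
The bucketing in the examples above can be sketched in a few lines of Python. This is an illustration only, assuming the bucket is the base-2 logarithm of the count rounded up, which is what the listed examples imply. The function names logn_bucket and bucket_range are hypothetical and are not part of the product.

```python
def logn_bucket(n: int) -> int:
    """Return the Logn bucket for a positive count n (assumed: ceil(log2(n)))."""
    if n < 1:
        raise ValueError("count must be at least 1")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)) without
    # any floating-point rounding.
    return (n - 1).bit_length()


def bucket_range(logn: int) -> tuple:
    """Return the inclusive (low, high) counts that fall into a Logn bucket."""
    if logn < 0:
        raise ValueError("logn must be non-negative")
    if logn == 0:
        return (1, 1)
    # Bucket k covers counts greater than 2^(k-1), up to and including 2^k.
    return (2 ** (logn - 1) + 1, 2 ** logn)


if __name__ == "__main__":
    for count in (1, 2, 3, 4, 5, 8, 9):
        print(count, "-> Logn =", logn_bucket(count))
```

Under this assumption, a document with 6 distinct terms would be counted under DistinctTermsPerDoc with Logn=3, because 6 falls in the range 5 to 8.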
Actions:  TermGetAll 
Type:  Boolean 
Default:  False 
Example:  TermAnalysis=True
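
As a rough sketch of how the parameter might be combined with the TermGetAll action in a request, the following Python builds a request URL. The host, port, and the helper name build_term_analysis_url are assumptions for illustration; substitute the values for your own server.

```python
from urllib.parse import urlencode


def build_term_analysis_url(host: str, port: int) -> str:
    """Build a TermGetAll request URL with TermAnalysis enabled.

    host and port are placeholders (assumptions), not documented defaults.
    """
    # Parameters are appended after the action name as key=value pairs.
    params = urlencode({"TermAnalysis": "True"})
    return f"http://{host}:{port}/action=TermGetAll&{params}"


if __name__ == "__main__":
    # "localhost" and 9000 are hypothetical example values.
    print(build_term_analysis_url("localhost", 9000))
```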

