KVXConfigInfo

This structure defines an XML document type and the element extraction settings for that type. You can apply the settings based on either the file format ID or the root element of the file. This structure is defined in kvtypes.h.

typedef struct TAG_KVXConfigInfo
{
    ENdocFmt    eKVFormat;
    char*       pszRoot;
    char*       pszInMeta;
    char*       pszExMeta;
    char*       pszInContent;
    char*       pszExContent;
    char*       pszInAttribute;
}
KVXConfigInfo;

Member Descriptions

eKVFormat

The format ID as detected by the KeyView detection module. This determines the file type to which these extraction settings apply. The format ID is defined by the enumerated type ENdocFmt. See File Format Detection for more information on format ID values.

If you are adding configuration settings for a custom XML document type, this member is not defined.

pszRoot

The root element of the file. If the format ID is not defined, the root element is used to determine the file type to which these settings apply.

To further qualify the element, specify its namespace. See Specify an Element’s Namespace and Attribute.

pszInMeta

The elements extracted from the file as metadata. All other elements are extracted as text. Separate multiple entries with commas.

To further qualify the element, specify its namespace, its attributes, or both. See Specify an Element’s Namespace and Attribute.

pszExMeta

The child elements in the included metadata elements that are not extracted from the file as metadata. For example, the default extraction settings for the Visio XML format extract the DocumentProperties element as metadata. This element includes child elements such as Title, Subject, Author, and Description. However, the child element PreviewPicture is defined in pszExMeta because it is binary data and should not be extracted.

You cannot exclude any metadata elements from the output for StarOffice files. All metadata is extracted regardless of this setting.

To further qualify the element, specify its namespace, its attributes, or both. See Specify an Element’s Namespace and Attribute.
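
The Visio example above can be sketched in C as follows. This is a minimal illustration, not the actual default configuration. The ENdocFmt typedef and the structure are local stand-ins for the declarations in kvtypes.h, the function name make_visio_style_config is hypothetical, the format ID is supplied by the caller rather than named here, and setting the unused members to NULL is an assumption.

```c
#include <stddef.h>

/* Local stand-ins so this sketch compiles on its own.
   Real code would include kvtypes.h instead. */
typedef int ENdocFmt;

typedef struct TAG_KVXConfigInfo
{
    ENdocFmt    eKVFormat;
    char*       pszRoot;
    char*       pszInMeta;
    char*       pszExMeta;
    char*       pszInContent;
    char*       pszExContent;
    char*       pszInAttribute;
}
KVXConfigInfo;

/* Writable buffers, because the members are char* rather than const char*. */
static char s_inMeta[] = "DocumentProperties";
static char s_exMeta[] = "PreviewPicture";

/* Hypothetical helper: include DocumentProperties as metadata,
   but exclude its PreviewPicture child element. */
KVXConfigInfo make_visio_style_config(ENdocFmt formatId)
{
    KVXConfigInfo cfg;

    cfg.eKVFormat      = formatId;  /* settings apply to this format ID */
    cfg.pszRoot        = NULL;      /* assumption: unused when a format ID is given */
    cfg.pszInMeta      = s_inMeta;
    cfg.pszExMeta      = s_exMeta;
    cfg.pszInContent   = NULL;      /* assumption: no content settings in this sketch */
    cfg.pszExContent   = NULL;
    cfg.pszInAttribute = NULL;      /* no attribute values extracted */
    return cfg;
}
```

The strings are kept in static arrays so that the pointers stay valid after the function returns and the structure never points at read-only string literals.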

pszInContent

The elements extracted from the file as content text. An asterisk (*) extracts all elements including child elements.

To further qualify the element, specify its namespace, its attributes, or both. See Specify an Element’s Namespace and Attribute.

pszExContent

The child elements in the included content elements that are not extracted from the file as content text.

To further qualify the element, specify its namespace, its attributes, or both. See Specify an Element’s Namespace and Attribute.

pszInAttribute

The attribute values extracted from the file. If attributes are not defined, attribute values are not extracted. You must define the namespace (if used), element name, and attribute name in the following format:

namespace:elementname@attributename

For example:

hpe:division@name
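
A string in this format can be assembled with a small helper. The following is a minimal sketch. The function format_attribute_spec is hypothetical and not part of the KeyView API, and the form used when no namespace is given (elementname@attributename) is an assumption inferred from the "if used" wording above.

```c
#include <stdio.h>

/* Hypothetical helper: writes "namespace:elementname@attributename" into out.
   If ns is NULL or empty, writes "elementname@attributename" (assumed form).
   Returns 0 on success, -1 on bad arguments or if the buffer is too small. */
int format_attribute_spec(char *out, size_t outSize,
                          const char *ns,
                          const char *element,
                          const char *attribute)
{
    int written;

    if (out == NULL || outSize == 0 || element == NULL || attribute == NULL)
        return -1;

    if (ns != NULL && ns[0] != '\0')
        written = snprintf(out, outSize, "%s:%s@%s", ns, element, attribute);
    else
        written = snprintf(out, outSize, "%s@%s", element, attribute);

    /* snprintf returns the length it wanted to write; reject truncation. */
    return (written >= 0 && (size_t)written < outSize) ? 0 : -1;
}
```

For example, calling the helper with "hpe", "division", and "name" fills the buffer with hpe:division@name, which could then be assigned to pszInAttribute.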

