Streaming a Document

Your gateway driver sets up a stream and then uses that stream to return the contents of a repository document. You must provide tokens to the Verity engine to set up the document, to deliver its contents, and to inform the Verity engine that the streaming operation is complete. For information on the use of tokens, see Creating Virtual Documents. Typically, the mapping between the document content and tokens is defined in the gateway configuration file. For more information, see Gateway Configuration File.

The following sections describe additional aspects related to streaming a document:

Bypassing Auto-Recognition
Differentiating Streams

For additional tips about implementing streams, see Performance Improvement Techniques.

Bypassing Auto-Recognition

The Verity engine provides an auto-recognition feature that attempts to determine the format of the data in a repository and the character set in which the data is encoded. Your gateway driver can send this information to the Verity engine directly, which saves processing time. Your gateway driver's VgwStreamGetTokenFnc function (see Stream Interface) is responsible for returning a repository document's contents as tokens, and it releases one token each time it is called. To bypass the Verity engine's auto-recognition feature, the function should return the following two tokens, in order, the first two times the Verity engine calls it:

1. Content-type token (VdkTokenType_ContentType), which specifies one of the following kinds of content:

 


Type of Content          Description
VdkTokenCT_Text          Textual content
VdkTokenCT_Application   Application-specific content


If your content is textual, you must specify the kind of text, as follows:

 


Type of Text             Description
VdkTokenCTText_Plain     Plain text
VdkTokenCTText_HTML      HTML
VdkTokenCTText_SGML      SGML
VdkTokenCTText_XML       XML
VdkTokenCTText_Richtext  Rich text
VdkTokenCTText_TabSep    Tab-separated text
VdkTokenCTText_Empty     X-empty


2. Field token (VdkTokenType_Field) for the internal Charset field, which identifies the character set of the stream; for example, cp1252, 1252, windows-1252, or 8859 for English. For a list of supported character sets, see Charset Name Mapping Table.
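
The ordering described above can be illustrated with a minimal sketch. It does not use the real gateway headers: every Demo-prefixed type, enumerator, and function below is a hypothetical stand-in that only mirrors the names used in this section, and the actual VgwStreamGetTokenFnc signature, token structure, and end-of-stream convention may differ. The sketch assumes an HTML document encoded in windows-1252 and releases one token per call: the content-type token first, the Charset field token second, and the document data after that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real VDK headers define their own types and values. */
typedef enum {
    DemoTokenType_ContentType,  /* mirrors VdkTokenType_ContentType */
    DemoTokenType_Field,        /* mirrors VdkTokenType_Field */
    DemoTokenType_Buffer,       /* mirrors VdkTokenType_Buffer */
    DemoTokenType_End           /* assumed end-of-stream marker */
} DemoTokenType;

typedef enum { DemoCT_Text, DemoCT_Application } DemoContentType;
typedef enum { DemoCTText_Plain, DemoCTText_HTML } DemoTextType;

typedef struct {
    DemoTokenType   type;
    DemoContentType content;  /* meaningful for content-type tokens */
    DemoTextType    text;     /* meaningful when content is DemoCT_Text */
    const char     *name;     /* field name for field tokens */
    const char     *data;     /* field value or buffer contents */
} DemoToken;

typedef struct {
    int         calls;        /* number of tokens already released */
    const char *body;         /* document text held by the driver */
} DemoStream;

/* Releases one token per call: content type, Charset field, body, then end. */
int demo_get_token(DemoStream *stream, DemoToken *out)
{
    const DemoToken blank = { DemoTokenType_End, DemoCT_Text,
                              DemoCTText_Plain, NULL, NULL };

    assert(stream != NULL && out != NULL);
    *out = blank;

    switch (stream->calls) {
    case 0:  /* first call: tell the engine the format up front */
        out->type = DemoTokenType_ContentType;
        out->content = DemoCT_Text;
        out->text = DemoCTText_HTML;
        break;
    case 1:  /* second call: tell the engine the character set */
        out->type = DemoTokenType_Field;
        out->name = "Charset";
        out->data = "windows-1252";
        break;
    case 2:  /* third call: the document data itself */
        out->type = DemoTokenType_Buffer;
        out->data = stream->body;
        break;
    default: /* nothing left; the blank token already marks the end */
        break;
    }

    if (stream->calls < 3) {
        stream->calls++;
    }
    return 0;
}
```

A caller would initialize a DemoStream with calls set to 0 and invoke demo_get_token repeatedly until it receives the end marker. Keeping the call count in the stream state is one simple way to guarantee that the two format tokens always precede any content.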

Differentiating Streams

Your VgwStreamGetTokenFnc function need not provide the same token stream every time it streams a given document. You can decide which tokens to send based on the stream's context. For example, if the document is only being viewed, you may be able to use buffer tokens (VdkTokenType_Buffer) to send the data and omit zone tokens and other indexing-related tokens. To determine how the stream is being used, check the status of the VgwAccessMode_Index flag in the flags member of the VgwStreamNewArgRec structure, which is available to your gateway driver's VgwStreamNewFnc function.
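
The flag check described above might be sketched as follows. This is an illustration only: the Demo-prefixed names are hypothetical, the bit value chosen for the index flag is an assumption, and the real VgwStreamNewArgRec structure and VgwStreamNewFnc signature are defined by the gateway headers. The idea is that the stream-creation function records how the stream will be used, and the token function later consults that record to decide whether to emit indexing-related tokens.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit value; the real VgwAccessMode_Index comes from the gateway headers. */
#define DEMO_ACCESS_MODE_INDEX 0x1u

/* Hypothetical stand-in for the flags member of VgwStreamNewArgRec. */
typedef struct {
    unsigned int flags;
} DemoStreamNewArgs;

typedef enum {
    DemoPlan_ViewOnly,  /* buffer tokens only */
    DemoPlan_Index      /* buffer tokens plus zone and other indexing tokens */
} DemoTokenPlan;

/* Per-stream state that the driver keeps between calls. */
typedef struct {
    DemoTokenPlan plan;
} DemoModeStream;

/* Mirrors what a stream-creation function might do: remember the access mode. */
void demo_stream_new(const DemoStreamNewArgs *args, DemoModeStream *stream)
{
    assert(args != NULL && stream != NULL);
    stream->plan = (args->flags & DEMO_ACCESS_MODE_INDEX)
                       ? DemoPlan_Index
                       : DemoPlan_ViewOnly;
}

/* Returns nonzero when the token function should emit indexing-related tokens. */
int demo_wants_zone_tokens(const DemoModeStream *stream)
{
    assert(stream != NULL);
    return stream->plan == DemoPlan_Index;
}
```

Testing the flag with a bitwise AND, rather than comparing the whole flags value, keeps the check correct when other flags are set alongside the index flag.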