The following is an example of how the fpFillBuffer() function in foliosr could be developed. The example demonstrates how the code changes as limitations of the implementation are identified. With each implementation, code revisions are shown in bold.

    /*****************************************************************
     * Function: fffFillBuffer()
     * Summary:  Read fff input from stream and parse into kvtoken.h codes
     *****************************************************************/
    int pascal _export fffFillBuffer(
        void *pCFContext,
        BYTE *pcBuf,
        UINT *pnBufOut,
        int  *pnPercentDone,
        UINT  cbBufOutMax )
    {
        BOOL bRetVal;
        TPfffGlobals *pContext = (TPfffGlobals *)pCFContext;

        pContext->pcBufOut = pcBuf;
        fffReadSourceFile(pContext);
        bRetVal = fffProcessBuffer(pContext, pcBuf);
        *pnPercentDone = (int)(pContext->unTotalBytesProcessed * (UINT)100 / pContext->unFileSize);
        *pnBufOut = (UINT)(pContext->pcBufOut - pcBuf);
        return (bRetVal ? KVERR_Success : KVERR_General);
    }

The parameters in fffFillBuffer() are as follows:

Parameter | In/Out | Description |
---|---|---|
pCFContext | In | A pointer to the context structure of the custom reader. |
pcBuf | In/Out | A pointer to the token output buffer. |
pnBufOut | Out | A pointer to the number of bytes written to the output buffer. |
pnPercentDone | Out | A pointer to the percentage complete. |
cbBufOutMax | In | The maximum number of bytes that the token output buffer can hold. |

- pContext is set to the address of the pCFContext void pointer, cast to a pointer to the global context structure for the reader. This provides access to all members of this structure.
- Using the pContext variable, a call is made to read the source file.
- The input buffer is then processed by a call to fffProcessBuffer(). The second parameter in the call is a pointer to the token output buffer. If this call fails, usually because of memory allocation errors, it returns FALSE.
- The percentage complete and the number of bytes written to the token output buffer are calculated. The byte count is based on the value of pContext->pcBufOut, which is increased each time a token is written to the buffer.
- Repeated calls to fffFillBuffer() are made by the structured access layer until the percentage complete is 100.

The token output buffer can hold at most cbBufOutMax bytes. If fffProcessBuffer() generates a token stream larger than this, there is a memory overflow. If fffProcessBuffer() generates a small token stream and the entire file has not been read, the output token buffer is underutilized.

Implementation 2 addresses the problem of processing a token stream that is larger than the output buffer size limit.

    /*****************************************************************
     * Function: fffFillBuffer()
     * Summary:  Read fff input from stream and parse into kvtoken.h codes
     *****************************************************************/
    int pascal _export fffFillBuffer(
        void *pCFContext,
        BYTE *pcBuf,
        UINT *pnBufOut,
        int  *pnPercentDone,
        UINT  cbBufOutMax )
    {
        BOOL bRetVal = TRUE;
        TPfffGlobals *pContext = (TPfffGlobals *)pCFContext;

        pContext->pcBufOut = pcBuf;
        pContext->cbBufOutMax = 9 * cbBufOutMax / 10;

        /* Process the portion of the fff file that is in the input buffer but do
         * not return from the fffFillBuffer() function unless the output buffer is
         * at least 90% full. If any of the memory allocations fail during the
         * execution of fffProcessBuffer(), bRetVal will be set to FALSE, resulting
         * in this conversion failing "gracefully".
         */
        do
        {
            if( pContext->bBufOutFull )
            {
                pContext->bBufOutFull = FALSE;
            }
            else
            {
                fffReadSourceFile(pContext);
            }
            bRetVal = fffProcessBuffer(pContext, pcBuf);
            *pnPercentDone = (int)(pContext->unTotalBytesProcessed * (UINT)100 / pContext->unFileSize);
        }while( bRetVal && !pContext->bBufOutFull && *pnPercentDone < 100 );

        *pnBufOut = (UINT)(pContext->pcBufOut - pcBuf);
        return (bRetVal ? KVERR_Success : KVERR_General);
    }

- Ninety percent of cbBufOutMax is used to set pContext->cbBufOutMax. This is used in fffProcessBuffer() to monitor how full the token output buffer becomes as the source file is processed.
- The source file is read, fffProcessBuffer() returns, and the percentage complete is calculated.
- If the token output buffer has not been filled beyond pContext->cbBufOutMax, pContext->bBufOutFull remains set to FALSE, and if the percentage complete is less than 100, the do-while loop is re-entered without returning from this function to the structured access layer. There is another call to fffReadSourceFile(), followed by fffProcessBuffer().
- If the token output buffer has been filled beyond pContext->cbBufOutMax, pContext->bBufOutFull is set to TRUE. In this case, the do-while loop ends, the number of bytes written to the token output buffer is calculated, and control returns to the structured access layer.
- The structured access layer continues to call fffFillBuffer() until the entire source file is processed. On each call to fffFillBuffer(), another empty token output buffer is provided for the custom reader to use.
- If the previous call to fffFillBuffer() exited because the previous token output buffer exceeded allowable capacity, pContext->bBufOutFull is reset to FALSE and no call is made to read the next buffer from the input source file.

A boundary condition can result from many situations arising from input file processing. For example, the input buffer might end with an incomplete command. In Folio flat files, this could be an incomplete element. In other word processing documents, a boundary condition might result from an incomplete control sequence, a split double-byte character, or a partial UTF-7 or UTF-8 sequence. These can be handled jointly by fffProcessBuffer(), which must detect the boundary condition, and fffReadSourceFile(), which must preserve the unprocessed bytes for the next pass.
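
As an illustration of the detection half of this work, the following is a minimal sketch of how a reader might recognize a partial UTF-8 sequence at the end of an input buffer. The helper name utf8_incomplete_tail() is hypothetical and is not part of foliosr or the reader API. It only inspects the last four bytes and does not validate the rest of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the reader API.
 * Returns the number of bytes at the end of buf that belong to an
 * incomplete UTF-8 sequence (0 if the buffer ends on a complete character). */
size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t back;   /* distance from the end of the buffer, starting at 1 */

    assert(buf != NULL || len == 0);

    /* A UTF-8 sequence is at most 4 bytes long, so the lead byte of a
     * split sequence must be among the last 4 bytes of the buffer. */
    for (back = 1; back <= 4 && back <= len; back++) {
        unsigned char c = buf[len - back];
        size_t need;

        if ((c & 0xC0) == 0x80)
            continue;                  /* continuation byte: keep looking back */

        if (c < 0x80)
            need = 1;                  /* single-byte (ASCII) character */
        else if ((c & 0xE0) == 0xC0)
            need = 2;
        else if ((c & 0xF0) == 0xE0)
            need = 3;
        else if ((c & 0xF8) == 0xF0)
            need = 4;
        else
            return 0;                  /* invalid lead byte: leave it to the parser */

        /* The sequence is incomplete if it needs more bytes than are present. */
        return (need > back) ? back : 0;
    }
    return 0;   /* no lead byte in the last 4 bytes: nothing to carry over */
}
```

A function like fffProcessBuffer() could store a nonzero result in pContext->nResidualBytes and stop processing just before those bytes, so that the next read completes the character.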
The following example shows partial code used in fffReadSourceFile():

    /****************************************************************
     *
     * Function: fffReadSourceFile()
     *
     ***************************************************************/
    int pascal fffReadSourceFile(TPfffGlobals *pContext)
    {
        int nBytes;

        /* Transfer remaining data to beginning of buffer prior to next read.
         * memmove() is used because the source and destination lie in the
         * same buffer and can overlap. */
        if( pContext->nResidualBytes )
        {
            memmove(pContext->cInputBuf, pContext->pcBufIn, pContext->nResidualBytes);
        }

        /* Read from file, without over-writing any text from the previous buffer */
        nBytes = (*pContext->pIO->kwReadFunc)(pContext->pIO,
                     pContext->cInputBuf + pContext->nResidualBytes,
                     BUFFERSIZE - pContext->nResidualBytes);

        /* Update input buffer control parameters */
        pContext->unTotalBytesRead += (UINT)nBytes;
        pContext->pcBufIn = pContext->cInputBuf;
        pContext->pcBufInMax = pContext->pcBufIn + pContext->nResidualBytes + nBytes;
        pContext->nResidualBytes = 0;
        return nBytes;
    }

If fffProcessBuffer() is unable to process the entire input source file buffer, it sets the value for pContext->nResidualBytes. When the next call to fffReadSourceFile() is made, any residual bytes are copied to the beginning of the input source file buffer, and the number of bytes to be read is reduced to make sure that this buffer does not overflow.
A good way to test the code for boundary conditions is to vary the size of BUFFERSIZE and make sure that the results remain consistent.
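
The residual-byte technique and the buffer-size test can be tried in isolation. The following sketch is a simplified stand-in, not the foliosr code: it reads from an in-memory string instead of a file, treats ';' as a hypothetical record terminator, and takes the buffer size as a parameter in place of BUFFERSIZE. The names parse_with_buffer() and MAX_BUF are assumptions made for this example. The function folds the length of every complete record into a digest, so a record that is split incorrectly at a buffer boundary changes the result.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUF 64   /* hypothetical upper bound on the test buffer size */

/* Parses ';'-terminated records from src using an input buffer of bufSize
 * bytes. Returns a digest of the record lengths, or ULONG_MAX if a record
 * does not fit in the buffer. Unterminated trailing data is ignored. */
unsigned long parse_with_buffer(const char *src, size_t srcLen, size_t bufSize)
{
    char buf[MAX_BUF];
    size_t srcPos = 0;          /* read position in the simulated file */
    size_t residual = 0;        /* unprocessed bytes left in buf */
    size_t residualStart = 0;   /* offset of those bytes within buf */
    unsigned long digest = 0;

    assert(src != NULL && bufSize >= 1 && bufSize <= MAX_BUF);

    for (;;) {
        size_t want, got, end, i, recStart = 0;

        if (residual == bufSize)
            return ULONG_MAX;   /* no room to read: record exceeds the buffer */

        /* Move residual bytes to the front; the regions can overlap. */
        if (residual)
            memmove(buf, buf + residualStart, residual);

        /* Read the next chunk without overwriting the residual bytes. */
        want = bufSize - residual;
        got = (srcLen - srcPos < want) ? (srcLen - srcPos) : want;
        memcpy(buf + residual, src + srcPos, got);
        srcPos += got;
        end = residual + got;

        /* Process every complete record currently in the buffer. */
        for (i = 0; i < end; i++) {
            if (buf[i] == ';') {
                digest = digest * 31 + (unsigned long)(i - recStart);
                recStart = i + 1;
            }
        }

        /* Whatever follows the last terminator is carried to the next pass. */
        residualStart = recStart;
        residual = end - recStart;

        if (got == 0)
            break;              /* end of input */
    }
    return digest;
}
```

Calling the function on the same input with several buffer sizes, from the length of the longest record up to the length of the whole input, should return the same digest each time.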
With fffReadSourceFile(), the source file can also be read by calls that retrieve header or footer information. If this occurs, the value of pContext->unTotalBytesRead is incorrect.
Implementation 3 addresses the problem of boundary conditions and interrupting calls from the structured access layer.

    /****************************************************************************
     * Function: fffFillBuffer()
     * Summary:  Read fff input from stream and parse into kvtoken.h codes
     ****************************************************************************/
    int pascal _export fffFillBuffer(
        void *pCFContext,
        BYTE *pcBuf,
        UINT *pnBufOut,
        int  *pnPercentDone,
        UINT  cbBufOutMax )
    {
        double dTotalBytesProcessed, dFileSize;
        BOOL bRetVal = TRUE;
        TPfffGlobals *pContext = (TPfffGlobals *)pCFContext;

        pContext->pcBufOut = pcBuf;
        pContext->cbBufOutMax = 9 * cbBufOutMax / 10;

        /* Process the portion of the fff file that is in the input buffer but do
         * not return from the fffFillBuffer() function unless the output buffer is
         * at least 90% full. If any of the memory allocations fail during the
         * execution of fffProcessBuffer(), bRetVal will be set to FALSE, resulting
         * in this conversion failing "gracefully".
         */
        do
        {
            if( pContext->bBufOutFull )
            {
                pContext->bBufOutFull = FALSE;
            }
            else
            {
                fffReadSourceFile(pContext);
            }
            bRetVal = fffProcessBuffer(pContext, pcBuf);

            if( pContext->bHeaderCompleted )
            {
                *pnPercentDone = 100;
                pContext->bHeaderCompleted = FALSE;
            }
            else if( pContext->bFooterCompleted )
            {
                *pnPercentDone = 100;
                pContext->bFooterCompleted = FALSE;
            }
            else
            {
                if( pContext->unTotalBytesProcessed >= pContext->unFileSize )
                {
                    *pnPercentDone = 100;
                }
                else if( pContext->unFileSize < FFF_MAX_ULONG )
                {
                    *pnPercentDone = (int)(pContext->unTotalBytesProcessed * (UINT)100 / pContext->unFileSize);
                }
                else
                {
                    dTotalBytesProcessed = pContext->unTotalBytesProcessed;
                    dFileSize = pContext->unFileSize;
                    *pnPercentDone = (int)(dTotalBytesProcessed * 100 / dFileSize);
                }
            }
        }while( bRetVal && !pContext->bBufOutFull && *pnPercentDone < 100 );

        *pnBufOut = (UINT)(pContext->pcBufOut - pcBuf);
        return (bRetVal ? KVERR_Success : KVERR_General);
    }

- pContext->bHeaderCompleted and pContext->bFooterCompleted are set to TRUE in fffProcessBuffer() when a header or footer is processed and the end of that portion of the document is reached. In either case, the percentage complete is set to 100 and the flag is reset to FALSE.
- The percentage calculation is also revised for foliosr. Folio files can be 50 MB or larger, so the intermediate product of the bytes processed and 100 can exceed the range of a 32-bit unsigned integer, which makes the calculated percentage inaccurate. If the file size is not less than FFF_MAX_ULONG, which is defined as (UINT)(0xFFFFFFFF / 0x64), doubles are used for that calculation.
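
The effect of the overflow can be seen with a small standalone sketch. The code below mirrors the branch structure of the calculation in Implementation 3 but is not the foliosr source: percent_done() and percent_done_naive() are hypothetical names, and UINT is assumed to be a 32-bit unsigned type.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t UINT;   /* assumption: UINT is 32 bits wide */

#define FFF_MAX_ULONG ((UINT)(0xFFFFFFFF / 0x64))   /* 42949672 */

/* Percentage calculation guarded against overflow of processed * 100. */
int percent_done(UINT processed, UINT fileSize)
{
    if (processed >= fileSize)
        return 100;   /* also avoids dividing by zero for an empty file */

    if (fileSize < FFF_MAX_ULONG)
        /* processed < fileSize < 0xFFFFFFFF / 100, so the product fits */
        return (int)(processed * (UINT)100 / fileSize);

    /* Large file: do the arithmetic in double precision instead */
    return (int)((double)processed * 100 / (double)fileSize);
}

/* Unguarded version, as used in Implementations 1 and 2.
 * The product wraps modulo 2^32 when processed exceeds 42949672. */
int percent_done_naive(UINT processed, UINT fileSize)
{
    return (int)(processed * (UINT)100 / fileSize);
}
```

For a 50 MB file (52,428,800 bytes) with 50,000,000 bytes processed, percent_done() returns 95, while percent_done_naive() returns 13 because the product wraps around.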