OCR Results

This section describes the format of the results produced by an OCR analysis task.

Results by Line

The following XML shows records from the Result track of an OCR task. The analysis engine produces one record for each line of text in the analyzed image or video frame.

If you process a document and have not set ProcessTextElements=FALSE, some records in the Result track might represent text that was extracted from text elements in the document rather than recognized by OCR.

<record>
  ...
  <trackname>ocr.Result</trackname>
  <OCRResult>
    <id>14565401-b521-4135-94c8-b30f02264f38</id>
    <text>rover discovers life on Mars</text>
    <region>
      <left>240</left>
      <top>31</top>
      <width>194</width>
      <height>12</height>
    </region>
    <confidence>89</confidence>
    <angle>0</angle>
    <source>image</source>
  </OCRResult>
</record>
<record>
  ...
  <trackname>ocr.Result</trackname>
  <OCRResult>
    <id>59dad245-c268-4506-ac42-5752dd123576</id>
    <text>discovery confirmed yesterday and announced to world press</text>
    <region>
      <left>120</left>
      <top>62</top>
      <width>434</width>
      <height>15</height>
    </region>
    <confidence>88</confidence>
    <angle>0</angle>
    <source>image</source>
  </OCRResult>
</record>
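
Records like these can be read with any XML parser. The following is a minimal sketch in Python, not part of the product: the `<records>` wrapper element and the function name `parse_ocr_records` are assumptions made for illustration, and the parts of each record that the sample elides with `...` are left out. It also assumes that the region values are integers and that confidence and angle are numeric, as in the sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: the sample records are fragments, so a <records>
# root element is added here only to make the text well-formed XML.
SAMPLE = """<records>
<record>
  <trackname>ocr.Result</trackname>
  <OCRResult>
    <id>14565401-b521-4135-94c8-b30f02264f38</id>
    <text>rover discovers life on Mars</text>
    <region>
      <left>240</left>
      <top>31</top>
      <width>194</width>
      <height>12</height>
    </region>
    <confidence>89</confidence>
    <angle>0</angle>
    <source>image</source>
  </OCRResult>
</record>
</records>"""


def parse_ocr_records(xml_text, track="ocr.Result"):
    """Return one dict per record that belongs to the named track."""
    results = []
    for record in ET.fromstring(xml_text).iter("record"):
        if record.findtext("trackname") != track:
            continue
        ocr = record.find("OCRResult")
        if ocr is None:
            continue
        region = ocr.find("region")
        results.append({
            "id": ocr.findtext("id"),
            "text": ocr.findtext("text"),
            # Assumption: region values are integers, as in the sample.
            "region": {key: int(region.findtext(key))
                       for key in ("left", "top", "width", "height")},
            # Assumption: confidence and angle are numeric.
            "confidence": float(ocr.findtext("confidence")),
            "angle": float(ocr.findtext("angle")),
            "source": ocr.findtext("source"),
        })
    return results


lines = parse_ocr_records(SAMPLE)
print(lines[0]["text"], lines[0]["region"])
```

Filtering on `trackname` keeps the function usable when records from several tracks arrive in the same output.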

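Each record contains an OCRResult element with the following child elements: id, text, region (with left, top, width, and height), confidence, angle, and source.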

Results by Word

An OCR analysis task that analyzes an image or document (but not video) also produces a WordResult output track. The OCR analysis engine adds one record to this track for each word. The following XML shows an example record.

NOTE:

Text that is extracted from a text element in a document is not output to the WordResult track.

<record>
  ...
  <trackname>ocr.WordResult</trackname>
  <OCRResult>
    <id>cdbca09b-c289-40af-b6e6-02427fafad91</id>
    <text>rover</text>
    <region>
      <left>240</left>
      <top>31</top>
      <width>194</width>
      <height>12</height>
    </region>
    <confidence>89</confidence>
    <angle>0</angle>
    <source>image</source>
  </OCRResult>
</record>
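
One possible use of the word-level track is to flag individual words with a low confidence score for manual review. The sketch below is illustrative only: the `<records>` wrapper, the function name `low_confidence_words`, and the threshold values are assumptions, and the elided part of the record (`...`) is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical <records> wrapper so that the fragment parses as XML.
WORD_SAMPLE = """<records>
<record>
  <trackname>ocr.WordResult</trackname>
  <OCRResult>
    <id>cdbca09b-c289-40af-b6e6-02427fafad91</id>
    <text>rover</text>
    <region>
      <left>240</left>
      <top>31</top>
      <width>194</width>
      <height>12</height>
    </region>
    <confidence>89</confidence>
    <angle>0</angle>
    <source>image</source>
  </OCRResult>
</record>
</records>"""


def low_confidence_words(xml_text, threshold):
    """Return (text, confidence) pairs for words scoring below threshold."""
    flagged = []
    for record in ET.fromstring(xml_text).iter("record"):
        if record.findtext("trackname") != "ocr.WordResult":
            continue
        ocr = record.find("OCRResult")
        if ocr is None:
            continue
        confidence = float(ocr.findtext("confidence"))
        if confidence < threshold:
            flagged.append((ocr.findtext("text"), confidence))
    return flagged


# The threshold is an arbitrary, caller-chosen value.
print(low_confidence_words(WORD_SAMPLE, threshold=90))  # [('rover', 89.0)]
```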
