Output Records

This section describes the records that are produced when you ingest an image or document file.

Image Data

The image ingest engine writes image data to a track named Image_1.

The following sample XML shows a record produced when you ingest a single image, a multi-page image such as a TIFF file, or a presentation file (.PPT, .PPTX, .ODP).

<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>222</width>
      <height>140</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
      <format>PNG</format>
      <compressionQuality>100</compressionQuality>
    </image>
    <pagetext/>
  </Page>
</record>

The record contains the following information:

If you ingest a document such as a PDF file, the output might also include the text extracted from text elements:

<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>892</width>
      <height>1260</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
      <format>PNG</format>
      <compressionQuality>100</compressionQuality>
    </image>
    <pagetext>
      <element>
        <text>Some text</text>
        <region>
          <left>115</left>
          <top>503</top>
          <width>460</width>
          <height>41</height>
        </region>
        <angle>0</angle>
      </element>
      ...
    </pagetext>
  </Page>
</record>

The pagetext element contains information about associated text elements. If the ingested media is a PDF file, each record represents a page. If the ingested media is another type of document, each record represents an embedded image and the text that follows it, up to the next embedded image.

Each element element describes a text element and contains the following data:

Information about text elements is used by the OCR analysis engine, which automatically combines the text elements with the text extracted from images, to produce a complete transcript of the text that appears on the page.
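As an illustration of the record layout shown above, the following minimal Python sketch reads the text elements from a single Image_1 record. It assumes that the record is available as a standalone XML string shaped like the sample; the function name read_text_elements and the SAMPLE_RECORD constant are hypothetical and are not part of Media Server. How you obtain records from Media Server is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

# Sample record modelled on the PDF example in this section.
# The imagedata payload is elided, as in the original sample.
SAMPLE_RECORD = """
<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>892</width>
      <height>1260</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
      <format>PNG</format>
      <compressionQuality>100</compressionQuality>
    </image>
    <pagetext>
      <element>
        <text>Some text</text>
        <region>
          <left>115</left>
          <top>503</top>
          <width>460</width>
          <height>41</height>
        </region>
        <angle>0</angle>
      </element>
    </pagetext>
  </Page>
</record>
"""


def read_text_elements(record_xml):
    """Return (page_number, image_size, elements) for one Image_1 record.

    Hypothetical helper: it relies only on the element names that
    appear in the sample records in this section.
    """
    record = ET.fromstring(record_xml.strip())
    page_number = int(record.findtext("pageNumber"))

    image = record.find("./Page/image")
    image_size = (int(image.findtext("width")), int(image.findtext("height")))

    elements = []
    # An empty <pagetext/> element simply yields no text elements.
    for element in record.findall("./Page/pagetext/element"):
        region = element.find("region")
        elements.append({
            "text": element.findtext("text"),
            "left": int(region.findtext("left")),
            "top": int(region.findtext("top")),
            "width": int(region.findtext("width")),
            "height": int(region.findtext("height")),
            "angle": float(element.findtext("angle")),
        })
    return page_number, image_size, elements


if __name__ == "__main__":
    page, size, text_elements = read_text_elements(SAMPLE_RECORD)
    print(page, size, text_elements)
```

A record produced from a plain image, where pagetext is empty, returns an empty list of text elements.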

Source Information

The image ingest engine produces a proxy track, named taskName.Proxy, where taskName is the name of your ingest task. The proxy track contains information about the ingested source. The engine produces one record in this track for each page in the ingested image or document.

The following XML shows a sample record:

<record>
  <pageNumber>1</pageNumber>
  <trackname>ImageIngestTask.Proxy</trackname>
  <proxy path="./image.jpeg" url="./image.jpeg" mimeType="image/jpeg"
   estimatedDuration="0" pages="1">
    <streams>
      <videoStream id="0" width="2592" height="1936" sar="1:1" codec=""/>
    </streams>
    <metadata>
      <tag name="Author">A Name</tag>
      <tag name="Creation Date">2014-04-09T09:15:19Z</tag>
      <tag name="Flash">Flash did not fire</tag>
      <tag name="GPS Latitude">52° 13' 10.69"</tag>
      <tag name="GPS Longitude">0° 8' 49.23"</tag>
      ...
    </metadata>
  </proxy>
</record>

The metadata element contains any metadata that Media Server was able to extract from the source. The information present in this element varies based on the format of the source file and the information present in the source.
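Because the set of tags varies between sources, code that consumes a proxy record should not assume that any particular tag is present. The following Python sketch shows one way to collect the proxy attributes and metadata tags into dictionaries. It assumes that the record is available as a standalone XML string shaped like the sample above; the function name read_proxy_metadata and the SAMPLE_PROXY_RECORD constant are hypothetical and are not part of Media Server.

```python
import xml.etree.ElementTree as ET

# Shortened sample modelled on the proxy record in this section.
SAMPLE_PROXY_RECORD = """
<record>
  <pageNumber>1</pageNumber>
  <trackname>ImageIngestTask.Proxy</trackname>
  <proxy path="./image.jpeg" url="./image.jpeg" mimeType="image/jpeg"
   estimatedDuration="0" pages="1">
    <streams>
      <videoStream id="0" width="2592" height="1936" sar="1:1" codec=""/>
    </streams>
    <metadata>
      <tag name="Author">A Name</tag>
      <tag name="Creation Date">2014-04-09T09:15:19Z</tag>
    </metadata>
  </proxy>
</record>
"""


def read_proxy_metadata(record_xml):
    """Return (proxy_attributes, metadata_tags) for one proxy record.

    Hypothetical helper: it relies only on the element and attribute
    names that appear in the sample record in this section.
    """
    record = ET.fromstring(record_xml.strip())
    proxy = record.find("proxy")

    # Attributes such as path, url, mimeType, and pages.
    attributes = dict(proxy.attrib)

    # Map each tag name to its text; a tag with no text becomes "".
    tags = {
        tag.get("name"): (tag.text or "")
        for tag in proxy.findall("./metadata/tag")
    }
    return attributes, tags


if __name__ == "__main__":
    attributes, tags = read_proxy_metadata(SAMPLE_PROXY_RECORD)
    print(attributes["mimeType"], attributes["pages"])
    # Use .get() because a given tag might not exist for every source.
    print(tags.get("Author"), tags.get("Flash"))
```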

