Optical Character Recognition

Optical Character Recognition (OCR) recognizes text in media. This includes text that appears in images and video, and text embedded in PDF files and Office document formats.

| Configuration Parameter | Description |
| --- | --- |
| Blacklist | Characters to exclude from the character set used for recognition. |
| CharacterTypes | The types of characters to include in the character set used for recognition. |
| ContextCheck | Specifies whether to use context checking to improve OCR results. |
| ErrorCharacter | The character to use to represent an unreadable character in the ingested media. |
| FontType | The basic character type of the text that you want to recognize. |
| HollowText | Specifies whether to look for outlined text. |
| ImageBinarizeMethod | (Deprecated) The method to use to binarize color images and documents. |
| Input | The image track to process. |
| KeepOnly | Keep only particular types of words and discard all others. |
| Languages | The languages to use, which affects the character set and dictionaries used. |
| NumParallel | The maximum number of video frames to analyze simultaneously. |
| OcrMode | The OCR mode to use when you ingest images or documents. |
| Orientation | The orientation of text in the ingested media. |
| ProcessTextElements | Specifies whether to merge the content of text elements into the OCR results. |
| Region | A region of the image or video frame to restrict processing to. |
| RegionUnit | The units that the Region parameter uses to specify the size and position of a region. |
| RestrictToInputRegion | Specifies whether to analyze a region of the input image or video frame that is specified in the input record, instead of the entire image. |
| SampleInterval | The interval at which frames are selected to be analyzed. |
| Spacing | Specifies whether to allow multiple spaces between words in the output from OCR. |
| Type | The analysis engine to use. Set this parameter to OCR. |
| UserDictionary | A comma-separated list of dictionaries to use in addition to the standard dictionaries. |
| Whitelist | Extra characters to add to the character set. |
| WordRejectThreshold | The minimum confidence level required to include a word in the output. |
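
As an illustration of how these parameters might be combined, the following Python sketch generates one task section of a configuration file. It assumes an INI-style layout in which each task has its own section; the section name `OCRTask`, the helper `build_ocr_section`, and the example values (`en`, `50`) are hypothetical and are not taken from this reference. Only the parameter names and the `Type=OCR` setting come from the table above.

```python
import configparser
import io

# Parameter names transcribed from the table above. Type is handled
# separately because it must always be set to OCR.
KNOWN_PARAMETERS = {
    "Blacklist", "CharacterTypes", "ContextCheck", "ErrorCharacter",
    "FontType", "HollowText", "ImageBinarizeMethod", "Input", "KeepOnly",
    "Languages", "NumParallel", "OcrMode", "Orientation",
    "ProcessTextElements", "Region", "RegionUnit", "RestrictToInputRegion",
    "SampleInterval", "Spacing", "UserDictionary", "Whitelist",
    "WordRejectThreshold",
}


def build_ocr_section(section_name, **params):
    """Return the text of one OCR task section (hypothetical helper)."""
    unknown = set(params) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown OCR parameters: {sorted(unknown)}")

    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of parameter names
    parser[section_name] = {
        "Type": "OCR",
        **{name: str(value) for name, value in params.items()},
    }

    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Example values are illustrative assumptions, not documented defaults.
section_text = build_ocr_section("OCRTask", Languages="en", WordRejectThreshold=50)
print(section_text)
```

Checking names against the documented list catches misspelled parameters before the configuration is deployed; the accepted values for each parameter still need to be confirmed against its own reference page.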

Output Tracks

| Output track | Type | Description | Output (1) |
| --- | --- | --- | --- |
| Data | OCRResult | Contains a record for every frame in which text is detected. | No |
| DataWithSource | OCRResultAndImage | Contains the same information as the Data track, but each record also includes the source frame. | No |
| Event | OCREvent | The engine creates a record in the Event track when text appears or disappears in the video, or when the text changes. | Yes |
| Result | OCRResult | Contains a single record for each example of text (the same text might appear in many consecutive frames). This track only contains the best result from processing an example of text. | Yes |
| ResultWithSource | OCRResultAndImage | Contains the same information as the Result track, but each record also includes the best source frame. | No |
| WordResult | OCRResult | Contains a record for each word read from the source. The records in this track provide the text, region, and confidence score for individual words, rather than lines. This track is available only when you ingest images or documents. It is not available if the source is a video file or stream. | No |

(1) This column indicates whether the information contained in the track is included by default in the output created by an output task (when you don't set the Input parameter for the output task).
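
The track table can also be expressed as a small lookup, which is convenient when deciding whether an output task needs an explicit Input setting. The Python sketch below only transcribes the table; the names `OCR_TRACKS`, `default_output_tracks`, and `tracks_of_type` are hypothetical helpers, not part of any product API.

```python
# Track name -> (record type, included in output by default),
# transcribed from the output track table above.
OCR_TRACKS = {
    "Data": ("OCRResult", False),
    "DataWithSource": ("OCRResultAndImage", False),
    "Event": ("OCREvent", True),
    "Result": ("OCRResult", True),
    "ResultWithSource": ("OCRResultAndImage", False),
    "WordResult": ("OCRResult", False),
}


def default_output_tracks(tracks=OCR_TRACKS):
    """Tracks an output task includes when its Input parameter is not set."""
    return [name for name, (_, default) in tracks.items() if default]


def tracks_of_type(record_type, tracks=OCR_TRACKS):
    """Tracks whose records use the given record type."""
    return [name for name, (kind, _) in tracks.items() if kind == record_type]


print(default_output_tracks())               # ['Event', 'Result']
print(tracks_of_type("OCRResultAndImage"))   # ['DataWithSource', 'ResultWithSource']
```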

OCRResult

| Field name | Type | Description |
| --- | --- | --- |
| id | UUIDData | A universally unique identifier to identify the text. |
| text | TextData | The result of running OCR on the text. |
| region | RectangleData | The location of the text in the frame. |
| confidence | Integer | The confidence score from OCR, or 100 for text extracted from text elements. |
| angle | Integer | The orientation of the text in degrees (rotated clockwise 0, 90, 180, or 270 degrees from upright). |
| source | String | Specifies the origin of the text: static text from an image or video (image), text from video of a news ticker, with text scrolling from right to left (scroller, left), or a text element in a document (text). |
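
When consuming these records in client code, it can help to model them as a typed structure. The following Python sketch mirrors the OCRResult fields listed above. The `Rectangle` layout (left, top, width, height) and the helper `keep_confident` are assumptions made for illustration; this reference does not specify how RectangleData is serialized.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Rectangle:
    # Assumed layout; the actual RectangleData format is not described here.
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class OCRResult:
    id: UUID
    text: str
    region: Rectangle
    confidence: int
    angle: int
    source: str

    def __post_init__(self):
        # The field description lists these four orientations.
        if self.angle not in (0, 90, 180, 270):
            raise ValueError(f"Unexpected angle: {self.angle}")


def keep_confident(records, minimum):
    """Keep records whose confidence is at least `minimum` (hypothetical helper)."""
    return [record for record in records if record.confidence >= minimum]


records = [
    OCRResult(uuid4(), "INVOICE", Rectangle(10, 10, 200, 40), 92, 0, "image"),
    OCRResult(uuid4(), "lnv0ice", Rectangle(10, 60, 200, 40), 31, 0, "image"),
    # Text taken from a document text element is reported with confidence 100.
    OCRResult(uuid4(), "Total due", Rectangle(10, 110, 200, 40), 100, 0, "text"),
]
kept = keep_confident(records, 50)
print([record.text for record in kept])  # ['INVOICE', 'Total due']
```

Filtering on the client side is a complement to WordRejectThreshold, which applies the cut-off before results reach the output.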

OCRResultAndImage

| Field name | Type | Description |
| --- | --- | --- |
| id | UUIDData | A universally unique identifier to identify the text. |
| text | TextData | The result of running OCR on the text. |
| region | RectangleData | The location of the text in the frame. |
| confidence | Integer | The confidence score from OCR, or 100 for text extracted from text elements. |
| angle | Integer | The orientation of the text in degrees (rotated clockwise 0, 90, 180, or 270 degrees from upright). |
| source | String | Specifies the origin of the text: static text from an image or video (image), text from video of a news ticker, with text scrolling from right to left (scroller, left), or a text element in a document (text). |
| image | ImageData | The source frame. |

OCREvent

| Field name | Type | Description |
| --- | --- | --- |
| id | UUIDData | A universally unique identifier to identify the text. |
| event | TrackingEventData | The type of event (begin/end/update), and the elapsed time since the text appeared. |
| text | TextData | The result of running OCR on the text. |
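
To make the begin/update/end semantics concrete, the Python sketch below derives events from a sequence of per-frame readings. It is a simplified illustration and not the engine's actual algorithm: it follows a single piece of text, assumes each frame is given as a `(timestamp_seconds, text_or_None)` pair, and the function name `derive_events` is hypothetical.

```python
def derive_events(frames):
    """Turn per-frame readings into (event_type, text, elapsed_seconds) tuples.

    `frames` is an iterable of (timestamp_seconds, text) pairs, where text is
    None for frames in which no text is visible.
    """
    events = []
    current = None   # text currently on screen, if any
    started = None   # timestamp at which the current text first appeared

    for timestamp, text in frames:
        if text == current:
            continue  # nothing changed, so no event is recorded
        if current is None:
            # Text appears.
            started = timestamp
            events.append(("begin", text, 0.0))
        elif text is None:
            # Text disappears; report the last text that was shown.
            events.append(("end", current, timestamp - started))
            started = None
        else:
            # Text is still present but its content changed.
            events.append(("update", text, timestamp - started))
        current = text

    return events


frames = [
    (0.0, None),
    (1.0, "BREAKING"),
    (2.0, "BREAKING"),
    (3.0, "BREAKING NEWS"),
    (4.0, None),
]
events = derive_events(frames)
for event in events:
    print(event)
```

Five frames produce three events here, which shows why the Event track is much smaller than the Data track for the same video.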
