Media Server can run Optical Character Recognition (OCR) on images such as scanned documents and photographs of documents. You can also run OCR on video to extract subtitles and the scrolling text that sometimes appears during television news broadcasts.

Media Server OCR:


Media Server OCR recognizes machine-printed text. Handwritten text is not supported.

OCR Document File Formats

When you ingest a PDF or office document file, Media Server extracts both the embedded images and the text elements. The OCR engine runs OCR on the extracted images and, by default, merges the text from the text elements into the results. The OCR results therefore contain both the text recognized in images and the text that was already present as text elements.
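
The default merge behavior described above can be pictured with a small conceptual sketch. This is not Media Server code or its API: the `TextRegion` type, the `merge_results` function, the `include_text_elements` flag, and the sample data are all hypothetical names invented for illustration, and ordering the merged results by page and position is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRegion:
    """Hypothetical record for one piece of text found in a document."""
    page: int
    top: int     # distance from the top of the page, in pixels
    left: int    # distance from the left of the page, in pixels
    text: str
    source: str  # "ocr" (recognized in an image) or "text_element"


def merge_results(ocr_regions, text_elements, include_text_elements=True):
    """Combine OCR output with embedded text elements.

    Results are ordered by page, then by position on the page.
    The ordering is an assumption made for this illustration.
    """
    merged = list(ocr_regions)
    if include_text_elements:
        merged.extend(text_elements)
    return sorted(merged, key=lambda r: (r.page, r.top, r.left))


# Hypothetical sample data: one image caption found by OCR,
# plus two text elements that were already present in the document.
ocr_regions = [TextRegion(1, 400, 50, "Figure 1: Quarterly sales", "ocr")]
text_elements = [
    TextRegion(1, 100, 50, "Annual Report", "text_element"),
    TextRegion(2, 100, 50, "Summary", "text_element"),
]

for region in merge_results(ocr_regions, text_elements):
    print(region.page, region.source, region.text)
```

With `include_text_elements=False`, the sketch returns only the text recognized in images, which illustrates what the results would look like without the merge.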