
Amazon Textract

Explore Amazon Textract to understand how it uses machine learning to extract text, forms, and table data from scanned documents and images. Learn its key features, such as custom queries and the AnalyzeID API, and how it automates processes such as expense tracking and inventory report management.

Optical Character Recognition (OCR) is a process that converts images of typed text into machine-encoded text. Companies use OCR technologies to digitize text and data from documents such as PDFs, scanned images, and physical records. However, traditional OCR technologies had limitations: they were unable to extract text from some layouts, such as forms and tables. As a result, they did not meet companies' need to accurately identify and extract data from any file type.

Recognizing the shortcomings of traditional OCR technologies, Amazon introduced a machine learning service, Amazon Textract. It allows accurate text extraction from documents and layouts of any type. It can also detect typed and handwritten text in records and reports and can be integrated into applications through the Textract ...
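As a rough illustration of what integrating Textract into an application might look like, the sketch below assumes the `DetectDocumentText` operation called through the boto3 SDK for Python. The text above does not specify an operation or SDK, so that choice is an assumption. The helper names `extract_lines` and `detect_lines`, the `min_confidence` parameter, the default region, and the sample response are all hypothetical and hand-written for illustration. The sample only mimics the shape of a Textract response (a `Blocks` list whose entries carry `BlockType`, `Text`, and `Confidence`). It is not real service output.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any], min_confidence: float = 0.0) -> List[str]:
    """Return the text of each LINE block in a DetectDocumentText-style response.

    Blocks whose confidence score is below min_confidence are skipped.
    """
    lines = []
    for block in response.get("Blocks", []):
        # Textract also returns PAGE and WORD blocks; keep whole lines only.
        if block.get("BlockType") != "LINE":
            continue
        if block.get("Confidence", 0.0) >= min_confidence:
            lines.append(block.get("Text", ""))
    return lines


def detect_lines(image_bytes: bytes, region_name: str = "us-east-1") -> List[str]:
    """Send image bytes to Textract and return the detected lines.

    Requires boto3 and valid AWS credentials; the region is an assumption.
    """
    # Imported lazily so extract_lines can be used without boto3 installed.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response)


# Hand-written sample shaped like a Textract response (illustrative only).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 1024", "Confidence": 99.1},
        {"BlockType": "WORD", "Text": "Invoice", "Confidence": 99.5},
        {"BlockType": "WORD", "Text": "1024", "Confidence": 98.7},
        {"BlockType": "LINE", "Text": "Total: 42.00", "Confidence": 87.3},
    ]
}

print(extract_lines(sample_response))
print(extract_lines(sample_response, min_confidence=90.0))
```

Keeping the parsing step separate from the network call means the response handling can be exercised against a saved or hand-built response, without AWS credentials. Only `detect_lines` needs a live account.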