Building a Document Processing Pipeline with AWS Services

Learn how to use Amazon’s ML services for document processing by combining multiple AWS services to automate the document processing cycle.

9 Tasks

beginner

1hr

Certificate of Completion

Desktop Only

No Setup Required

Amazon Web Services

Learning Objectives

Familiarity with Amazon S3 and the ability to store and retrieve data using it

The ability to use the IAM service to provide permissions to other services using IAM roles

Hands-on experience in creating a Lambda function to execute a piece of code

The ability to create a sender identity for SES and send emails using it

Hands-on experience in automating data analysis using Amazon S3, Amazon Textract, and Amazon Comprehend

Technologies

Lambda

CloudWatch

IAM

Textract

Comprehend

Labs Rules Apply

Stay within resource usage requirements.

Do not engage in cryptocurrency mining.

Do not engage in or encourage activity that is illegal.

Skills Covered

Using AWS Cloud Services

Natural Language Processing

Data Pipeline Engineering

Cloud Lab Overview

Traditionally, documents were analyzed and mined for insights by hand, a time-consuming process with a high probability of errors. Using AI, we can automate this process, making it faster and less error-prone. To help us do that, Amazon provides AI services such as Textract and Comprehend. Textract extracts data from images and documents and returns it as text. This text can then be fed to Comprehend, a natural language processing (NLP) service that analyzes it and returns the insights we need.
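The handoff between the two services can be sketched in a few lines of Python. Textract returns its results as a list of blocks, and the blocks of type `LINE` carry the recognized text one line at a time. The helper name `textract_lines_to_text` and the sample response below are illustrative assumptions, not part of the lab; the sample is hand-written to mimic the shape of a Textract text detection response.

```python
def textract_lines_to_text(response):
    """Join the LINE blocks of a Textract response into one string."""
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


# Hand-written sample shaped like a Textract text detection response.
# WORD blocks repeat the content of LINE blocks, so only LINE blocks are kept.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 1042"},
        {"BlockType": "WORD", "Text": "Invoice"},
        {"BlockType": "WORD", "Text": "1042"},
        {"BlockType": "LINE", "Text": "Total due: 250 USD"},
    ]
}

text = textract_lines_to_text(sample_response)
print(text)
```

The resulting string is what would be passed to Comprehend as its `Text` input.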

In this Cloud Lab, you’ll learn to automate document processing using multiple Amazon services.

To do that, you’ll first create an S3 bucket to store the input and output data. After that, you’ll create an IAM role that grants the necessary permissions to the other AWS services. You’ll then create a Lambda function whose code sends the documents stored in the bucket to Textract, which converts them to text. Comprehend then processes this text, and its output is stored in the output folder of the same bucket. Finally, you’ll integrate an email service into the pipeline using Amazon SES.
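As a rough illustration of how these pieces could fit together, here is a minimal sketch of such a Lambda function in Python with boto3. It is not the lab's actual code. It assumes the function is triggered by an S3 upload event, that results go under an `output/` prefix, that the text is in English, and that the sender and recipient addresses come from hypothetical `SENDER_EMAIL` and `RECIPIENT_EMAIL` environment variables. The helper name `process_document` is also made up for this example.

```python
import json
import os
import urllib.parse

OUTPUT_PREFIX = "output/"  # assumed name of the output folder


def process_document(bucket, key, s3, textract, comprehend,
                     ses=None, sender=None, recipient=None):
    """Extract text from one S3 object, analyze it, and store the result."""
    # Textract reads the document directly from S3 (synchronous API).
    extracted = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    text = "\n".join(
        block["Text"]
        for block in extracted.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    )

    result = {"source": key, "sentiment": None, "entities": []}
    if text:  # Comprehend rejects empty input, so skip blank documents.
        sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        entities = comprehend.detect_entities(Text=text, LanguageCode="en")
        result["sentiment"] = sentiment["Sentiment"]
        result["entities"] = [e["Text"] for e in entities["Entities"]]

    # Store the analysis as JSON under the output folder of the same bucket.
    output_key = OUTPUT_PREFIX + key.rsplit("/", 1)[-1] + ".json"
    s3.put_object(
        Bucket=bucket,
        Key=output_key,
        Body=json.dumps(result, indent=2).encode("utf-8"),
        ContentType="application/json",
    )

    # Optionally email a summary through SES (needs a verified sender identity).
    if ses and sender and recipient:
        ses.send_email(
            Source=sender,
            Destination={"ToAddresses": [recipient]},
            Message={
                "Subject": {"Data": f"Processed {key}"},
                "Body": {"Text": {"Data": json.dumps(result, indent=2)}},
            },
        )
    return output_key, result


def lambda_handler(event, context):
    import boto3  # imported here so process_document can be tested without AWS

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])
    # Ignore our own results so writing them does not retrigger the function.
    if key.startswith(OUTPUT_PREFIX):
        return {"skipped": key}

    output_key, _ = process_document(
        bucket, key,
        boto3.client("s3"), boto3.client("textract"), boto3.client("comprehend"),
        ses=boto3.client("ses"),
        sender=os.environ.get("SENDER_EMAIL"),
        recipient=os.environ.get("RECIPIENT_EMAIL"),
    )
    return {"output": output_key}
```

Passing the clients into `process_document` keeps the pipeline logic separate from AWS setup, so it can be exercised locally with stand-in objects. Two caveats apply to this sketch: the synchronous Textract call suits images and single-page documents, and Comprehend's synchronous calls limit the input size, so long documents would need to be split before analysis.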

After completing this Cloud Lab, you’ll have a working pipeline that extracts and processes text from documents using AWS services, along with practical knowledge of how to use these services to automate document processing tasks.

Cloud Lab Tasks

1. Introduction

Getting Started

2. Create the Required Resources

Create an S3 Bucket

Create an Execution Role

Create a Lambda Function

Configure the Lambda Function

3. Text Extraction and Analysis

Test the Document Processing Pipeline

Integrate Amazon Simple Email Service (SES)

4. Conclusion

Clean Up

Wrap Up

Before you start...

Try these optional labs before starting this lab.

Cloud Lab

Analyzing Text and NLP with Textract and Comprehend

beginner

45m

Cloud Lab

Getting to Know AWS Lambda

beginner

2hr
