Building ETL Pipelines on AWS


In this Cloud Lab, you’ll learn how to create an ETL data pipeline with AWS Glue.

8 Tasks

intermediate

3hr

Certificate of Completion

Desktop Only
No Setup Required
Amazon Web Services

Learning Objectives

A thorough understanding of AWS Glue ETL
The ability to set up a visual ETL pipeline
Hands-on experience performing ETL operations on a dataset

Technologies
DynamoDB
S3
Glue
Cloud Lab Overview

AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. It provides ETL (extract, transform, load) capabilities. ETL is a data engineering process that extracts data from various sources, transforms it into a desired format, and loads it into a target data store for analysis, reporting, and business intelligence. AWS Glue simplifies the ETL process, making it easier for businesses to prepare and transform their data for analytics.
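To make the three ETL stages concrete, here is a minimal sketch in plain Python that runs entirely in memory and does not call AWS. The sample records, field names, and function names are all hypothetical and chosen only for illustration. A real Glue job would read from a source such as a Data Catalog table and write to a target such as S3.

```python
import csv
import io

# Hypothetical raw records standing in for rows read from a source data store.
SOURCE_ROWS = [
    {"order_id": "1", "customer": " alice ", "amount": "19.99"},
    {"order_id": "2", "customer": "Bob", "amount": "5.00"},
    {"order_id": "3", "customer": "", "amount": "12.50"},
]


def extract(rows):
    """Extract: read raw records from the source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transform: drop rows with no customer, tidy names, and cast types."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip()
        if not name:
            continue  # skip records that are missing a customer name
        cleaned.append(
            {
                "order_id": int(row["order_id"]),
                "customer": name.title(),
                "amount": float(row["amount"]),
            }
        )
    return cleaned


def load(rows):
    """Load: write records to the target (CSV text standing in for a file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["order_id", "customer", "amount"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_pipeline(rows):
    """Chain the three stages in order: extract, then transform, then load."""
    return load(transform(extract(rows)))


if __name__ == "__main__":
    print(run_pipeline(SOURCE_ROWS))
```

Keeping each stage in its own function mirrors how an ETL job is usually described: the source, the transformations, and the target can each be changed without touching the other two.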

In this Cloud Lab, you’ll create a DynamoDB table to serve as the source data. You’ll then set up a database in AWS Glue and use an AWS Glue crawler to fetch metadata from the DynamoDB table and store it in Data Catalog tables in the Glue database. Finally, you’ll set up an ETL pipeline in AWS Glue that extracts data through the Glue database, performs transformations on the data, and loads the resulting data into an S3 bucket.
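The crawler step described above is typically done through the AWS console in a lab like this, but a rough programmatic equivalent can be sketched with boto3. This is a sketch under stated assumptions, not the lab's own method: the crawler name, role ARN, database name, and table name below are hypothetical placeholders. The helper only builds the request dictionary, and the function that actually contacts AWS is defined but not called, since it needs boto3 and valid credentials.

```python
def build_crawler_request(crawler_name, role_arn, database_name, dynamodb_table):
    """Build keyword arguments for the Glue `create_crawler` API call.

    The crawler scans the named DynamoDB table and writes the discovered
    schema as Data Catalog tables in the given Glue database.
    """
    return {
        "Name": crawler_name,
        "Role": role_arn,
        "DatabaseName": database_name,
        "Targets": {"DynamoDBTargets": [{"Path": dynamodb_table}]},
    }


def create_and_start_crawler(request):
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders, not names used by the lab.
request = build_crawler_request(
    crawler_name="example-dynamodb-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database_name="example_glue_database",
    dynamodb_table="ExampleOrders",
)
```

Calling `create_and_start_crawler(request)` from an environment with appropriate IAM permissions would create the crawler and start a run. The IAM role must allow Glue to read the DynamoDB table.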

Once you complete this Cloud Lab, the provisioned infrastructure will be similar to the one shown below:

Architecture diagram of ETL pipelines utilizing AWS Glue and S3 for data transformation and storage

Cloud Lab Tasks
1. Introduction
Getting Started
2. Set Up the Data Stores
Create a DynamoDB Table
Configure and Run a Glue Crawler
Create an S3 Bucket
3. Build ETL Pipeline
Create a Visual ETL Pipeline with AWS Glue
Configure and Run the ETL Job
4. Conclusion
Clean Up
Wrap Up
Labs Rules Apply
Stay within resource usage requirements.
Do not engage in cryptocurrency mining.
Do not engage in or encourage activity that is illegal.
