Building a Real-Time Data Pipeline with Kinesis and S3 Tables


In this Cloud Lab, you will build a real-time data pipeline using Kinesis Data Streams, Kinesis Data Firehose, and S3 Tables, then use Athena to query the ingested data, which is stored in Apache Iceberg format.

7 Tasks

intermediate

1hr 30m

Certificate of Completion

Desktop Only
No Setup Required
Amazon Web Services

Learning Objectives

An understanding of Amazon Kinesis Data Streams and their role in real-time data ingestion
Working knowledge of configuring Kinesis Data Firehose to deliver data into S3 Tables
The ability to query structured streaming datasets stored in Apache Iceberg tables using Amazon Athena
Practical knowledge of integrating AWS services to build a real-time analytics pipeline

Technologies
S3
Athena
Kinesis
Cloud Lab Overview

Real-time data processing has become essential for monitoring, fraud detection, IoT telemetry, and personalized recommendations. Amazon Kinesis provides a fully managed, scalable service for ingesting streaming data, while Amazon S3 table buckets store data as Apache Iceberg tables, enabling efficient querying and analytics on structured datasets.

In this Cloud Lab, you’ll set up a Kinesis Data Stream as the entry point for real-time events. You will then configure a Kinesis Data Firehose delivery stream to capture, process, and store this data in an S3 table bucket. You’ll simulate real-time events by sending them into the pipeline with a Python-based data generator script. Finally, you’ll use Amazon Athena to query and analyze the data stored in Iceberg tables.
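To make the data generator step more concrete, here is a minimal sketch of what such a script might look like. It is not the lab's actual script: the stream name, the event fields (`device_id`, `temperature`, and so on), and the helper names are all hypothetical. The only AWS call used is the Kinesis `put_record` operation from boto3, and the client is passed in so the logic can be exercised without AWS credentials.

```python
import json
import random
import time
import uuid
from datetime import datetime, timezone

# Hypothetical stream name; replace with the stream you create in the lab.
STREAM_NAME = "example-events-stream"


def generate_event(rng=random):
    """Build one synthetic telemetry event (field names are illustrative)."""
    return {
        "event_id": str(uuid.uuid4()),
        "device_id": f"device-{rng.randint(1, 5)}",
        "temperature": round(rng.uniform(15.0, 35.0), 2),
        "event_time": datetime.now(timezone.utc).isoformat(),
    }


def send_events(client, stream_name, count, delay_seconds=0.0):
    """Send `count` events to a Kinesis data stream and return how many were sent.

    `client` is expected to expose put_record(StreamName=..., Data=...,
    PartitionKey=...), as a boto3 Kinesis client does.
    """
    sent = 0
    for _ in range(count):
        event = generate_event()
        client.put_record(
            StreamName=stream_name,
            Data=json.dumps(event).encode("utf-8"),
            # Events from the same device land on the same shard.
            PartitionKey=event["device_id"],
        )
        sent += 1
        if delay_seconds:
            time.sleep(delay_seconds)
    return sent


def main():
    """Entry point for a real run; requires boto3 and AWS credentials."""
    import boto3

    kinesis = boto3.client("kinesis")
    total = send_events(kinesis, STREAM_NAME, count=10, delay_seconds=0.5)
    print(f"Sent {total} events to {STREAM_NAME}")
```

Calling `main()` sends ten events half a second apart. Using the device ID as the partition key is one reasonable choice, since it keeps each device's events in order within a shard; a random key would spread load more evenly instead.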

After completing this Cloud Lab, you’ll have the skills to design and implement a serverless streaming data pipeline on AWS, build real-time ingestion workflows, store data in queryable formats, and use Athena for analytics. These skills are valuable for careers in data engineering, analytics, and cloud-based big data systems.
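As a rough illustration of the Athena analytics step, the sketch below submits a query through boto3 and polls until it finishes. The table name, column names, catalog, database, and results location are all assumptions made for this example and will differ in the lab; only the `start_query_execution` and `get_query_execution` operations are real Athena API calls.

```python
import time


def build_summary_query(table_name):
    """Return a SQL query counting events per device (columns are hypothetical)."""
    return (
        "SELECT device_id, COUNT(*) AS event_count, "
        "AVG(temperature) AS avg_temperature "
        f'FROM "{table_name}" '
        "GROUP BY device_id ORDER BY event_count DESC"
    )


def run_query(client, query, catalog, database, output_location, poll_seconds=1.0):
    """Start an Athena query and wait until it reaches a terminal state.

    `client` is expected to behave like a boto3 Athena client.
    Returns a (query_execution_id, final_state) tuple.
    """
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Catalog": catalog, "Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = response["QueryExecutionId"]
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)
```

With a real client created by `boto3.client("athena")`, `run_query` would be called with the catalog and database that expose your S3 table bucket to Athena and an S3 location where query results can be written.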

The following is the high-level architecture diagram of the infrastructure you’ll create in this Cloud Lab:

Building a data pipeline using Kinesis and an S3 table

Cloud Lab Tasks
1. Introduction
   Getting Started
2. Set Up an S3 Table and Athena
   Create an S3 Table and an S3 Bucket
   Configure an S3 Table
3. Set Up the Kinesis Data Stream
   Create a Kinesis Firehose Delivery Stream
   Generate Event Data with Python
4. Conclusion
   Clean Up
   Wrap Up
Labs Rules Apply
Stay within resource usage requirements.
Do not engage in cryptocurrency mining.
Do not engage in or encourage activity that is illegal.
