In this Cloud Lab, you’ll learn how modern data platforms evolve beyond basic file storage to support transactional, structured, and queryable datasets. You’ll use Amazon S3 paired with Apache Iceberg to enable schema evolution, ACID transactions, and high-performance analytics directly on object storage. With Amazon Athena, you’ll run SQL queries without managing servers or infrastructure, making it easy to explore and analyze your Iceberg tables.
You’ll begin by transforming raw S3 data into Iceberg-backed tables and defining their schemas. You’ll then use Athena SQL to load and query the data, and to apply row-level changes with inserts, updates, deletes, and merges. Finally, you’ll explore Iceberg’s snapshot and time travel capabilities to track historical changes and compare past versions of your datasets. Together, these skills will help you build governed, analytics-ready data lakes on AWS using scalable, open table formats.
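To give a rough sense of the kinds of Athena SQL statements these steps involve, here is a minimal Python sketch that only assembles the statement text. The database, table, and column names, the `_updates` staging table, the S3 location, and the timestamp are all hypothetical placeholders, not the values used in this Cloud Lab, and the helper `iceberg_lab_statements` is an illustrative name rather than part of any AWS library.

```python
def iceberg_lab_statements(database, table, location):
    """Build example Athena SQL strings for an Iceberg table, keyed by step.

    All names and values here are hypothetical placeholders for illustration.
    """
    fq = f"{database}.{table}"
    return {
        # Create an Iceberg table whose data and metadata live under an S3 prefix.
        "create": (
            f"CREATE TABLE {fq} (\n"
            "  order_id bigint,\n"
            "  status string,\n"
            "  amount double\n"
            ")\n"
            f"LOCATION '{location}'\n"
            "TBLPROPERTIES ('table_type' = 'ICEBERG')"
        ),
        # Row-level changes; each one commits a new Iceberg snapshot.
        "insert": f"INSERT INTO {fq} VALUES (1, 'NEW', 19.99)",
        "update": f"UPDATE {fq} SET status = 'SHIPPED' WHERE order_id = 1",
        "delete": f"DELETE FROM {fq} WHERE order_id = 1",
        # Upsert rows from a (hypothetical) staging table into the target.
        "merge": (
            f"MERGE INTO {fq} AS t\n"
            f"USING {database}.{table}_updates AS s\n"
            "ON t.order_id = s.order_id\n"
            "WHEN MATCHED THEN UPDATE SET status = s.status, amount = s.amount\n"
            "WHEN NOT MATCHED THEN INSERT (order_id, status, amount) "
            "VALUES (s.order_id, s.status, s.amount)"
        ),
        # List the table's snapshots through its metadata table.
        "snapshots": f'SELECT * FROM "{database}"."{table}$snapshots"',
        # Time travel: read the table as it was at a past point in time.
        "time_travel": (
            f"SELECT * FROM {fq} "
            "FOR TIMESTAMP AS OF TIMESTAMP '2025-01-01 00:00:00 UTC'"
        ),
    }


if __name__ == "__main__":
    statements = iceberg_lab_statements(
        "lab_db", "orders", "s3://example-bucket/orders/"
    )
    for step, sql in statements.items():
        print(f"-- {step}\n{sql};\n")
```

Each generated statement could be pasted into the Athena query editor once the placeholders are swapped for real resources. The sketch deliberately stops at building strings, so it runs without AWS credentials.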
The following high-level architecture diagram shows the infrastructure you’ll create in this Cloud Lab: