Data Operations and Support I

Explore how to manage complex AWS data pipelines involving Kinesis, Glue, and Redshift with proper orchestration and conditional logic. Understand techniques to improve Athena query performance for partitioned data. Learn troubleshooting methods for Glue ETL jobs under memory pressure. Discover audit logging solutions with CloudTrail and best practices for creating materialized views in Redshift that refresh automatically.

We'll cover the following...

Question 40
Question 41
Question 42
Question 43
Question 44

Question 40

A media company has a complex data pipeline that ingests data from Amazon Kinesis Data Streams, transforms it with AWS Glue, and loads it into Amazon Redshift. The pipeline has multiple dependent stages that must execute in a specific order, with conditional branching based on the success or failure of each stage. The company needs a fully managed orchestration solution that provides visual workflow tracking and supports error handling with retry logic.

Which orchestration service should the data engineer select to meet these requirements with the least operational overhead?

A. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to define a DAG that orchestrates each pipeline stage, configuring task retries and branching operators for conditional logic.

B. Use AWS Step Functions to define a state machine with Choice states for conditional branching, Retry and Catch fields for error handling, and native service integrations for Kinesis, Glue, and Redshift.

C. Use AWS Glue workflows with triggers to orchestrate the Glue crawlers and ETL jobs, configuring conditional triggers based on job completion status for branching logic.

D. Use Amazon EventBridge rules to chain the pipeline stages together, ...
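
To make the terms in option B concrete, here is a minimal sketch of what a Step Functions state machine with Retry, Catch, and a Choice state might look like, written as a Python dictionary that serializes to Amazon States Language JSON. It illustrates that option's vocabulary only and is not the pipeline from the question. The job name, state names, and retry settings are hypothetical, and the Redshift load is left as a placeholder Pass state. The small `dangling_targets` helper is also an illustrative addition that checks every transition points at a defined state.

```python
import json

# All names below (job name, state names, retry values) are hypothetical
# placeholders chosen for illustration; none come from the question.
state_machine = {
    "Comment": "Sketch: Glue transform with retry, catch, and a Choice branch",
    "StartAt": "TransformWithGlue",
    "States": {
        "TransformWithGlue": {
            "Type": "Task",
            # Glue service integration; the .sync suffix waits for the job run to finish.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "example-transform-job"},
            # Retry the task up to twice with exponential backoff.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }],
            # After retries are exhausted, route any error to a failure state.
            "Catch": [{
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",
                "Next": "HandleFailure",
            }],
            "ResultPath": "$.glue",
            "Next": "CheckJobState",
        },
        "CheckJobState": {
            "Type": "Choice",
            # Branch on the job run state returned by the Glue task.
            "Choices": [{
                "Variable": "$.glue.JobRunState",
                "StringEquals": "SUCCEEDED",
                "Next": "LoadIntoRedshift",
            }],
            "Default": "HandleFailure",
        },
        # Placeholder: a real definition would call Redshift here.
        "LoadIntoRedshift": {"Type": "Pass", "End": True},
        "HandleFailure": {
            "Type": "Fail",
            "Error": "PipelineStageFailed",
            "Cause": "A pipeline stage did not succeed",
        },
    },
}


def dangling_targets(definition):
    """Return transition targets (StartAt, Next, Default) that name no defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        # Choice rules and Catch handlers carry their own Next targets.
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return sorted(t for t in targets if t not in states)


if __name__ == "__main__":
    print(json.dumps(state_machine, indent=2))
    print("Dangling targets:", dangling_targets(state_machine))
```

The dictionary form keeps the definition easy to check in code before it is serialized with `json.dumps` and deployed by whatever means a team prefers.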
