Change Data and Events
Explore how Amazon DocumentDB change streams capture data mutations as ordered events, supporting event-driven systems. Understand resume tokens, retention settings, and Lambda integration. Learn methods to handle failover and replica lag for reliable downstream processing.
In the previous lesson, you explored how elastic clusters introduce sharding to Amazon DocumentDB for horizontal scale. This lesson shifts focus from how data is distributed to how data mutations flow out of DocumentDB as events. When a document is inserted, updated, replaced, or deleted, that change can be surfaced as a structured event to any downstream system that needs to react. This capability is known as change data capture (CDC).
Amazon DocumentDB implements CDC through change streams, a native feature that provides a time-ordered sequence of document-level change events scoped to a collection, a database, or an entire cluster. Each event in the stream carries several pieces of information: a resume token that identifies the event's position in the stream, an operation type such as insert or delete, the full or partial document body, and a cluster timestamp. Because ordering is guaranteed within a single collection's stream, consumers can replay events deterministically without having to handle out-of-order delivery.
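To make the event structure concrete, here is a minimal Python sketch of how a consumer might pull those fields out of a change event. It assumes the MongoDB-compatible event shape (`_id` as the resume token, `operationType`, `documentKey`, `fullDocument`, `clusterTime`). The helper names `describe_event` and `tail_collection`, the sample values, and the connection details are hypothetical. `tail_collection` additionally assumes the `pymongo` driver, a reachable cluster, and change streams already enabled on the collection.

```python
from typing import Any, Dict


def describe_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the fields of a change event described in this lesson."""
    return {
        "resume_token": event["_id"],             # position of this event in the stream
        "operation": event["operationType"],      # e.g. insert, update, replace, delete
        "document_key": event.get("documentKey"),
        "document": event.get("fullDocument"),    # missing for deletes
        "cluster_time": event.get("clusterTime"),
    }


def tail_collection(uri: str, db_name: str, coll_name: str) -> None:
    """Print events from a collection-scoped change stream (hypothetical helper)."""
    # Imported lazily so describe_event can be used without the driver installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    # full_document="updateLookup" asks for the current document body on updates.
    with client[db_name][coll_name].watch(full_document="updateLookup") as stream:
        for event in stream:
            print(describe_event(event))


# Hand-written sample event; the values are illustrative, not captured output.
sample_event = {
    "_id": {"_data": "hypothetical-token-1"},
    "operationType": "insert",
    "documentKey": {"_id": 42},
    "fullDocument": {"_id": 42, "status": "new"},
    "clusterTime": 1700000000,
}

summary = describe_event(sample_event)
```

Keeping the parsing in a pure function separates it from the connection logic, so it can be exercised with hand-built events before it is pointed at a real cluster.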
Attention: Do not confuse DocumentDB change streams with DynamoDB Streams, oplog tailing on self-managed MongoDB, or CDC pipelines built on AWS DMS or Kinesis. When the exam or a design scenario specifies DocumentDB as the source and requires ordered, document-level change events, the correct answer is native change streams.
This ordering guarantee is what makes change streams suitable for architectures where a missed or reordered event would corrupt downstream state. A search index, an analytics pipeline, or a microservice workflow can each open an independent cursor on the same stream and process events at its own pace.
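The idea of independent cursors can be sketched without a database. In the Python example below, an in-memory list stands in for one collection's ordered stream, and each consumer keeps its own position, which plays the role a stored resume token would play against a real cluster. The `Consumer` class, the handler functions, and the sample events are all hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class Consumer:
    """A reader with its own position in a shared, ordered event list."""

    def __init__(self, name: str, handler: Callable[[Event], None]) -> None:
        self.name = name
        self.handler = handler
        self.position = 0  # index of the next unread event (stand-in for a resume token)

    def poll(self, stream: List[Event], max_events: int) -> int:
        """Process up to max_events events in order; return how many were handled."""
        batch = stream[self.position:self.position + max_events]
        for event in batch:
            self.handler(event)
        self.position += len(batch)
        return len(batch)


# Ordered events for one collection (illustrative data).
stream: List[Event] = [
    {"seq": 1, "operationType": "insert"},
    {"seq": 2, "operationType": "update"},
    {"seq": 3, "operationType": "update"},
    {"seq": 4, "operationType": "delete"},
]

indexed: List[int] = []
counts: Dict[str, int] = {}


def index_event(event: Event) -> None:
    indexed.append(event["seq"])


def count_event(event: Event) -> None:
    op = event["operationType"]
    counts[op] = counts.get(op, 0) + 1


search = Consumer("search-index", index_event)
analytics = Consumer("analytics", count_event)

search.poll(stream, max_events=10)    # the fast consumer reads everything available
analytics.poll(stream, max_events=2)  # the slow consumer reads only two events for now
```

After these two calls the search consumer has seen all four events while the analytics consumer has seen only the first two. Neither affects the other, and a later `analytics.poll` call continues from the third event in the original order.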
The following diagram illustrates how change events flow from a DocumentDB cluster through a change stream cursor to multiple independent consumers.
With the overall flow established, the next critical concept is how consumers track their position in the stream so they can recover safely after failures.
Resume tokens and consumer recovery
A