DynamoDB Capacity and Scaling
Explore how DynamoDB manages capacity through on-demand and provisioned modes, auto scaling, and adaptive capacity. Understand how to calculate read and write units and prevent hot partitions to optimize cost and performance.
In the previous lesson on secondary indexes, every additional GSI or LSI introduced extra write cost because DynamoDB must propagate changes to each index structure. That write amplification raises an immediate question: how does DynamoDB actually allocate and manage the throughput those writes consume? The answer lies in capacity planning, and getting it wrong leads to throttled requests, inflated bills, or both.
DynamoDB is engineered to deliver single-digit-millisecond latency at virtually any scale, but that promise depends on selecting the right capacity configuration for your workload. Two capacity modes govern how throughput is allocated. On-demand mode serves as the default and automatically accommodates traffic fluctuations without any upfront throughput settings. Provisioned mode requires the operator to declare explicit read and write capacity, giving tighter cost control at the expense of manual planning.
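As a rough illustration of how the two modes differ at table-creation time, the sketch below builds the keyword arguments for a CreateTable request in each mode. The helper `table_params`, the table name `Orders`, and the key attribute `pk` are hypothetical names chosen for this example; only the parameter keys (`BillingMode`, `ProvisionedThroughput`, and so on) come from the DynamoDB API.

```python
def table_params(name, billing_mode, read_units=None, write_units=None):
    """Build CreateTable keyword arguments for one of the two capacity modes.

    Hypothetical helper: the table layout (a single string partition key
    named "pk") is an assumption made for illustration.
    """
    params = {
        "TableName": name,
        "AttributeDefinitions": [{"AttributeName": "pk", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
        "BillingMode": billing_mode,
    }
    if billing_mode == "PROVISIONED":
        # Provisioned mode: the operator must declare throughput explicitly.
        if read_units is None or write_units is None:
            raise ValueError("provisioned mode needs read and write capacity")
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    elif billing_mode != "PAY_PER_REQUEST":
        raise ValueError(f"unknown billing mode: {billing_mode}")
    # On-demand mode (PAY_PER_REQUEST): no throughput settings at all.
    return params


on_demand = table_params("Orders", "PAY_PER_REQUEST")
provisioned = table_params("Orders", "PROVISIONED", read_units=100, write_units=50)
```

With boto3 installed and credentials configured, either dictionary could be passed along as `boto3.client("dynamodb").create_table(**params)`. The capacity numbers here are placeholders, not recommendations.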
This lesson explains on-demand and provisioned capacity modes, then examines how auto scaling works for provisioned capacity, the partition-level limits and access patterns that can create hot partitions, and the read and write request unit calculations that drive capacity planning, cost estimates, and performance expectations.
On-demand vs. provisioned mode
On-demand mode bills per request rather than per hour of provisioned capacity. When a traffic spike arrives, DynamoDB allocates additional throughput behind the scenes, so the application never needs to predict or preset capacity values. This makes on-demand the natural fit for variable, unpredictable, or brand-new workloads where traffic patterns have not yet been established. DynamoDB tracks the previous peak traffic level for an on-demand table and can instantly accommodate up to double that peak. Throttling can occur if traffic climbs past double the previous peak within 30 minutes, so steep ramps should be spread over at least that long; capacity continues to grow as sustained demand rises.
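One rough way to reason about the doubling rule above is to treat twice the previous peak as the ceiling for any 30-minute window and to approach a larger target in steps. The sketch below is a simplified planning model, not an AWS API: `ramp_plan` and its arguments are hypothetical names, and the real scaling behavior is managed internally by DynamoDB.

```python
def ramp_plan(previous_peak, target):
    """Return the successive peak levels needed to reach `target`.

    Simplified model (an assumption for illustration): each step may at
    most double the prior peak, and steps are assumed to be spaced about
    30 minutes apart.
    """
    if previous_peak <= 0:
        raise ValueError("previous_peak must be positive")
    plan = []
    peak = previous_peak
    while peak < target:
        # Never plan for more than double the last established peak.
        peak = min(2 * peak, target)
        plan.append(peak)
    return plan


# A table that previously peaked at 10,000 requests per second,
# ramping toward 50,000 requests per second.
steps = ramp_plan(10_000, 50_000)
print(steps)  # [20000, 40000, 50000]
```

Under this model, a target at or below double the previous peak needs a single step, and a target below the previous peak needs none.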
Provisioned mode takes the opposite approach. The operator sets explicit