Adapting to Traffic Patterns in DynamoDB

Explore techniques Amazon DynamoDB uses to manage non-uniform traffic across partitions by adjusting throughput dynamically. Understand concepts like bursting, adaptive capacity, workload isolation, and global admission control to optimize performance and reduce throttling in distributed databases.

We'll cover the following...

Key considerations
Bursting
Adaptive capacity
Global admission control
Splitting for consumption
On-demand provisioning
Quiz
What’s next?

We will allow users to set a provisioned (allocated) throughput for their tables. The initial partitioning of a table divides the table's provisioned throughput equally across all partitions, especially if the range of keys contained in each partition is the same. However, applications might access some keys more frequently than others. As a result, the throughput dedicated to less frequently accessed partitions goes underutilized, while more frequently accessed partitions become overloaded and may become unavailable. To solve this problem, our partitioning must adapt to customer traffic patterns.

Note: We can think of throughput as the ability of our service to complete fixed-size requests (read or write) per second.
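To make the imbalance described above concrete, here is a minimal sketch that splits a table's throughput equally and compares each partition's share with its demand. The function name `simulate_uniform_split` and the workload numbers are hypothetical, chosen only for illustration; this is not how DynamoDB itself accounts for traffic.

```python
def simulate_uniform_split(table_throughput, demand_per_partition):
    """Split throughput equally, then compare each partition's share to its demand."""
    share = table_throughput / len(demand_per_partition)
    report = []
    for demand in demand_per_partition:
        served = min(demand, share)
        report.append({
            "demand": demand,
            "served": served,
            "rejected": demand - served,  # requests above the partition's share
            "unused": share - served,     # share the partition could not use
        })
    return report


# Hypothetical workload: 1,000 requests/s provisioned, four partitions,
# one of which receives most of the traffic.
report = simulate_uniform_split(1000, [700, 100, 100, 100])
total_rejected = sum(r["rejected"] for r in report)
total_unused = sum(r["unused"] for r in report)
print(total_rejected, total_unused)
```

In this made-up workload, the total demand equals the table's provisioned throughput, yet the hot partition still rejects requests while the other three leave the same amount of capacity idle.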

Key considerations

Let's understand the problem in detail. We will start by modeling our problem.

We define a read capacity unit (RCU) as the ability of the system to complete one read request per second for an item of some fixed size, say x KB. We define a write capacity unit (WCU) as the ability of the system to complete one write request per second for an item of the same size. We will use RCUs and WCUs when talking about throughput. For example, if a table has a provisioned throughput of 10,000 RCUs and 5,000 WCUs, then for items of size x KB, it can read at most 10,000 items and write at most 5,000 items per second.

Continuing this example, let's consider dividing our table into ten partitions with equal key ranges (the disjoint ranges of keys that identify the items hosted by a partition). Our design assumes that all keys have an equal chance of being accessed. As a result, it divides throughput equally among all partitions. So after dividing a table into ten partitions, a single partition will have a throughput of 1,000 RCUs and 500 WCUs. However, what happens when we wish to add or remove partitions from the table, or if the table's provisioned throughput changes? One solution is to continue with our assumption that keys are accessed uniformly. Therefore, we will allocate every partition's throughput based on the table's provisioned throughput.

When adding or removing partitions, we will distribute the table's provisioned throughput equally among the new number of partitions. So if we added ten more partitions to the table from our example, the total number of partitions would be 20, and each partition would have a throughput of 500 RCUs and 250 WCUs. Conversely, if we removed five partitions from the original ten, each of the remaining five would have a throughput of 2,000 RCUs and 1,000 WCUs.
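The redistribution rule above is plain division, which the following minimal sketch reproduces for the three partition counts in the example. The helper name `per_partition_throughput` is hypothetical and assumes uniform key access, as the text does.

```python
def per_partition_throughput(table_rcus, table_wcus, num_partitions):
    """Divide a table's provisioned RCUs and WCUs equally among its partitions."""
    if num_partitions <= 0:
        raise ValueError("a table needs at least one partition")
    return table_rcus / num_partitions, table_wcus / num_partitions


# Table from the example: 10,000 RCUs and 5,000 WCUs.
for partitions in (10, 20, 5):
    rcus, wcus = per_partition_throughput(10_000, 5_000, partitions)
    print(f"{partitions} partitions: {rcus:.0f} RCUs and {wcus:.0f} WCUs each")
```

Note that the per-partition share depends only on the table's totals and the partition count, not on how often each partition is actually accessed.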
If the table's provisioned throughput changes, the new throughput would be equally ...
