DynamoDB: Secondary Indexes and Querying

Explore how to efficiently retrieve and manage data in DynamoDB through queries, scans, filters, and effective indexing strategies.

We'll cover the following...

Query vs. scan
- Parallel scans
- Filters and projections
Indexing in DynamoDB
- Local Secondary Indexes (LSIs)
- Global Secondary Indexes (GSIs)
Index maintenance and best practices
Conclusion

Query vs. scan

DynamoDB provides two primary operations to retrieve multiple items:

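Query operation: A query is used when we know the partition key and want to retrieve all (or a subset of) items that share that key. Queries are efficient because DynamoDB uses the partition key to go directly to the partition that holds the data. If the table has a sort key, a sort key condition can limit the range of items read. An optional filter expression can narrow the results further, but it is applied after the items are read, so it does not reduce the read capacity consumed.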
Scan operation: A scan reads every item in the table, regardless of key. This makes scans significantly more resource-intensive, especially as the table size grows. Scans are often used during data migrations, audits, or administrative tasks, not during routine application queries.
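As a rough illustration of the difference, the sketch below expresses the same lookup as a query and as a scan through the boto3 Table resource interface. The Orders table, its customer_id partition key, and the helper names are hypothetical and not part of this lesson. The table argument is assumed to be an object such as boto3.resource("dynamodb").Table("Orders").

```python
# Hypothetical example: an "Orders" table whose partition key is "customer_id".
# `table` is assumed to behave like a boto3 Table resource, for example:
#   import boto3
#   table = boto3.resource("dynamodb").Table("Orders")


def _collect_pages(operation, **kwargs):
    """Call a paginated read operation until no LastEvaluatedKey is returned."""
    items = []
    while True:
        page = operation(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Resume the next page where the previous one stopped.
        kwargs["ExclusiveStartKey"] = last_key


def query_orders(table, customer_id):
    """Query: reads only the items stored under one partition key."""
    return _collect_pages(
        table.query,
        KeyConditionExpression="customer_id = :cid",
        ExpressionAttributeValues={":cid": customer_id},
    )


def scan_orders(table, customer_id):
    """Scan: reads every item in the table, then discards non-matching ones."""
    return _collect_pages(
        table.scan,
        FilterExpression="customer_id = :cid",
        ExpressionAttributeValues={":cid": customer_id},
    )
```

Both helpers return the same items for a given customer. The difference is in how much data DynamoDB has to read to produce them: the query touches one partition key, while the scan walks the whole table page by page.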

Note: Whenever possible, we should prefer a query over a scan, especially in latency-sensitive applications. A query reads only the items stored under one partition key, while a scan reads the entire table, so it consumes more read capacity and takes longer as the table grows.
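To put rough numbers on the capacity difference, the sketch below estimates read capacity units (RCUs) from the amount of data read. It relies on DynamoDB's accounting for query and scan requests: the sizes of all items read are summed and rounded up to the next 4 KB, and an eventually consistent read costs half as much as a strongly consistent one. The item size and item counts are made-up figures, and the scan estimate ignores per-page rounding, so treat the result as an approximation.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one RCU covers up to 4 KB read with strong consistency


def estimate_rcus(bytes_read, strongly_consistent=False):
    """Estimate RCUs for a single read request from the total bytes read."""
    blocks = math.ceil(bytes_read / RCU_BLOCK_BYTES)
    # Eventually consistent reads are charged at half the rate.
    return float(blocks) if strongly_consistent else blocks / 2


# Hypothetical workload: 500-byte items, 20 of them under one partition key,
# in a table that holds 1,000,000 items overall.
ITEM_BYTES = 500
query_cost = estimate_rcus(20 * ITEM_BYTES)
scan_cost = estimate_rcus(1_000_000 * ITEM_BYTES)

print(f"query: {query_cost} RCUs, scan: {scan_cost} RCUs")
```

Under these assumptions, the query costs 1.5 RCUs while the full scan costs roughly 61,000, even though both return the same 20 items.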

The illustration below depicts the impact of a scan on the provisioned throughput of a DynamoDB table. Notice that as the size of requests becomes nonuniform, throttling (shown by the rectangle bar) becomes more frequent.