How to Partition Secondary Indexes

Learn about different partitioning strategies for secondary indexes in the database.

Introduction

A database index is an additional data structure that lets us locate the required data quickly without scanning the entire dataset. A secondary index is a mechanism for efficiently accessing records through attributes other than the primary key. While a lookup by primary key returns at most one record, a query on a secondary index can return multiple records.
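To make the distinction concrete, here is a minimal in-memory sketch. The record values, the `users` and `fname_index` names, and both helper functions are hypothetical and only serve as an illustration; a real database would keep these structures on disk.

```python
from collections import defaultdict

# Primary dataset: primary key (user ID) -> record. Sample values are made up.
users = {
    1: {"fname": "Alice", "age": 30},
    2: {"fname": "Bob", "age": 25},
    3: {"fname": "Alice", "age": 41},
}

# Secondary index: attribute value -> set of primary keys that carry it.
fname_index = defaultdict(set)
for user_id, record in users.items():
    fname_index[record["fname"]].add(user_id)


def get_by_primary_key(user_id):
    """Return the single record stored under this key, or None."""
    return users.get(user_id)


def get_by_fname(fname):
    """Return every record whose first name matches (possibly several)."""
    matching_ids = sorted(fname_index.get(fname, ()))
    return [users[user_id] for user_id in matching_ids]
```

Here `get_by_primary_key(2)` yields one record, while `get_by_fname("Alice")` yields two, because two users share that first name.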

Partitioning a database with secondary indexes is inherently complex, as the partitioning strategy applies both to the primary dataset and the secondary index.

Broadly speaking, there are two strategies to partition the secondary index:

- Partition by document
- Partition by term

Partition by document

In the partition by document strategy, every partition acts as an independent database because it hosts both a portion of the primary dataset and the secondary indexes for that portion.

The following example illustrates how each partition is self-contained, hosting the primary dataset and its secondary indexes. It uses a user entity with the following properties:
- UserID is the primary lookup key.
- Fname, Lname, and Age are the attributes of the entity.
- Additionally, there is a secondary index on Fname, so the client can search by Fname to get all the associated primary records.
The database has two partitions, namely Partition 1 and Partition 2.
Each partition hosts primary datasets and their secondary indexes.
Partition 1 includes:
- Records for User ID: ...
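The two-partition layout described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the lesson's implementation: the `Partition` class, the modulo routing in `partition_for`, and the sample users are all hypothetical. The point it shows is that each partition indexes only its own records, so a search by Fname has to ask every partition and merge the answers.

```python
from collections import defaultdict


class Partition:
    """One self-contained partition: primary data plus its local index."""

    def __init__(self):
        self.primary = {}                    # UserID -> record
        self.fname_index = defaultdict(set)  # Fname -> UserIDs in THIS partition

    def put(self, user_id, record):
        old = self.primary.get(user_id)
        if old is not None:
            # Drop the stale index entry before overwriting the record.
            self.fname_index[old["Fname"]].discard(user_id)
        self.primary[user_id] = record
        self.fname_index[record["Fname"]].add(user_id)

    def find_by_fname(self, fname):
        ids = sorted(self.fname_index.get(fname, ()))
        return [self.primary[user_id] for user_id in ids]


# Two partitions, mirroring Partition 1 and Partition 2 in the text.
partitions = [Partition(), Partition()]


def partition_for(user_id):
    # Assumption: route by UserID modulo the partition count.
    return partitions[user_id % len(partitions)]


def put_user(user_id, fname, lname, age):
    record = {"UserID": user_id, "Fname": fname, "Lname": lname, "Age": age}
    partition_for(user_id).put(user_id, record)


def find_users_by_fname(fname):
    # Each local index covers only its own partition, so query all of them.
    results = []
    for partition in partitions:
        results.extend(partition.find_by_fname(fname))
    return results


# Hypothetical sample data.
put_user(1, "Alice", "Smith", 30)
put_user(2, "Alice", "Jones", 41)
put_user(3, "Bob", "Brown", 25)
```

A write touches exactly one partition, which keeps updates simple. The cost moves to reads on the secondary index, which fan out to all partitions.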
