What is database sharding?


Database sharding is the process of partitioning the data in a database or search engine into smaller, distinct chunks called shards.

Each shard could be a table, a Postgres schema, or a different physical database held on a separate database server instance.

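Data can be split in two ways. In horizontal sharding, every shard has the same schema but holds a different subset of the rows, so a given row appears in only one shard. In vertical sharding, different tables or columns are placed on different shards. Some data, such as a key or a small lookup table, may still be present in all shards. The following figure illustrates vertical sharding and horizontal sharding.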
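To make the two kinds of split concrete, here is a minimal sketch in Python. The `users` rows, the column names, and both helper functions are hypothetical and exist only for illustration; a real database performs this partitioning internally.

```python
# Hypothetical sample data: each dict is one row of a "users" table.
users = [
    {"user_id": 1, "name": "Ada", "email": "ada@example.com", "bio": "Engineer"},
    {"user_id": 2, "name": "Linus", "email": "linus@example.com", "bio": "Hacker"},
    {"user_id": 3, "name": "Grace", "email": "grace@example.com", "bio": "Admiral"},
    {"user_id": 4, "name": "Alan", "email": "alan@example.com", "bio": "Logician"},
]


def horizontal_split(rows, num_shards):
    """Same columns in every shard; each row goes to exactly one shard."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        # Assumed placement rule: user_id modulo the number of shards.
        shards[row["user_id"] % num_shards].append(row)
    return shards


def vertical_split(rows, column_groups):
    """Each shard gets a subset of columns; the key column is kept in all of them."""
    return [
        [{col: row[col] for col in ("user_id", *cols)} for row in rows]
        for cols in column_groups
    ]


h_shards = horizontal_split(users, 2)
v_shards = vertical_split(users, [("name", "email"), ("bio",)])
```

In `h_shards`, each shard holds complete rows for half of the users. In `v_shards`, each shard holds all users but only some of their columns, with `user_id` repeated so the pieces can be joined back together.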

To shard your data, you need to decide on a key, called a shard key (or sharding key), to partition your data on. In a document database such as MongoDB, for example, the shard key is either a single indexed field or an indexed compound of fields that exists in every document in the collection.

There is no general rule for selecting a shard key; the right choice depends on your application and how it queries its data. For instance, you may choose userID as the shard key in a social media app, so that all of a user's data lives on one shard.
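As a rough illustration of how a shard key can map to a shard, the sketch below hashes a userID and takes the result modulo the shard count. The function name `shard_for`, the constant `NUM_SHARDS`, and the choice of hash-based placement are all assumptions made for this example, not something a particular database prescribes.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key value to a shard index using a stable hash."""
    # sha256 gives the same result in every process, unlike Python's
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The same key always lands on the same shard.
first = shard_for("user-42")
second = shard_for("user-42")
```

Hashing tends to spread keys evenly across shards. Other schemes, such as assigning ranges of key values to each shard, are also common; which one fits depends on the application.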


<img src="/api/edpresso/shot/4827308338708480/image/5741031244955648" alt="Markdown Monster icon" width="230" />

Sharding lets each query touch less data. When the application receives a request, it uses the shard key to work out which shard holds the relevant data and routes the request there, so it searches only that shard rather than the whole database.
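The routing idea can be sketched with a toy in-memory store. The `ShardedStore` class, its modulo routing rule, and the record layout are hypothetical; the point is only that a lookup scans one shard instead of every record.

```python
class ShardedStore:
    """Toy in-memory store: each shard is a list of records."""

    def __init__(self, num_shards):
        self.shards = [[] for _ in range(num_shards)]

    def _shard_index(self, user_id):
        # Assumed routing rule: integer user ID modulo the shard count.
        return user_id % len(self.shards)

    def put(self, record):
        self.shards[self._shard_index(record["user_id"])].append(record)

    def get(self, user_id):
        """Return (record, rows_scanned), searching only the owning shard."""
        shard = self.shards[self._shard_index(user_id)]
        for scanned, record in enumerate(shard, start=1):
            if record["user_id"] == user_id:
                return record, scanned
        return None, len(shard)


store = ShardedStore(num_shards=3)
for uid in range(9):
    store.put({"user_id": uid, "name": f"user{uid}"})

record, rows_scanned = store.get(7)
```

With nine records spread over three shards, a lookup scans at most the three rows in one shard rather than all nine.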

This can improve the performance of your application, and it lets the database scale horizontally: as data and traffic grow, you can add more shards instead of relying on a single, ever-larger server.

Copyright ©2024 Educative, Inc. All rights reserved