What is database sharding?

Database sharding is the process of partitioning the data in a database or search engine into smaller, distinct chunks called shards.

Each shard could be a table, a Postgres schema, or a different physical database held on a separate database server instance.

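Data can be split in two ways. In horizontal sharding, the rows of a table are distributed across shards, so every shard has the same schema but holds a different subset of the rows. In vertical sharding, different columns or tables are placed on different shards. Some data may remain present in all shards, while other data appears in only a single shard. The following figure illustrates vertical sharding and horizontal sharding.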
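To make the two styles concrete, here is a minimal Python sketch that splits a small, made-up `users` table both ways: by row (horizontal) and by column (vertical). The table contents, the function names, and the modulo rule used to place rows are all assumptions chosen for illustration, not a prescribed method.

```python
# Hypothetical "users" table, used only for illustration.
users = [
    {"user_id": 1, "name": "Ada", "email": "ada@example.com", "bio": "Mathematician"},
    {"user_id": 2, "name": "Linus", "email": "linus@example.com", "bio": "Kernel hacker"},
    {"user_id": 3, "name": "Grace", "email": "grace@example.com", "bio": "Compiler pioneer"},
    {"user_id": 4, "name": "Alan", "email": "alan@example.com", "bio": "Logician"},
]


def horizontal_split(rows, num_shards):
    """Split by row: each shard keeps every column but only some rows."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        # Assumed placement rule: user_id modulo the number of shards.
        shards[row["user_id"] % num_shards].append(row)
    return shards


def vertical_split(rows, column_groups, key="user_id"):
    """Split by column: each shard keeps every row but only some columns.

    The key column is copied into every shard so rows can be joined back.
    """
    return [
        [{key: row[key], **{col: row[col] for col in columns}} for row in rows]
        for columns in column_groups
    ]


horizontal = horizontal_split(users, 2)
vertical = vertical_split(users, [["name", "email"], ["bio"]])

print([row["user_id"] for row in horizontal[0]])  # rows placed on shard 0
print(vertical[1][0])                             # first row of the "bio" shard
```

In the horizontal case each shard is a smaller copy of the same table; in the vertical case each shard holds a narrower table that shares only the key.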

To shard your data, you need to choose a key, called the shard key (or sharding key), to partition the data on. In a document database such as MongoDB, for example, the shard key is typically an indexed field, or a compound of indexed fields, that exists in every document in the collection.

There is no general rule for selecting a shard key; the right choice depends on your application. For instance, a social media app might use userID as the shard key.
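As a rough illustration of how a shard key such as userID can be turned into a shard number, the sketch below hashes the key and takes the result modulo the shard count. The function name `shard_for`, the shard count of 4, and the choice of SHA-256 are assumptions made for this example; real systems may use range-based, directory-based, or consistent-hashing schemes instead.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for this sketch


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index in the range [0, num_shards).

    A cryptographic hash is used instead of Python's built-in hash()
    because hash() on strings is randomized per process, so it would
    not give stable routing across restarts.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce.
    return int.from_bytes(digest[:8], "big") % num_shards


for uid in ["alice", "bob", "carol"]:
    print(uid, "-> shard", shard_for(uid))
```

The same userID always maps to the same shard, which is what lets the application find a user's data later. One trade-off of plain modulo hashing is that changing the number of shards remaps most keys.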

<img src="/api/edpresso/shot/4827308338708480/image/5741031244955648" alt="Markdown Monster icon" width="230" />

Sharding reduces the amount of data each query has to touch. When the application receives a request, it uses the shard key to route the request to the right shard, so it searches only that shard rather than going through the whole database.
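The routing idea can be sketched with a toy in-memory store. The `ShardedStore` class, its method names, and the modulo placement rule are hypothetical and exist only for this example; the lookup deliberately uses a plain linear scan (no index) so that the amount of data examined is easy to count.

```python
class ShardedStore:
    """Toy in-memory store that places rows on shards by user_id."""

    def __init__(self, num_shards):
        self.shards = [[] for _ in range(num_shards)]

    def _shard_index(self, user_id):
        # Assumed routing rule: user_id modulo the number of shards.
        return user_id % len(self.shards)

    def insert(self, row):
        self.shards[self._shard_index(row["user_id"])].append(row)

    def find_user(self, user_id):
        """Return (row, rows_examined), scanning only the owning shard."""
        examined = 0
        for row in self.shards[self._shard_index(user_id)]:
            examined += 1
            if row["user_id"] == user_id:
                return row, examined
        return None, examined


store = ShardedStore(num_shards=4)
for uid in range(1000):
    store.insert({"user_id": uid, "name": f"user{uid}"})

row, examined = store.find_user(999)
print(row["name"], "found after examining", examined, "rows")
```

With 1000 rows spread over 4 shards, the lookup for user 999 examines 250 rows, whereas the same linear scan over a single unsharded list would examine all 1000.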

This improves the performance of your application and lets the database scale out across more servers as your data grows.
