When data stored by a system on a single node becomes too large, it is broken-up into parts and each part is stored on a separate node. In this context, “large” is subjective, but is generally taken to mean that the data size has grown to an extent where storing additional data on the system isn’t possible or executing operations on the data e.g. querying, indexing, etc fail to meet SLA requirements.

Generally each portion or part of data is referred to as a partition, however, different systems have different names for a partition. For instance:

  1. Cassandra and Riak call a partition a vnode

  2. MongoDB, SolrCloud, and ElasticSearch call a partition a shard

  3. HBase calls a partition a region

  4. Bigtable calls a partition a tablet

  5. Couchbase calls a partition a vbucket

Get hands-on with 1200+ tech skills courses.