Ensuring Scalability and Replication

Learn how consistent hashing enables scalability, and how the partitioned data is replicated.

Adding scalability

Let’s start with one of the core design requirements: scalability. We store key-value data across storage nodes, and as demand changes, we might need to add or remove nodes. This means we need to partition the data over the nodes in the system so that the load is distributed evenly across all of them.

For example, let’s consider that we have four nodes and we want 25% of the requests to go to each node to balance the load equally. The traditional way to solve this is with the modulus operator. When a request comes in, we assign it a request ID, compute the hash of that ID, and take the remainder after dividing by the number of available nodes. The remainder is the node number, and we send the request to that node for processing, as the sketch below shows.
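The following minimal Python sketch illustrates this modulo-based routing. The node names, the choice of MD5 as the hash function, and the sample request IDs are assumptions made for illustration only.

```python
import hashlib

# Hypothetical list of four storage nodes (names are placeholders).
NODES = ["node-0", "node-1", "node-2", "node-3"]

def node_for_request(request_id: str) -> str:
    """Pick a node by hashing the request ID and taking it modulo the node count."""
    # Hash the request ID to get a stable integer value.
    digest = hashlib.md5(request_id.encode()).hexdigest()
    # The remainder selects which of the available nodes handles the request.
    node_index = int(digest, 16) % len(NODES)
    return NODES[node_index]

# Route a few sample requests across the four nodes.
for rid in ["req-101", "req-102", "req-103", "req-104"]:
    print(rid, "->", node_for_request(rid))
```

With a uniform hash function, each of the four nodes receives roughly 25% of the requests, which is exactly the even load balancing described above.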

