Block Replication
This lesson discusses how HDFS replicates data blocks across the cluster for redundancy.
Before discussing what makes HDFS fault-tolerant and highly available, let's first understand the term fault tolerance. Wikipedia defines fault tolerance as the property that enables a system to continue operating properly in the event of the failure of some of its components. One classic way to achieve it is full hardware redundancy: two (or more) systems operate in tandem, mirroring identical applications and executing instructions in lockstep. When a hardware failure occurs in the primary system, the secondary system, running an identical application simultaneously, takes over with no loss of service and zero downtime.
The secret sauce behind HDFS's ability to withstand corrupted or lost data is the replication of data blocks. If a file consists of 4 HDFS blocks and the replication factor is 3, then each block has 3 copies, for a total of 12 blocks spread across physically separate machines in the cluster. Replication ensures that if one block becomes corrupted or a machine fails, the read request can still be fulfilled by another available copy of the block. This setup also allows for self-healing: a block lost to corruption or machine failure can be re-replicated to other live machines by copying a healthy replica. The replication factor is controlled by the property dfs.replication. Some applications set a higher replication factor for the blocks of a popular file to better distribute the read load across the cluster.
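As a concrete illustration, the sketch below uses Hadoop's Java API to set the default replication factor and to raise it for a single heavily read file. The file path is hypothetical, and the code assumes a reachable HDFS cluster configured in the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: configuring the default replication factor and raising it
// for one "popular" file. The path below is hypothetical.
public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.replication", "3");   // default replication for newly written files

        FileSystem fs = FileSystem.get(conf);

        // Raise the replication factor of an existing, frequently read file to 5
        // so read requests can be served by more DataNodes.
        fs.setReplication(new Path("/data/popular-dataset.csv"), (short) 5);

        fs.close();
    }
}
```

The same adjustment can be made from the command line with hdfs dfs -setrep 5 /data/popular-dataset.csv.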
Selection of nodes for replication
When it comes to the placement of replicated data blocks in a cluster, there is a spectrum of possibilities. On one end of the spectrum, all three replicas can be placed on a single node; on the other, each can be placed in a separate data center. The former minimizes write bandwidth because all replicas are written to the same node, but provides no real redundancy: if that single node goes down, all replicas are lost. The latter maximizes redundancy at the cost of write bandwidth.
In practice, Hadoop places the first replica on the same node as the client. If the client is running outside the cluster, a node is chosen at random. The second replica is placed on a randomly chosen rack different from the first replica's. The third replica is placed on a randomly chosen node on the same rack as the second replica. Any further replicas are placed on randomly selected nodes, while avoiding placing too many replicas on the same rack. A simplified sketch of this placement order appears below.
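The following is a minimal, self-contained sketch of that placement order, illustrative only: the Node record and cluster list are hypothetical simplifications, and Hadoop's actual logic (in BlockPlacementPolicyDefault) additionally weighs disk usage, node load, and network topology.

```java
import java.util.*;

// Simplified sketch of HDFS's default replica placement order.
// Assumes the cluster has enough nodes and at least two racks.
public class ReplicaPlacementSketch {

    record Node(String name, String rack) {}   // hypothetical node/rack model

    static List<Node> chooseTargets(List<Node> cluster, Node client,
                                    int replication, Random rnd) {
        List<Node> targets = new ArrayList<>();

        // 1st replica: the writer's own node if it is in the cluster,
        // otherwise a randomly chosen node.
        Node first = cluster.contains(client)
                ? client
                : cluster.get(rnd.nextInt(cluster.size()));
        targets.add(first);
        if (targets.size() == replication) return targets;

        // 2nd replica: a random node on a rack different from the first's.
        List<Node> offRack = cluster.stream()
                .filter(n -> !n.rack().equals(first.rack()))
                .toList();
        Node second = offRack.get(rnd.nextInt(offRack.size()));
        targets.add(second);
        if (targets.size() == replication) return targets;

        // 3rd replica: a different random node on the same rack as the second.
        List<Node> sameRack = cluster.stream()
                .filter(n -> n.rack().equals(second.rack()) && !targets.contains(n))
                .toList();
        targets.add(sameRack.get(rnd.nextInt(sameRack.size())));

        // Further replicas: random nodes not already chosen.
        while (targets.size() < replication) {
            Node n = cluster.get(rnd.nextInt(cluster.size()));
            if (!targets.contains(n)) targets.add(n);
        }
        return targets;
    }
}
```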
A pictorial depiction of how a block gets replicated on a two-rack cluster with eight machines is shown below: