What Is Fault Tolerance? How DataNode Failures Are Handled in HDFS

 
Do you know why DataNodes in an HDFS cluster fail? Do you know how those failures are managed?
By Karthik Kondpak


Let's understand this in depth:
=====================

✍️DataNodes run on inexpensive, off-the-shelf machines known as "commodity hardware".

✍️Because commodity hardware is cheap rather than highly reliable, DataNodes fail, go down, or run slowly quite frequently.

🔥Why are they built from cheap hardware?
================================
✍️Using cheap hardware keeps the overall cost of the cluster low.

✍️DataNodes hold the actual data in the form of blocks.

✍️Consider a 4-node cluster with Block1 stored on DataNode1, Block2 on DataNode2, and so on, as shown below.

DataNode1 ------> Block1

DataNode2 ------> Block2

DataNode3 ------> Block3

DataNode4 ------> Block4

✍️Suppose DataNode1 goes down or fails. With only one copy of each block, the data in Block1 is lost as well.

✍️This is a critical issue, because a single machine failure means losing data.
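
The 4-node example can be played out in a few lines of Python. This is only a toy model of the scenario described here, not how HDFS is implemented; the `placement` dictionary and the `lost_blocks` helper are made up for illustration.

```python
# Toy model of the 4-node example: one copy of each block, no replication.
# The names below are illustrative assumptions, not HDFS APIs.
placement = {
    "DataNode1": {"Block1"},
    "DataNode2": {"Block2"},
    "DataNode3": {"Block3"},
    "DataNode4": {"Block4"},
}


def lost_blocks(placement, failed_node):
    """Return the blocks that have no copy left once failed_node is gone."""
    surviving = set()
    for node, blocks in placement.items():
        if node != failed_node:
            surviving |= blocks
    return placement[failed_node] - surviving


print(lost_blocks(placement, "DataNode1"))
```

With a single copy per block, whatever was on the failed node is gone for good.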


How DataNode failures are managed in an HDFS cluster:
===============================
✍️The replication factor comes into the picture to solve this issue.

🔥What is the replication factor?
=======================

✍️HDFS keeps three copies (replicas) of every data block by default, each stored on a different machine. The number of copies kept is called the replication factor.

✍️The idea is simple: keep three copies of each block on different machines, so that if one DataNode goes down or fails, the other copies act as a backup.

✍️The default replication factor in Hadoop is 3.

✍️A Hadoop administrator can change the default replication factor (the dfs.replication property in hdfs-site.xml).

✍️So when a DataNode goes down or fails, the replicas on other DataNodes keep the data available.
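
To see why three copies help, here is a small Python sketch of the same 4-node cluster with a replication factor of 3. Treat it as an illustration under assumptions: the round-robin placement and the helper names `place_replicas` and `surviving_replicas` are invented for this example, and real HDFS chooses replica locations with its own placement policy.

```python
# Toy model: spread 3 replicas of each block over different nodes.
# Round-robin placement is an assumption for illustration only.
def place_replicas(blocks, nodes, replication_factor=3):
    """Assign each block to replication_factor distinct nodes."""
    if replication_factor > len(nodes):
        raise ValueError("not enough nodes for the requested replication factor")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [
            nodes[(i + k) % len(nodes)] for k in range(replication_factor)
        ]
    return placement


def surviving_replicas(placement, failed_node):
    """Return, per block, the nodes that still hold a copy after a failure."""
    return {
        block: [node for node in holders if node != failed_node]
        for block, holders in placement.items()
    }


nodes = ["DataNode1", "DataNode2", "DataNode3", "DataNode4"]
blocks = ["Block1", "Block2", "Block3", "Block4"]

placement = place_replicas(blocks, nodes)
after_failure = surviving_replicas(placement, "DataNode1")

for block, holders in after_failure.items():
    print(block, "->", holders)
```

Even with DataNode1 gone, every block still has at least two live copies, so no data is lost.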

🚀🔥How HDFS responds when the number of replicas drops:
==============================

✍️HDFS handles this with a strategy called fault tolerance.

What is fault tolerance in HDFS?
===========================
✍️Fault tolerance means that when the number of live copies of a block falls below the replication factor (3 by default), the NameNode arranges for one more copy of every block that was on the failed DataNode to be created on another DataNode.

✍️This restores the replication factor to 3.

✍️The NameNode does not hold the block data itself. It instructs DataNodes that still have a replica to copy it to another DataNode, and it records the new replica locations in the metadata it keeps in memory.
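
The re-replication step can be sketched in Python as well. This is a simplified model, assuming a plain dictionary stands in for the NameNode's in-memory metadata; `block_map`, `live_nodes`, and `re_replicate` are hypothetical names, and the rule "pick the first live node that lacks the block" is a stand-in for the real target selection.

```python
# Toy model of re-replication after a DataNode failure.
# block_map plays the role of the NameNode's in-memory metadata (assumption).
def re_replicate(block_map, live_nodes, target=3):
    """Drop dead holders and top up every under-replicated block."""
    for block, holders in block_map.items():
        # Forget replicas that lived on nodes which are no longer alive.
        holders[:] = [node for node in holders if node in live_nodes]
        # Live nodes that do not yet hold this block are candidate targets.
        candidates = [node for node in live_nodes if node not in holders]
        while len(holders) < target and candidates:
            holders.append(candidates.pop(0))  # record the new replica location
    return block_map


block_map = {
    "Block1": ["DataNode1", "DataNode2", "DataNode3"],
    "Block2": ["DataNode2", "DataNode3", "DataNode4"],
    "Block3": ["DataNode3", "DataNode4", "DataNode1"],
    "Block4": ["DataNode4", "DataNode1", "DataNode2"],
}

# DataNode1 has failed, so only three nodes are still alive.
live_nodes = ["DataNode2", "DataNode3", "DataNode4"]

re_replicate(block_map, live_nodes)

for block, holders in block_map.items():
    print(block, "->", holders)
```

After the call, every block is back to three replicas, all on live nodes, and the updated dictionary reflects the new locations.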

Hope this helped you understand.
