What is Fault Tolerance? How DataNode Failures are Handled in HDFS
Do you know why DataNodes in an HDFS cluster fail? Do you know how these failures are managed?
By Karthik Kondpak
Let's understand in depth :-
=====================
✍️DataNodes are built from very cheap hardware, known as "Commodity Hardware".
✍️Because this commodity hardware is cheap, it is not very reliable: DataNodes fail, go down, or run slowly quite frequently.
🔥Why are they made of cheap hardware?
================================
✍️Cheap hardware keeps the overall cost of the cluster low.
✍️DataNodes hold the actual data, in the form of blocks.
✍️Let's consider a 4-node cluster with Block1 stored on DataNode1 and so on, as shown below.
DataNode1 ------> Block1
DataNode2 ------> Block2
DataNode3 ------> Block3
DataNode4 ------> Block4
✍️Suppose DataNode1 goes down or fails. With only one copy of each block, the data in Block1 is lost as well.
✍️This is a critical issue, because we lose not just a machine but the data itself.
How DataNode Failures are Managed in an HDFS cluster:-
===============================
✍️The Replication Factor comes into the picture to solve this issue.
🔥What is Replication Factor?
=======================
✍️Every data block is kept as three copies, each stored on a different machine. This is the concept of the Replication Factor.
✍️The idea is simple: maintain three copies of each block on different machines, so that if one DataNode goes down or fails we still have a backup.
✍️The default Replication Factor in Hadoop is "3".
✍️A Hadoop administrator can change the default Replication Factor (the dfs.replication property in hdfs-site.xml).
✍️So when a DataNode goes down or fails, the remaining copies of its blocks on other machines keep the data available.
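To make the idea concrete, here is a minimal sketch in Python of the same 4-node cluster with a Replication Factor of 3. The function names (place_replicas, surviving_replicas) and the simple round-robin placement are assumptions made for illustration only. Real HDFS uses its own placement policy, which this sketch does not try to reproduce.

```python
REPLICATION_FACTOR = 3  # HDFS default, as stated above


def place_replicas(blocks, datanodes, replication=REPLICATION_FACTOR):
    """Assign each block to `replication` distinct DataNodes.

    Round-robin placement is an illustrative assumption, not the
    placement policy HDFS actually uses.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [
            datanodes[(i + k) % len(datanodes)] for k in range(replication)
        ]
    return placement


def surviving_replicas(placement, failed_node):
    """Return, for each block, the DataNodes that still hold a copy."""
    return {
        block: [node for node in nodes if node != failed_node]
        for block, nodes in placement.items()
    }


datanodes = ["DataNode1", "DataNode2", "DataNode3", "DataNode4"]
blocks = ["Block1", "Block2", "Block3", "Block4"]

placement = place_replicas(blocks, datanodes)
after_failure = surviving_replicas(placement, "DataNode1")

for block, nodes in after_failure.items():
    print(block, "still available on", nodes)
```

Even after DataNode1 fails, every block in this sketch still has at least two live copies, so no data is lost.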
🚀🔥When the number of copies drops in an HDFS cluster, how is it managed?
==============================
✍️HDFS handles this with a strategy called Fault Tolerance.
What is Fault Tolerance in HDFS?
===========================
✍️Fault Tolerance says that when the number of live copies of a block falls below the Replication Factor (fewer than three), the NameNode arranges for one more copy of every block that was present on the failed DataNode. The new copies are made from the surviving copies on other DataNodes.
✍️This is done to bring the number of copies back to the Replication Factor of 3.
✍️The NameNode records the location of each new copy in the metadata it keeps in its memory.
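The re-replication step described above can be sketched roughly like this in Python. The re_replicate function and the block_map dictionary are hypothetical names standing in for the NameNode's in-memory metadata, and picking the first available DataNode as the target is an assumption. This is not how the NameNode is actually implemented.

```python
def re_replicate(block_map, live_nodes, failed_node, target_replication=3):
    """Return an updated block map after `failed_node` is lost.

    For every block that fell below `target_replication`, pick live
    DataNodes that do not already hold the block and record them as
    new replica locations (hypothetical selection: first available).
    """
    live = [node for node in live_nodes if node != failed_node]
    new_map = {}
    for block, holders in block_map.items():
        remaining = [node for node in holders if node != failed_node]
        if not remaining:
            # No surviving copy to read from: the block cannot be restored.
            raise RuntimeError(f"{block} has no surviving replica")
        candidates = [node for node in live if node not in remaining]
        while len(remaining) < target_replication and candidates:
            remaining.append(candidates.pop(0))
        new_map[block] = remaining
    return new_map


# Hypothetical metadata: which DataNodes hold each block before the failure.
block_map = {
    "Block1": ["DataNode1", "DataNode2", "DataNode3"],
    "Block2": ["DataNode2", "DataNode3", "DataNode4"],
    "Block3": ["DataNode3", "DataNode4", "DataNode1"],
    "Block4": ["DataNode4", "DataNode1", "DataNode2"],
}
all_nodes = ["DataNode1", "DataNode2", "DataNode3", "DataNode4"]

restored = re_replicate(block_map, all_nodes, failed_node="DataNode1")
for block, nodes in restored.items():
    print(block, "->", nodes)
```

In this sketch every block is back to three copies on the three remaining DataNodes. If fewer live DataNodes were available than the Replication Factor, the loop would simply stop early and the block would stay under-replicated.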
Hope you understood .....
Like share and comment