Tuesday 18 August 2015

Why is a replication factor of three standard for HDFS?



DataNodes are the servers that actually store the data in HDFS (Hadoop Distributed File System), so they are critical. If a DataNode goes down and it holds the only copy of a block, that data is lost. The solution is replication: HDFS stores each block on several DataNodes, so the data can still be read from another copy. But why three copies? With the default rack-aware placement, the first replica is written to the DataNode where the writer runs (or to a random DataNode if the writer is not running on one), the second to a DataNode on a different rack, and the third to a different DataNode on that same remote rack. When one DataNode goes down, two copies are still available: one keeps serving reads, the other remains as a backup, and the NameNode uses a surviving copy to create a new replica so the block is back to three. Three is only the default; the replication factor can be raised or lowered as needed, cluster-wide through the dfs.replication property or per file.
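To make the reasoning concrete, here is a minimal sketch in Python that imitates HDFS-style rack-aware placement of three replicas and checks which failures a block survives. It is an illustration only, not Hadoop code: the function names `place_replicas` and `survives`, the rack and node names, and the topology dictionary are all made up for this example, and it assumes at least one remote rack with two or more DataNodes.

```python
import random


def place_replicas(topology, writer_node, rng=None):
    """Choose three DataNodes for one block (hypothetical helper).

    topology maps a rack name to the list of DataNode names on that rack.
    Replica 1 goes to the writer's node, replica 2 to a node on a different
    rack, replica 3 to another node on that same remote rack.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer_node]

    # Assumption: some remote rack has at least two nodes to hold replicas 2 and 3.
    remote_racks = [r for r, nodes in topology.items()
                    if r != local_rack and len(nodes) >= 2]
    remote_rack = rng.choice(remote_racks)

    # Pick two distinct nodes on the remote rack.
    second, third = rng.sample(topology[remote_rack], 2)
    return [writer_node, second, third]


def survives(replicas, failed_nodes):
    """A block is still readable if at least one replica is on a live node."""
    return any(node not in failed_nodes for node in replicas)


if __name__ == "__main__":
    # Made-up two-rack cluster, purely for illustration.
    topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
    replicas = place_replicas(topology, "dn1")
    print("replicas:", replicas)
    print("one node down:", survives(replicas, {"dn1"}))
    print("whole rack1 down:", survives(replicas, set(topology["rack1"])))
    print("whole rack2 down:", survives(replicas, set(topology["rack2"])))
```

With this layout the block stays readable after the loss of any single DataNode and even after the loss of either rack, while only two racks are involved in the write. That balance between safety and write cost is the usual argument for three as the default.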


