The DataNode stores and retrieves the blocks that make up files in HDFS, and it manages the blocks held on its node. It reports the blocks it stores to the NameNode, responds to the NameNode's requests for filesystem operations (such as creating, deleting, and replicating blocks), and sends periodic heartbeats so that the NameNode knows it is still alive.
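The heartbeat and block-report idea can be sketched with a small toy model. This is not Hadoop code: the class `ToyNameNode`, its method names, and the 30-second timeout are all assumptions made up for illustration, and the real NameNode uses its own configurable intervals.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Toy stand-in for the NameNode's view of its DataNodes (hypothetical)."""
    timeout_s: float = 30.0  # assumed timeout, chosen only for this sketch
    last_heartbeat: dict = field(default_factory=dict)  # node id -> last heartbeat time
    block_map: dict = field(default_factory=dict)       # node id -> set of block ids

    def heartbeat(self, node_id: str, now: float) -> None:
        # Record that the node reported in at time `now`.
        self.last_heartbeat[node_id] = now

    def block_report(self, node_id: str, block_ids) -> None:
        # Replace what we know about the blocks stored on this node.
        self.block_map[node_id] = set(block_ids)

    def live_nodes(self, now: float) -> list:
        # A node counts as alive if its last heartbeat is recent enough.
        return sorted(
            n for n, t in self.last_heartbeat.items() if now - t <= self.timeout_s
        )


nn = ToyNameNode()
nn.heartbeat("dn1", now=0.0)
nn.heartbeat("dn2", now=0.0)
nn.block_report("dn1", ["blk_1", "blk_2"])
nn.heartbeat("dn1", now=25.0)  # dn1 keeps reporting; dn2 goes silent
live_at_40 = nn.live_nodes(now=40.0)
print(live_at_40)  # ['dn1']
```

Passing the clock value in as `now` keeps the sketch deterministic; a real system would read the time itself.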
DataNodes replicate blocks to other DataNodes, for example when recovering from a failure or when rebalancing load. By default, HDFS keeps three replicas of each block on different DataNodes (the replication factor is configurable), so that if one DataNode goes down, the data can be recovered from the remaining replicas.
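As a rough illustration of recovery after a failure, the sketch below finds blocks that have fewer live replicas than desired and picks target nodes for new copies. The function `plan_re_replication` and the node and block names are hypothetical, and the target choice is deliberately naive (alphabetical order); HDFS's actual placement policy is more involved, for instance it takes racks into account.

```python
def plan_re_replication(block_locations, live_nodes, replication=3):
    """Return {block_id: [target nodes]} for under-replicated blocks.

    Hypothetical helper for illustration only, not an HDFS API.
    """
    plan = {}
    for block_id, holders in sorted(block_locations.items()):
        live_holders = [n for n in holders if n in live_nodes]
        missing = replication - len(live_holders)
        if missing <= 0 or not live_holders:
            # Fully replicated, or no live replica left to copy from.
            continue
        # Naive choice: first live nodes that do not already hold the block.
        candidates = [n for n in sorted(live_nodes) if n not in live_holders]
        plan[block_id] = candidates[:missing]
    return plan


locations = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn2", "dn3", "dn4"],
}
live = {"dn1", "dn3", "dn4", "dn5"}  # dn2 has failed
plan = plan_re_replication(locations, live)
print(plan)  # {'blk_1': ['dn4'], 'blk_2': ['dn1']}
```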
It also supports data locality through direct communication with the TaskTracker, so that tasks can run on the node that already holds the data they read.
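Data locality can be pictured as a simple preference rule: run a task on a node that already stores the block if one has a free slot, and fall back to a remote node otherwise. The function `pick_node_for_task` below is a hypothetical sketch of that rule, not how Hadoop's scheduler is actually written.

```python
def pick_node_for_task(block_holders, nodes_with_free_slots):
    """Return (node, is_local) for a task that reads one block.

    Hypothetical helper: prefers a node that already stores the block.
    """
    local = [n for n in block_holders if n in nodes_with_free_slots]
    if local:
        return local[0], True  # data-local: no network transfer of the block
    if nodes_with_free_slots:
        # No holder is free, so the block must be read over the network.
        return sorted(nodes_with_free_slots)[0], False
    return None, False  # nothing can run right now


choice_local = pick_node_for_task(["dn1", "dn3"], {"dn3", "dn5"})
choice_remote = pick_node_for_task(["dn1", "dn3"], {"dn5"})
print(choice_local)   # ('dn3', True)
print(choice_remote)  # ('dn5', False)
```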