Wednesday 22 July 2015

What is HDFS (Hadoop Distributed File System)?

The Hadoop Distributed File System (HDFS) is a sub-project of the Apache Hadoop project.
HDFS follows a distributed file system design and runs on commodity hardware. Unlike many other distributed systems, HDFS is highly fault tolerant even though it is built from low-cost hardware.
HDFS holds very large amounts of data and provides easy access to it. To store such huge data sets, files are split up and stored across multiple machines. The pieces are stored redundantly so the system can recover from data loss when a machine fails. HDFS also makes the data available to applications for parallel processing.
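As a rough illustration of how an application hands data to HDFS, here is a minimal Java sketch using the standard Hadoop FileSystem API to write a small file. The NameNode address, the file path, and the replication setting of 3 are assumptions for the example, not values from this post.

import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; replace with your cluster's fs.defaultFS value.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");
        // Keep three copies of every block (the common default replication factor).
        conf.set("dfs.replication", "3");

        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
             FSDataOutputStream out = fs.create(new Path("/user/demo/hello.txt"))) {
            // The client writes once; HDFS replicates the block across DataNodes.
            out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}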
According to the Apache Software Foundation, the primary objective of HDFS is to store data reliably even in the presence of failures, including NameNode failures, DataNode failures, and network partitions.
The NameNode is a single point of failure for the HDFS cluster, while the DataNodes store the actual file data. This Apache Software Foundation project is designed to provide a fault-tolerant file system that runs on commodity hardware.

HDFS uses a master/slave architecture in which one device (the master) controls one or more other devices (the slaves). An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files.
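To see this master/slave split in action, the sketch below (again with an assumed NameNode address and file path) asks the NameNode where the blocks of a file live; the hosts it prints are the DataNodes that actually hold the data.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBlockLocations {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address and file path, for illustration only.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf)) {
            Path path = new Path("/user/demo/hello.txt");
            FileStatus status = fs.getFileStatus(path);
            // The NameNode (master) answers this metadata query; the data itself
            // is served by the DataNodes (slaves) listed for each block.
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("block at offset " + block.getOffset()
                        + ", length " + block.getLength()
                        + ", on DataNodes: " + String.join(", ", block.getHosts()));
            }
        }
    }
}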
