Paper Title

Improved Distribution of Workload in Hadoop Yarn

Authors

  • Rajneesh Kumar

Keywords

Cluster, HDFS, MapReduce, Node Replica, High Availability, Load Balancing.

Abstract

Hadoop YARN is a software framework that supports data-intensive distributed applications. Hadoop creates clusters of machines and coordinates the work among them. It includes two major components: HDFS (Hadoop Distributed File System) and MapReduce. HDFS is designed to store large amounts of data reliably and to provide high availability of data to user applications running at the client. It splits files into multiple data blocks and stores each block redundantly across a pool of servers to enable reliable, extremely rapid computation. MapReduce is a software framework for analyzing and transforming very large data sets into the desired output. This paper focuses on how replicas are managed in HDFS to provide high availability of data under extreme computational requirements. It then examines the possible failures that can affect a Hadoop cluster and the failover mechanisms that can be deployed to protect it.
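
As a minimal illustration of the replica management the abstract describes, the Java sketch below uses Hadoop's FileSystem API to control the HDFS replication factor. The file path /data/input.txt and the replication values are hypothetical, chosen only for the example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Replication factor applied to newly written files
        // (HDFS ships with a default of 3).
        conf.setInt("dfs.replication", 3);

        FileSystem fs = FileSystem.get(conf);
        // Hypothetical file path; raising its replication factor asks the
        // NameNode to schedule additional block replicas across DataNodes,
        // improving availability at the cost of extra storage.
        fs.setReplication(new Path("/data/input.txt"), (short) 5);
        fs.close();
    }
}
```

A higher replication factor trades storage for availability: each block can survive the loss of more DataNodes before it becomes unreadable.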

Article Type

Published

How To Cite

Rajneesh Kumar. "Improved Distribution of Workload in Hadoop Yarn".INTERNATIONAL JOURNAL OF ENGINEERING DEVELOPMENT AND RESEARCH ISSN:2321-9939, Vol.3, Issue 1, pp.202-207, URL :https://rjwave.org/ijedr/papers/IJEDR1501039.pdf

Issue

Volume 3 Issue 1 

Pages: 202-207
