Big Data/YARN

YARN (Yet Another Resource Negotiator) is a cluster management system. It has been part of Apache Hadoop since v2.0.

YARN allows arbitrary applications to be executed on a Hadoop cluster. To run under YARN, an application must consist of one application master and an arbitrary number of containers. The containers are responsible for executing the application, whereas the application master requests containers and monitors their progress and status.
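The division of labour between application master and containers can be sketched as a toy model. This is not the YARN API: the names `Container`, `ApplicationMaster`, `request_containers`, `monitor`, and `run_to_completion` are hypothetical and chosen only for illustration, and the "work" is simulated with a simple counter.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"


@dataclass
class Container:
    """Toy stand-in for a container: executes one part of the application."""
    container_id: int
    work_units: int          # total steps this container has to perform
    done: int = 0
    state: State = State.RUNNING

    def step(self) -> None:
        # Perform one step of work; finish once all units are done.
        if self.state is State.RUNNING:
            self.done += 1
            if self.done >= self.work_units:
                self.state = State.SUCCEEDED

    @property
    def progress(self) -> float:
        return self.done / self.work_units


@dataclass
class ApplicationMaster:
    """Toy application master: requests containers and monitors them."""
    containers: list = field(default_factory=list)

    def request_containers(self, workloads) -> None:
        # Assumption: every request is granted immediately. In a real
        # cluster the request would go to the ResourceManager.
        for i, units in enumerate(workloads):
            self.containers.append(Container(container_id=i, work_units=units))

    def monitor(self):
        # Report (id, progress, state) for every container.
        return [(c.container_id, c.progress, c.state) for c in self.containers]

    def run_to_completion(self) -> int:
        # Let all containers work in rounds until none is still running.
        rounds = 0
        while any(c.state is State.RUNNING for c in self.containers):
            for c in self.containers:
                c.step()
            rounds += 1
        return rounds


am = ApplicationMaster()
am.request_containers([2, 4, 3])   # three containers with different workloads
rounds = am.run_to_completion()
print(rounds, am.monitor())
```

The application master itself performs no work here; it only hands out workloads and polls progress, which mirrors the roles described above.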

To execute these applications, YARN relies on two types of components:

  1. The ResourceManager exists once per cluster. Its main tasks are granting requested resources and balancing the load across the cluster. It also starts the application master initially and restarts it if it fails.
  2. One NodeManager runs on each computing node. It starts and monitors the containers assigned to it and tracks the usage of its node's resources, i.e., CPU usage and memory consumption.
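The interplay of the two component types can be illustrated with a small simulation. It is a simplified sketch, not YARN's scheduler: the classes, the `allocate` and `launch` methods, and the "most free memory first" placement policy are assumptions made for this example, and only memory is tracked.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy NodeManager: tracks memory use and containers on one node."""
    name: str
    total_mb: int
    used_mb: int = 0
    containers: list = field(default_factory=list)

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_mb

    def launch(self, container_id: str, mb: int) -> None:
        # Start a container and account for the memory it consumes.
        self.containers.append(container_id)
        self.used_mb += mb


@dataclass
class ResourceManager:
    """Toy ResourceManager: one per cluster, grants resource requests."""
    nodes: list

    def allocate(self, container_id: str, mb: int):
        # Hypothetical balancing policy: choose the node with the most
        # free memory among those that can fit the request.
        candidates = [n for n in self.nodes if n.free_mb >= mb]
        if not candidates:
            return None  # the request cannot be granted right now
        target = max(candidates, key=lambda n: n.free_mb)
        target.launch(container_id, mb)
        return target.name


rm = ResourceManager([NodeManager("node-a", 4096), NodeManager("node-b", 4096)])
placements = [rm.allocate(f"c{i}", 1024) for i in range(4)]
rejected = rm.allocate("c-big", 4096)   # larger than any node's free memory
print(placements, rejected)
```

Because the placement decision is made in one central place, the load spreads evenly over both nodes, while each node only keeps track of its own containers and memory.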

References

  • YARN - documentation
  • V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, B. Saha, C. Curino, O. O'Malley, S. Radia, B. Reed, and E. Baldeschwieler. "Apache Hadoop YARN: Yet Another Resource Negotiator." Proceedings of the 4th Annual Symposium on Cloud Computing (SOCC '13), 2013, pp. 5:1-5:16. ACM, New York, NY, USA.

The following blog entries were written by Arun C. Murthy, who was the lead developer of YARN.