YARN makes it possible to execute arbitrary applications on a Hadoop cluster. To this end, an application has to consist of one application master and an arbitrary number of containers. The containers are responsible for executing the application, whereas the application master requests containers and monitors their progress and status.
To execute these applications, YARN relies on two types of components:
- There is exactly one ResourceManager per cluster. Its main tasks are granting the requested resources and balancing the load of the cluster. Furthermore, it starts the application master initially and restarts it if it fails.
- One NodeManager runs on each computing node. It starts and monitors the containers assigned to it and tracks the resource usage on its node, i.e., CPU usage and memory consumption.
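
The division of roles described above can be illustrated with a small toy model. This is only a sketch and does not use the real YARN API: the classes `ResourceManager`, `NodeManager`, and `ApplicationMaster`, the method names, and the "most free memory first" placement rule are all assumptions made for illustration. Real YARN schedulers are considerably more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in: one per computing node, tracks memory used by its containers."""
    name: str
    memory_mb: int
    used_mb: int = 0
    containers: list = field(default_factory=list)

    def free_mb(self) -> int:
        return self.memory_mb - self.used_mb

    def launch(self, container_id: str, memory_mb: int) -> None:
        # Start a container and account for the memory it occupies.
        self.used_mb += memory_mb
        self.containers.append(container_id)


class ResourceManager:
    """Toy stand-in: one per cluster, grants requests and spreads the load."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next_id = 0

    def allocate(self, memory_mb: int):
        """Grant a container on the node with the most free memory, or return None."""
        candidates = [n for n in self.nodes if n.free_mb() >= memory_mb]
        if not candidates:
            return None
        node = max(candidates, key=NodeManager.free_mb)  # hypothetical placement rule
        container_id = f"container_{self._next_id}"
        self._next_id += 1
        node.launch(container_id, memory_mb)
        return container_id, node.name


class ApplicationMaster:
    """Toy stand-in: requests containers for one application and records the grants."""

    def __init__(self, rm: ResourceManager):
        self.rm = rm
        self.granted = []

    def request_containers(self, count: int, memory_mb: int):
        for _ in range(count):
            grant = self.rm.allocate(memory_mb)
            if grant is None:  # cluster is full, stop asking
                break
            self.granted.append(grant)
        return self.granted


def demo():
    nodes = [NodeManager("node-a", 4096), NodeManager("node-b", 2048)]
    am = ApplicationMaster(ResourceManager(nodes))
    return am.request_containers(count=4, memory_mb=1024), nodes
```

Calling `demo()` returns the list of granted `(container_id, node_name)` pairs together with the node objects, so the memory accounted for on each node can be inspected afterwards. Monitoring, failure handling, and restarting the application master are deliberately left out of this sketch.
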
- YARN - documentation
- V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, B. Saha, C. Curino, O. O'Malley, S. Radia, B. Reed, and E. Baldeschwieler. "Apache Hadoop YARN: Yet Another Resource Negotiator." Proceedings of the 4th Annual Symposium on Cloud Computing (SOCC '13), 2013, pp. 5:1-5:16. ACM, New York, NY, USA.
The following blog entries were written by Arun C. Murthy, who was the lead developer of YARN.