This is the documentation for Cloudera Enterprise 5.8.x. Documentation for other versions is available at Cloudera Documentation.

MapReduce 2.0 (YARN)

MapReduce has undergone a complete overhaul and CDH 5 now includes MapReduce 2.0 (MRv2). The fundamental idea of MRv2's YARN architecture is to split up the two primary responsibilities of the JobTracker — resource management and job scheduling/monitoring — into separate daemons: a global ResourceManager (RM) and per-application ApplicationMasters (AM). With MRv2, the ResourceManager (RM) and per-host NodeManagers (NM), form the data-computation framework. The ResourceManager service effectively replaces the functions of the JobTracker, and NodeManagers run on worker hosts instead of TaskTracker daemons. The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to run and monitor the tasks. For details of the new architecture, see Apache Hadoop NextGen MapReduce (YARN).

See also Migrating from MapReduce (MRv1) to MapReduce (MRv2).

  Important:

For installations in pseudo-distributed mode, there are separate conf-pseudo packages for an installation that includes MRv1 (hadoop-0.20-conf-pseudo) or an installation that includes YARN (hadoop-conf-pseudo). Only one conf-pseudo package can be installed at a time: if you want to change from one to the other, you must uninstall the one currently installed.

Page generated July 8, 2016.