Problem
When Datameer is configured with an auto cluster configuration, jobs fail on the Spark and Tez execution frameworks, while jobs executed on MapReduce complete successfully.
When Datameer's Hadoop Cluster is configured manually, all jobs fail regardless of execution framework.
The following error is seen in the job log:
INFO [2016-12-06 17:39:20.017] [MrPlanRunnerV2] (Logging.scala:58) - Application report for application_APPLICATIONID (state: FAILED)
INFO [2016-12-06 17:39:20.018] [MrPlanRunnerV2] (Logging.scala:58) - client token: N/A
diagnostics: Application application_APPLICATIONID failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_CONTAINERID exited with exitCode: 1
For more detailed output, check the application tracking page: http://HOSTNAME:PORT/cluster/app/application_APPLICATIONID Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_CONTAINERID
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
    at org.apache.hadoop.util.Shell.run(Shell.java:844)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:225)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Cause
Ambari creates symlinks within the filesystem to resolve the location of Hadoop. These symlinks appear in the YARN application classpath and were carried over into the Datameer configuration. Remote clients such as Datameer cannot resolve these symlinks.
The path to the hadoop/conf folder appears as the following within the YARN application classpath:
/usr/hdp/current/hadoop/conf
Here 'current' is a symlink rather than the real versioned path.
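A quick way to see what such a symlink resolves to is readlink -f. The sketch below builds a throwaway directory tree that mimics the Ambari layout; the /tmp paths and the 2.5.0.0-1245 version are illustrative stand-ins, not values from any particular cluster:

```shell
# Build a throwaway tree mimicking Ambari's layout: a versioned
# directory plus a 'current' symlink pointing at it.
tmp=$(mktemp -d)
mkdir -p "$tmp/2.5.0.0-1245/hadoop/conf"
ln -s "$tmp/2.5.0.0-1245" "$tmp/current"

# readlink -f resolves the symlinked path to the real versioned path.
readlink -f "$tmp/current/hadoop/conf"
```

On an actual HDP node, running readlink -f /usr/hdp/current/hadoop/conf in the same way prints the versioned path that can be used in the Datameer classpath.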
Solution
Correct the YARN application classpath on the Hadoop Cluster page within the Admin tab.
Use either the fully qualified real paths or the hdp.version variable.
Example 1: Using the real path
/usr/hdp/2.5.0.0-1245/hadoop/conf:/usr/hdp/2.5.0.0-1245/hadoop/conf:/usr/hdp/2.5.0.0-1245/hadoop/conf:/usr/hdp/2.5.0.0-1245/hadoop/lib/*:/usr/hdp/2.5.0.0-1245/hadoop/*:/usr/hdp/2.5.0.0-1245/hadoop-hdfs/:/usr/hdp/2.5.0.0-1245/hadoop-hdfs/lib/*:/usr/hdp/2.5.0.0-1245/hadoop-hdfs/*:/usr/hdp/2.5.0.0-1245/hadoop-yarn/lib/*:/usr/hdp/2.5.0.0-1245/hadoop-yarn/*:/usr/hdp/2.5.0.0-1245/hadoop-mapreduce/lib/*:/usr/hdp/2.5.0.0-1245/hadoop-mapreduce/*:/usr/hdp/2.5.0.0-1245/hadoop-yarn-client/*:/usr/hdp/2.5.0.0-1245/hadoop-yarn-client/lib/*:/usr/share/java/slf4j-simple.jar
Example 2: Using the hdp.version variable
/usr/hdp/${hdp.version}/hadoop/conf:/usr/hdp/${hdp.version}/hadoop/conf:/usr/hdp/${hdp.version}/hadoop/conf:/usr/hdp/${hdp.version}/hadoop/lib/*:/usr/hdp/${hdp.version}/hadoop/*:/usr/hdp/${hdp.version}/hadoop-hdfs/:/usr/hdp/${hdp.version}/hadoop-hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop-hdfs/*:/usr/hdp/${hdp.version}/hadoop-yarn/lib/*:/usr/hdp/${hdp.version}/hadoop-yarn/*:/usr/hdp/${hdp.version}/hadoop-mapreduce/lib/*:/usr/hdp/${hdp.version}/hadoop-mapreduce/*:/usr/hdp/${hdp.version}/hadoop-yarn-client/*:/usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*:/usr/share/java/slf4j-simple.jar
Note: Make sure the hdp.version variable is uncommented and correctly set in the Hadoop Distribution Specific Properties section of the Hadoop Cluster page.
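To double-check that the variable form expands to the intended real paths, the substitution can be previewed locally with sed. This is only an illustration: the classpath template below is shortened, and the version value is the example one from this article (on a real node, take it from hdp-select):

```shell
# Substitute ${hdp.version} in a (shortened, illustrative) classpath
# template with a concrete HDP version to preview the resolved paths.
HDP_VERSION='2.5.0.0-1245'   # example value; not from any real cluster
template='/usr/hdp/${hdp.version}/hadoop/conf:/usr/hdp/${hdp.version}/hadoop/lib/*'
echo "$template" | sed "s/\${hdp\.version}/$HDP_VERSION/g"
# prints: /usr/hdp/2.5.0.0-1245/hadoop/conf:/usr/hdp/2.5.0.0-1245/hadoop/lib/*
```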