Problem
Jobs on HDP 2.0.x fail with the following exception:
Caused by: java.io.IOException: Max block location exceeded for split: AliasSplit{delkickback_dailypointstrx_from_hadoop.preOperations, GenericCombinedSplit{count=28, totalSize=2982467418, locations=[hadoop-0208-u15, hadoop-0208-u17, hadoop-0207-u15, hadoop-0208-u27, hadoop-0207-u25, hadoop-0207-u29, hadoop-0207-u07, hadoop-0207-u03, hadoop-0207-u05, hadoop-0207-u11, hadoop-0208-u21, hadoop-0208-u09, hadoop-0207-u01, hadoop-0207-u27, hadoop-0208-u13, hadoop-0207-u09, hadoop-0208-u05, hadoop-0208-u23, hadoop-0207-u17, hadoop-0207-u23, hadoop-0208-u03, hadoop-0207-u21, hadoop-0208-u11, hadoop-0208-u01, hadoop-0208-u25], locationLocalities=[22, 22, 21, 21, 18, 18, 17, 13, 13, 13, 11, 10, 10, 9, 9, 9, 9, 8, 6, 6, 6, 6, 4, 4, 4]}} splitsize: 25 maxsize: 10
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:540)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at datameer.dap.common.job.mr.HadoopMrJobClient.submitJob(HadoopMrJobClient.java:214)
    at datameer.dap.common.job.mr.HadoopMrJobClient.runJobImpl(HadoopMrJobClient.java:55)
    at datameer.dap.common.job.mr.MrJobClient.runJob(MrJobClient.java:32)
    at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.runJob(MrJobExecutor.java:40)
    at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:33)
    at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:18)
    at datameer.dap.common.job.mr.plan.execution.NodeExecutor.execute(NodeExecutor.java:28)
    ... 7 more
Cause
Although the exact cause is unknown, this was observed with only one customer, who was running HDP 2.0.6; after they upgraded to HDP 2.1.2, the issue no longer occurred.
While troubleshooting with a Hortonworks Support engineer, they noted that the default value of mapreduce.job.max.split.locations was too low for this cluster. This matches the error message above: the combined split reported 25 block locations ("splitsize: 25") while the configured limit was 10 ("maxsize: 10"), so job submission failed.
Solution
As a workaround, set mapreduce.job.max.split.locations in the custom Hadoop properties to the number of data nodes plus 10%. The additional 10% leaves headroom for new nodes as the cluster grows. See the sketch below.
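As a minimal sketch, assuming a cluster of about 45 data nodes (45 + 10% ≈ 50; substitute your own node count), the custom Hadoop property would be entered as a key=value pair:

mapreduce.job.max.split.locations=50

For a standalone MapReduce job submitted through the old JobClient API (as in the stack trace above), the same property can be set on the client-side job configuration before submission. This is a hedged sketch, not Datameer's internal code; the class name and the value 50 are illustrative assumptions.

import org.apache.hadoop.mapred.JobConf;

public class MaxSplitLocationsWorkaround {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        // Assumed ~45 data nodes: 45 + 10% headroom ≈ 50 locations allowed per split.
        conf.setInt("mapreduce.job.max.split.locations", 50);
        // ... configure input/output paths, mapper/reducer classes,
        // then submit, e.g. via JobClient.runJob(conf) ...
    }
}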