Problem
A workbook fails during execution. If the job is run on the Tez framework, the following error is written to the syslog_dag of the Application Master:
2015-01-01 00:00:00,000 INFO [AsyncDispatcher event handler] impl.DAGImpl: Exception in committing output: output
java.lang.RuntimeException: Unable to merge data handle 'd653e5b2-4497-46f7-938c-446425430068' with description 'Sorted optimized sheet preview for my_worksheet' to target path 'my_worksheet/optimized_preview'
    at datameer.dap.common.graphv2.hadoop.MrJobOutputCommitter.mergeOutputRecordSources(MrJobOutputCommitter.java:226)
    at datameer.plugin.tez.output.TezJobOutputCommitter.commitOutput(TezJobOutputCommitter.java:44)
    at org.apache.tez.dag.app.dag.impl.DAGImpl$1.run(DAGImpl.java:804)
    at org.apache.tez.dag.app.dag.impl.DAGImpl$1.run(DAGImpl.java:801)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.commitOutput(DAGImpl.java:801)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.commitOrAbortOutputs(DAGImpl.java:887)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.finished(DAGImpl.java:1133)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.checkDAGForCompletion(DAGImpl.java:1056)
    at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1719)
    at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1676)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:952)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:129)
    at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1695)
    at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1686)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Wrong FS: file:/data/hadoop/yarn/local/merge-sort-0-b934c75b-bbbf-40c3-9567-8b6361d7fa1127769014602445750.tmp, expected: hdfs://datanode_001.hadoop.datameer.com:8020
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
    at datameer.dap.sdk.cluster.filesystem.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:181)
    at datameer.dap.common.data.DatameerFile$1.get(DatameerFile.java:241)
    at datameer.dap.common.data.DatameerFile$1.get(DatameerFile.java:197)
    at datameer.dap.sdk.sequence.Sequence$23.moveToNext(Sequence.java:1140)
    at datameer.dap.sdk.sequence.Sequence$15.computeNext(Sequence.java:669)
    at datameer.dap.sdk.sequence.Sequence$Simple.moveToNext(Sequence.java:157)
    at datameer.dap.common.graphv2.hadoop.MrJobOutputCommitter.mergeSingleArtifactOutput(MrJobOutputCommitter.java:298)
    at datameer.dap.common.graphv2.hadoop.MrJobOutputCommitter.mergeOutputRecordSource(MrJobOutputCommitter.java:264)
    at datameer.dap.common.graphv2.hadoop.MrJobOutputCommitter.mergeOutputRecordSources(MrJobOutputCommitter.java:223)
    ... 23 more
If the job is run on the MapReduce framework, a similar stack trace appears directly in the job log.
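To retrieve the Application Master log containing syslog_dag, the standard YARN CLI can be used once the job has finished and log aggregation is enabled. The application ID and output file name below are placeholders for illustration; substitute the ID of the failed job:

yarn logs -applicationId application_1420070400000_0001 > am_log.txt
grep -B 2 -A 5 "Unable to merge data handle" am_log.txt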
Cause
This is a known bug in Datameer. During output commit, the merge step references a temporary merge-sort file on the local file system (file:/.../merge-sort-*.tmp) but resolves it against the cluster's HDFS file system, which raises the "Wrong FS" IllegalArgumentException shown in the stack trace above.
Solution
This issue may be worked around by adding the following two custom properties to the affected workbook:
das.merge-sort.max-file-size=0
das.execution-framework=MapReduce
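As a rough interpretation of these settings (an assumption, not taken from Datameer documentation): das.merge-sort.max-file-size=0 is assumed to disable the merge-sort optimization that produces the local merge-sort-*.tmp file shown in the stack trace, and das.execution-framework=MapReduce forces the job to run on MapReduce instead of Tez. The properties can be removed again after upgrading to a release that contains the fix.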
This issue is resolved in Datameer 5.6.7, 5.7.2, and later releases.
For more information, please contact Datameer Technical Support.