Problem
MapReduce jobs are not executed and fail with the following error message:
ERROR [2016-05-24 12:04:29.285] [ConcurrentJobExecutor-0] (ClusterSession.java:198) - Failed to run cluster job 'Workbook job (110): TEST_JOB#Sheet1(Group by operation)' [10 sec]
java.lang.RuntimeException: Failed to run job : Failed to renew token: Kind: MR_DELEGATION_TOKEN, Service: 10.1.1.12:10020, Ident: (owner=datameer@DM.COM, renewer=yarn, realUser=, issueDate=1464105869043, maxDate=1464710669043, sequenceNumber=2, masterKeyId=2)
at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:49)
at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:31)
at datameer.dap.common.graphv2.hadoop.MrJob.runImpl(MrJob.java:228)
at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:128)
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:184)
at datameer.dap.common.graphv2.ConcurrentClusterSession$1.run(ConcurrentClusterSession.java:48)
at datameer.dap.common.security.DatameerSecurityService$1.call(DatameerSecurityService.java:151)
at datameer.dap.common.security.DatameerSecurityService$1.call(DatameerSecurityService.java:145)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to run job : Failed to renew token: Kind: MR_DELEGATION_TOKEN, Service: 10.1.1.12:10020, Ident: (owner=datameer@DM.COM, renewer=yarn, realUser=, issueDate=1464105869043, maxDate=1464710669043, sequenceNumber=2, masterKeyId=2)
Cause
This failure was observed with Datameer running against a multi-homed cluster. By default, the service field of a delegation token is populated from the server's IP address. Setting hadoop.security.token.service.use_ip=false changes this behavior to use the host name instead of the IP address.
However, this configuration property is not read from job.xml (see MAPREDUCE-6565 for background information). According to this ticket, a Hadoop class creates a Configuration object using new Configuration() in a static block and expects *-site.xml to be on the classpath of the client (here Datameer).
It looks something like this:
org.apache.hadoop.security.SecurityUtil.java
static {
    Configuration conf = new Configuration();
    boolean useIp = conf.getBoolean(
        CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP,
        CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT);
    setTokenServiceUseIp(useIp);
}
Because of this static block, a value of hadoop.security.token.service.use_ip passed programmatically by the client is not respected, and the default value (true) is used instead.
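The effect can be reproduced without Hadoop. The following is a minimal sketch, not the actual Hadoop code: FakeConfiguration and FakeSecurityUtil are hypothetical stand-ins for Configuration and SecurityUtil, and the empty SITE_FILE map simulates a client classpath that contains no *-site.xml. It shows that a value set on a job-level configuration never reaches a flag that was fixed in a static block.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticInitDemo {

    /** Hypothetical stand-in for Hadoop's Configuration (assumption, not the real class). */
    static class FakeConfiguration {
        // Simulates values loaded from *-site.xml on the classpath.
        // Empty here, i.e. no site file is visible to the client.
        static final Map<String, String> SITE_FILE = new HashMap<>();

        // Values set programmatically on this one instance only.
        private final Map<String, String> overrides = new HashMap<>();

        void setBoolean(String key, boolean value) {
            overrides.put(key, Boolean.toString(value));
        }

        boolean getBoolean(String key, boolean defaultValue) {
            String v = overrides.containsKey(key) ? overrides.get(key) : SITE_FILE.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    /** Hypothetical stand-in for SecurityUtil: the flag is fixed once, at class-load time. */
    static class FakeSecurityUtil {
        static final boolean USE_IP;

        static {
            // A fresh configuration sees only the site-file values, never the job's settings.
            FakeConfiguration conf = new FakeConfiguration();
            USE_IP = conf.getBoolean("hadoop.security.token.service.use_ip", true);
        }
    }

    /** Sets the property to false on a job-level configuration and returns the effective flag. */
    static boolean effectiveUseIp() {
        FakeConfiguration jobConf = new FakeConfiguration();
        jobConf.setBoolean("hadoop.security.token.service.use_ip", false);
        return FakeSecurityUtil.USE_IP;
    }

    public static void main(String[] args) {
        // prints "job value: false, effective value: true"
        System.out.println("job value: false, effective value: " + effectiveUseIp());
    }
}
```

In this sketch, putting the property into SITE_FILE before FakeSecurityUtil is first used would change the result. That corresponds to making *-site.xml visible on the client classpath.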
Solution
- Add *-site.xml from the cluster to Datameer's classpath (possibly under etc/custom-jars).
- Set hadoop.security.token.service.use_ip=true everywhere, including the cluster, so that the cluster uses IP addresses instead of host names (not a recommended solution).
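To confirm that the first option took effect, it can help to check whether the site files are actually visible on the classpath. The following is a minimal sketch using only the JDK. The class name SiteFileCheck is hypothetical, and the four file names are an assumption about which *-site.xml files are meant. The result is only meaningful when the check runs with the same classpath as the Datameer process.

```java
import java.net.URL;

public class SiteFileCheck {

    /** Returns the classpath location of the named resource, or null if it is not visible. */
    static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = SiteFileCheck.class.getClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        // Assumed list of the usual Hadoop site files; adjust to the files copied from the cluster.
        String[] names = {"core-site.xml", "hdfs-site.xml", "yarn-site.xml", "mapred-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url));
        }
    }
}
```

A file reported as "NOT on classpath" would not be picked up by new Configuration(), so the static block would fall back to the default value.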