Problem
After a Datameer restart, the first job executed fails with the following error if it is a Spark job:
Error: UnknownHostException: mycluster
The issue doesn't occur for Tez jobs or any ongoing execution, regardless of its engine.
Cause
The root cause is currently under investigation by the Datameer engineering team. The internal ticket number is DAP-32386.
Workaround
The issue has been confirmed only on environments that have HDFS High Availability enabled, and only when the first job executed after a Datameer restart uses the Spark engine. If you encounter this issue, restart the impacted job.
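To check whether an environment matches the HDFS High Availability condition above, the sketch below reads the standard Hadoop properties dfs.nameservices and dfs.ha.namenodes.<nameservice> from the contents of an hdfs-site.xml file. It is an illustration only, not a Datameer tool. The helper names and the sample configuration are hypothetical, and it is an assumption that "mycluster" in the error message is the cluster's logical nameservice name.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return the <property> name/value pairs of a Hadoop config file as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def split_list(value):
    """Split a comma-separated Hadoop property value into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def ha_nameservices(props):
    """Map each logical nameservice to its NameNode IDs when two or more are set."""
    result = {}
    for ns in split_list(props.get("dfs.nameservices", "")):
        nodes = split_list(props.get("dfs.ha.namenodes." + ns, ""))
        if len(nodes) >= 2:
            result[ns] = nodes
    return result


# Hypothetical sample content; in practice, read your cluster's hdfs-site.xml.
SAMPLE = """<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
</configuration>"""

if __name__ == "__main__":
    print(ha_nameservices(read_hadoop_properties(SAMPLE)))
```

A non-empty result suggests that HDFS High Availability is configured, so the environment may be affected by this issue.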
Please get in touch with Datameer support if the mentioned workaround doesn't help.