Question
According to Datameer's Checklist for Hadoop Administrators, speculative execution is disabled for all Datameer jobs. Is there a way to enable speculative execution?
Solution
No, Datameer does not support speculative execution for a number of reasons:
- A well-tuned cluster does not need speculative execution. The feature is designed to get more consistent performance out of a cluster with badly performing nodes. Datameer's stance is to correct the problem that causes those nodes to perform poorly.
- Duplicating tasks puts more load on the cluster, which reduces overall cluster performance and throughput.
- Duplicate tasks can finish close enough together to cause a race condition in which Datameer receives two sets of data from the cluster. This could lead to wrong data.
- Duplicate tasks could cause duplicate logs to be stored within Datameer, increasing disk consumption.
- Duplicate tasks could produce false positives: a job may appear to have failed or completed with warnings even though the 'backup' task completed successfully and the real status is completed successfully.
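The duplicate-data concern in the list above can be pictured with a toy sketch. This is only an illustration under stated assumptions: `AttemptOutput`, `collect_naive`, and `collect_first_attempt_wins` are hypothetical names invented here, and the code does not reflect how Datameer or Hadoop actually handle task output. It assumes that both the original attempt and its speculative backup deliver their results before either is cancelled.

```python
from collections import namedtuple

# Hypothetical record of one task attempt's output; not a Datameer or Hadoop type.
AttemptOutput = namedtuple("AttemptOutput", ["task_id", "attempt", "rows"])


def collect_naive(outputs):
    """Append every delivered output, with no check for duplicate attempts."""
    rows = []
    for out in outputs:
        rows.extend(out.rows)
    return rows


def collect_first_attempt_wins(outputs):
    """Keep only the first output seen per task_id and ignore later attempts."""
    seen = set()
    rows = []
    for out in outputs:
        if out.task_id in seen:
            continue
        seen.add(out.task_id)
        rows.extend(out.rows)
    return rows


# Assumed scenario: the original attempt and the speculative backup
# for task 0 both deliver results; task 1 ran only once.
delivered = [
    AttemptOutput(task_id=0, attempt=0, rows=["a", "b"]),
    AttemptOutput(task_id=0, attempt=1, rows=["a", "b"]),
    AttemptOutput(task_id=1, attempt=0, rows=["c"]),
]

naive = collect_naive(delivered)
deduped = collect_first_attempt_wins(delivered)
```

In this sketch `naive` contains the rows of task 0 twice, while `deduped` contains them once. With speculative execution disabled, each task has a single attempt, so this kind of duplicate delivery cannot arise in the first place.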
Generally, features designed to mitigate the symptoms of other problems are not supported. Datameer's stance is to address the root cause of a problem rather than its symptoms.