What is the recommended number of tasks per node in Hadoop?
Is there a recommended or optimal configuration for the number of tasks per node (map and reduce combined) for a cluster running Datameer? Or should we only consider the recommendations of our Hadoop distribution?
Official comment
There is no real formula for this, but the question has been discussed elsewhere. Here is one such discussion: http://stackoverflow.com/questions/10031204/how-many-mappers-reducers-should-be-set-when-configuring-hadoop-cluster
Our partner also has some good information about improving MapReduce performance. You can read it here: http://blog.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/
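As a rough starting point only, the Apache Hadoop MapReduce tutorial suggests setting the number of reduces for a job to about 0.95 or 1.75 times (number of nodes × maximum containers per node). The sketch below just does that arithmetic. It is not a Datameer recommendation, the function name and parameters are made up for illustration, and the result should be tuned against your own workload and your distribution's guidance.

```python
def suggested_reduce_count(nodes, max_containers_per_node, factor=0.95):
    """Rule-of-thumb reduce count from the Hadoop MapReduce tutorial.

    Hypothetical helper, not part of Hadoop or Datameer.
    factor=0.95: all reduces can start as soon as the maps finish.
    factor=1.75: faster nodes run a second wave of reduces, which
                 tends to balance load better at some extra overhead.
    """
    if nodes < 1 or max_containers_per_node < 1:
        raise ValueError("nodes and max_containers_per_node must be >= 1")
    # Total reduce capacity of the cluster, scaled by the chosen factor.
    capacity = nodes * max_containers_per_node
    return max(1, round(capacity * factor))


if __name__ == "__main__":
    # Assumed example cluster: 10 worker nodes, 8 containers each.
    print(suggested_reduce_count(10, 8))        # single wave of reduces
    print(suggested_reduce_count(10, 8, 1.75))  # two waves of reduces
```

The per-node container limit itself depends on the memory and cores you give YARN (or the task slots on older MRv1 clusters), so that input has to come from your own cluster configuration.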