Should I install Hadoop MR1 or MR2?
I'm setting up a new cluster and would like to know the advantages of MR2 over MR1. Does the choice make a difference for Hadoop applications like Datameer?
Official comment
For a new cluster, I'd say starting with MR2 is a no-brainer unless you have legacy applications that depend on MR1. The MR2 architecture is quite different: it allows more flexibility and better cluster resource management.
Most importantly, frameworks are being developed for MR2 that allow things like faster job execution, and you would miss out on them if you started with MR1.
Datameer supports MR1 as well as MR2.
In Hadoop 2, Apache separated the management of map and reduce processing from the cluster's resource management; resource management is now handled by YARN (Yet Another Resource Negotiator). This separation makes the cluster more versatile: it can run MapReduce and also support additional processing paradigms (Tez, Storm, etc.).
MR2 has been observed to be more efficient than MR1. Hortonworks documents this well at: http://hortonworks.com/hadoop/yarn/