MapReduce framework setting DM version 6.1.8
Dear,
Our Hadoop cluster (HDP 2.4) uses MapReduce as its default execution framework (because of other applications also running on HDP).
Changing the default framework in Datameer from Tez to MapReduce works, but it produces warnings when running import jobs, workbooks, etc.
Is there an option to 'filter' these warnings?
When every job creates a warning, nobody will look at them any more, and we could potentially miss warnings that are not related to the Tez warning.
best
mattijs
-
Hello Mattijs,
The default execution engine for Datameer 6.x is Tez. MapReduce still works, but it is marked as deprecated in Datameer 6.1.x. We plan to remove it completely in a future major release, so that our customers can work with the more powerful and convenient Tez and Spark engines. To warn users that the MapReduce engine is deprecated in Datameer 6, we introduced the warnings below. There is no way to switch this notification off.
WARN [2017-02-19 16:38:53.007] [JobScheduler thread-1] (JobScheduler.java:448) - ============================================================
WARN [2017-02-19 16:38:53.007] [JobScheduler thread-1] (JobScheduler.java:449) - == Deprecation warning: This job is running on MapReduce, which got deprecated with Datameer 6.1
WARN [2017-02-19 16:38:53.007] [JobScheduler thread-1] (JobScheduler.java:450) - == and will be removed in future versions of Datameer. Please remove the property
WARN [2017-02-19 16:38:53.007] [JobScheduler thread-1] (JobScheduler.java:451) - == 'das.execution-framework=MapReduce' from your Hadoop configuration to run on a current execution framework.
WARN [2017-04-19 16:38:53.008] [JobScheduler thread-1] (JobScheduler.java:452) - ============================================================
Do you face any problems with the Tez engine in your environment, or are there any restrictions that require you to use MapReduce only?
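Since the banner cannot be switched off in Datameer itself, one possible workaround is to filter it out when reading the log. Below is a minimal Python sketch, not a Datameer feature: it assumes the banner lines always come from JobScheduler.java and that their message starts with "==", as in the excerpt above. The function name `drop_deprecation_banner` is hypothetical, and the sketch only changes what you see when reviewing a log, not what Datameer writes.

```python
import re

# Matches the banner lines quoted above: a WARN entry logged from
# JobScheduler.java whose message starts with "==".
# Assumption: no other warning in the log uses this "==" prefix.
BANNER = re.compile(
    r"^WARN \[[^\]]+\] \[[^\]]+\] \(JobScheduler\.java:\d+\) - =="
)


def drop_deprecation_banner(lines):
    """Yield every log line except the MapReduce deprecation banner."""
    for line in lines:
        if not BANNER.match(line):
            yield line
```

Applied to the lines of a job log, this keeps all other warnings (including other JobScheduler warnings that do not start with "==") and drops only the five banner lines.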
-
Hallo Konsta,
Thanks for the quick reply.
What I would expect from DM, however, is that it continues to support the execution frameworks that are in place with the HDP distributions.
IMHO, it would have been enough to show the warning only the first time a job is scheduled. As mentioned, one tends to ignore default warnings and can then miss the important functional warnings. But your answer is clear, thanks for that.
The reason we use MapReduce instead of Tez has to do with (Python) models running on our cluster; we found that the results with Tez were unreliable, so we abandoned Tez and switched back to MapReduce.
Best
Mattijs
-
Mattijs,
I understand your concern regarding MR vs Tez for your existing jobs.
Datameer is just a job compiler, so it doesn't matter which execution framework you choose (MR, Tez or Spark): it will compile the job accordingly, including all required libraries, and send it to the cluster for execution. There may be differences in job performance (execution time) between engines, which most likely depend on data volume and cluster configuration (e.g. resource distribution or security constraints), but as soon as a job completes you will get your results.
-
Hi Mattijs,
Sorry to hear the transition to 6.1 hasn't been smooth for you.
Unfortunately, in 6.1 the MapReduce framework was deprecated. This is why the warnings are present. It is not possible to disable them.
The correct solution is to use Tez as the execution framework. Tez is more performant than MapReduce, so it's a win-win for everyone at the end of the day.
Alan