How to increase concurrent Datameer jobs in cluster

Comments

1 comment

  • Konsta Danyliuk

    Hello Suhel,

    [JobScheduler worker1-thread-511] (MrPlanRunnerV2.java:81) - Allow running Datameer job with up to 1 concurrent cluster jobs.

    This message shows how many map/reduce jobs will be started concurrently on the cluster to compute a single execution submitted by Datameer (e.g. a Workbook or an ImportJob).

    This value is controlled by the property das.job.concurrent-mr-jobs.new-graph-api. You can check the value used for a particular job in its job-conf-cluster.xml file.
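
    As a rough illustration of checking the property for a job, here is a minimal Python sketch. It assumes the job's configuration file uses the standard Hadoop configuration layout (a configuration root with property/name/value entries); the SAMPLE content and the read_property helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

PROPERTY = "das.job.concurrent-mr-jobs.new-graph-api"

def read_property(xml_text, name):
    """Return the value of a named property from Hadoop-style
    configuration XML, or None if the property is not present."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Hypothetical sample content; a real job configuration file
# would contain many more properties.
SAMPLE = """<configuration>
  <property>
    <name>das.job.concurrent-mr-jobs.new-graph-api</name>
    <value>5</value>
  </property>
</configuration>"""

print(read_property(SAMPLE, PROPERTY))
```

    To inspect a real job, read the file's text from disk and pass it to read_property in place of SAMPLE.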

    Default values

    • For the Local execution framework: das.job.concurrent-mr-jobs.new-graph-api=1
    • If Datameer is connected to a cluster: das.job.concurrent-mr-jobs.new-graph-api=5

    The number of concurrent map/reduce jobs depends on file size, execution engine, calculated splits, manually set splits, and how many mappers and reducers the cluster allows to run in parallel. The calculated splits most likely matter most: roughly speaking, more splits theoretically lead to more concurrent map/reduce jobs. This is calculated by an internal algorithm when the job is compiled.

    Concerning Scenario 1

    The number of concurrent jobs that Datameer submits to a cluster is controlled by the Max Concurrent Jobs option. It can be set in the Hadoop Cluster section of the Admin tab. The default value is 25, which means that if you start 30 jobs at the same time, only the first 25 are started and sent to the cluster; the remaining 5 wait in the queued state until one of the first 25 completes.
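
    The 30-jobs example can be sketched as a small Python simulation. This is not Datameer's scheduler code: submit_jobs and complete_job are hypothetical helpers, and the first-in-first-out order in which queued jobs are promoted is an assumption made for illustration.

```python
from collections import deque

def submit_jobs(job_ids, max_concurrent=25):
    """Split submitted jobs into running and queued, honouring the limit."""
    running = []
    queued = deque()
    for job_id in job_ids:
        if len(running) < max_concurrent:
            running.append(job_id)
        else:
            queued.append(job_id)
    return running, queued

def complete_job(running, queued, job_id):
    """Finish a running job and start the next queued one (FIFO is assumed)."""
    running.remove(job_id)
    if queued:
        running.append(queued.popleft())

running, queued = submit_jobs(range(1, 31))  # 30 jobs, limit 25
print(len(running), len(queued))  # 25 5

complete_job(running, queued, 1)  # one job finishes, one queued job starts
print(len(running), len(queued))  # 25 4
```

    Raising max_concurrent in this sketch corresponds to raising Max Concurrent Jobs: more jobs are sent to the cluster at once, and fewer wait in the queue.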

     
