Question
I would like to set the number of Job Scheduler and Event Bus threads to better reflect a specific environment's needs. What parameters should I consider?
Answer
Datameer leverages an Event Bus to handle multiple types of tasks. As of Datameer 7.3, there are the following types of EventBus queues:
- Main - multi-threaded pool for general event management.
- Email notification - dedicated single-threaded queue that handles email notification events.
- JobScheduler - dedicated pool that manages the triggering of data-driven chains.
Pool size best practices
Email Notification
- Hardcoded to a single thread, which is enough to handle all the events it receives.
JobScheduler
- 4 threads by default.
- The number of threads can be changed in the JobScheduler section of the Admin Tab.
- As it takes up to 2 seconds to process a data-driven artifact triggering event, 4 threads are enough to trigger 10 jobs in about 5 seconds. It is unlikely that more than 10 artifacts with data-driven downstream dependencies succeed at the same time, so 4 threads are sufficient for most Datameer instances.
- If there are data-driven chains where a single artifact triggers 30-50 executions and Datameer is configured to execute 50+ jobs concurrently, it makes sense to increase the JobScheduler threads to 6 or 8.
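The sizing reasoning above can be checked with a quick back-of-the-envelope calculation. The following is a minimal sketch, not a Datameer tool: the function names are hypothetical, and it assumes every triggering event takes the full 2 seconds quoted above and that the threads do nothing else.

```python
import math

# Assumption: each triggering event takes the full 2 seconds (the article's upper bound).
SECONDS_PER_EVENT = 2.0


def average_drain_time(events, threads, seconds_per_event=SECONDS_PER_EVENT):
    """Throughput-based estimate: total work divided evenly across the threads."""
    return events * seconds_per_event / threads


def worst_case_drain_time(events, threads, seconds_per_event=SECONDS_PER_EVENT):
    """Conservative estimate: events are processed in whole rounds of `threads` events."""
    return math.ceil(events / threads) * seconds_per_event


if __name__ == "__main__":
    # 10 events on the default 4 threads, then a 50-event burst on 4, 6 and 8 threads.
    for events, threads in [(10, 4), (50, 4), (50, 6), (50, 8)]:
        print(events, threads,
              average_drain_time(events, threads),
              worst_case_drain_time(events, threads))
```

With 10 events and 4 threads the throughput-based estimate gives the 5 seconds mentioned above, while the round-based estimate is slightly more pessimistic. For a burst of 50 events, moving from 4 to 8 threads roughly halves either estimate, which is consistent with the suggestion to use 6 or 8 threads for such chains.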
Main
- 16 threads by default.
- The number of threads is configurable via the property event.bus.async.threads, stored in the <Datameer installation folder>/conf/default.properties file.
- One might consider increasing this pool size if the Datameer instance is heavily loaded by users' actions or REST API calls.
- For example, an HDFS permission change is a fairly expensive operation. Changing the ownership or group permission of an ImportJob that frequently runs in append mode (which means a lot of files and folders in HDFS) keeps a thread occupied for some time. The same happens when one executes a series of REST API calls that modify artifacts' ownership or permission levels. Such actions might create a temporary shortage of resources for the Main EventBus queue.
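To confirm which value is currently in effect before changing it, one could read the property back from the file. The sketch below is an illustration only: it assumes the file uses plain key=value lines, the helper name and the sample value of 24 are hypothetical, and the fallback of 16 is the default stated above.

```python
def read_thread_count(properties_text, key="event.bus.async.threads", default=16):
    """Return the integer value of `key` from Java-style properties text.

    Falls back to `default` (16, the Main pool default) when the key is absent.
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return int(value.strip())
    return default


# Hypothetical excerpt of conf/default.properties with the Main pool raised to 24.
sample = """\
# hypothetical excerpt, not the real file contents
event.bus.async.threads=24
"""

if __name__ == "__main__":
    print(read_thread_count(sample))
```

In practice the text would come from the default.properties file in the Datameer installation folder rather than from an inline string.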
Depending on the overall Datameer JVM memory consumption, one might want to slightly increase its heap size when changing the number of EventBus threads.
Please get in touch with Datameer support in case of any further questions on Event Bus threads evaluation.