Goal
Understand the individual files captured in a job trace, as recorded by the job-trace-creation.log that is included in every trace.
Learn
While a job is executing, the job-trace-creation.log is written to track logging activity. It gives an overview of which files were successfully captured in the trace.
JOB_LOG
job.log
- Logs job execution activity. This file is of greatest interest to the support team.
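As a quick triage step, job.log can be scanned for error and warning lines before digging deeper. A minimal sketch; the severity keywords below are an assumption and may need adjusting to the actual log format:

    import re

    # Severity keywords are an assumption; adjust to the actual log format.
    PATTERN = re.compile(r"\b(ERROR|WARN|FATAL)\b")

    with open("job.log", encoding="utf-8", errors="replace") as log:
        for lineno, line in enumerate(log, start=1):
            if PATTERN.search(line):
                print(f"{lineno}: {line.rstrip()}")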
JOB_PLAN_ORIGINAL
job-plan-original.dot
- The original definition of the job as sent to the Hadoop cluster.
JOB_PLAN_COMPILED
job-plan-compiled.dot
- The modified definition of the job as sent to the Hadoop cluster. It reflects how the job's steps were reordered for processing based on their dependencies.
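Both plan files carry the .dot extension; assuming they are standard Graphviz DOT files, they can be rendered to images for side-by-side comparison of the original and compiled plans. A minimal sketch that shells out to the Graphviz dot tool (Graphviz must be installed and on the PATH):

    import subprocess

    # Assumes the plan files are standard Graphviz DOT and that the
    # Graphviz "dot" binary is available on the PATH.
    for plan in ("job-plan-original.dot", "job-plan-compiled.dot"):
        png = plan.replace(".dot", ".png")
        subprocess.run(["dot", "-Tpng", plan, "-o", png], check=True)
        print(f"rendered {plan} -> {png}")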
JOB_DEFINITION
job-definition.json
- Job definition in JSON format, compatible with the Datameer REST API.
JOB_INPUT_DEFINITION
job-definition-<xxx>.json
- <xxx> denotes the original name of the job, which is incorporated into this file name. The file defines job specifics in JSON format.
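Both definition files are plain JSON and can be pretty-printed for easier reading. The sketch below makes no assumptions about their schema:

    import json
    from pathlib import Path

    # Covers job-definition.json and any job-definition-<xxx>.json;
    # no assumptions are made about the JSON schema itself.
    for path in Path(".").glob("job-definition*.json"):
        data = json.loads(path.read_text(encoding="utf-8"))
        print(f"--- {path} ---")
        print(json.dumps(data, indent=2))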
JOB_CONF
job-conf.xml
- Job configuration used when running jobs locally.
JOB_CONF_CLUSTER
job-conf-cluster.xml
- Captured when the execution framework is Tez or SparkClient. It merges the Datameer configuration with the Hadoop configuration.
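Assuming the two conf files follow the standard Hadoop configuration layout (property elements with name and value children), the merged settings can be listed with the standard library alone:

    import xml.etree.ElementTree as ET

    # Assumes the standard Hadoop configuration layout:
    # <configuration><property><name>...</name><value>...</value></property>...
    tree = ET.parse("job-conf-cluster.xml")
    for prop in tree.getroot().iter("property"):
        print(f"{prop.findtext('name')} = {prop.findtext('value')}")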
TASK_LOGS
tasklog-spark-submit.log
- Captured when the execution framework is SparkCluster. It contains a record of all activity for the tasks executed by this job.
ERROR_LOGS
When exceptions occur, a separate error log file is created, with a name that reflects the execution context.
Example:
error-map-local-<numbering>.log.gz
error-map-attempt-<timestamp>.log.gz
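Because the error logs are gzip-compressed, they can be scanned in place without unpacking them first. A minimal sketch; searching for "Exception" is an assumption about the log contents and can be widened as needed:

    import gzip
    from pathlib import Path

    # "Exception" as a search keyword is an assumption; widen as needed.
    for path in Path(".").glob("error-*.log.gz"):
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as log:
            for line in log:
                if "Exception" in line:
                    print(f"{path}: {line.rstrip()}")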