Datameer job failed on cluster
INFO [2019-12-31 08:35:26.557] [JobScheduler thread-1] (JobScheduler.java:434) - Starting job 1042387 (DAS Version: 7.2.10, Revision: 5cc065608bdebcb190d2aa0b6f4d0d784377ad6b, Hadoop-Distribution: 2.6.0-cdh5.16.1 (cdh-5.16.1), JVM: 1.8)
INFO [2019-12-31 08:35:26.561] [JobScheduler thread-1] (NormalJobDriver.java:139) - Checking if JobExecutionValueObject{_id=1042387} can be started
INFO [2019-12-31 08:35:26.589] [JobScheduler thread-1] (JobScheduler.java:472) - [Job 1042387] Preparing job in job scheduler thread for WorkbookConfigurationImpl{id=979}...
INFO [2019-12-31 08:35:26.589] [JobScheduler thread-1] (JobScheduler.java:475) - [Job 1042387] Preparing job in job scheduler thread for WorkbookConfigurationImpl{id=979}... done (0 sec)
INFO [2019-12-31 08:35:26.775] [JobScheduler worker5-thread-45] (JobSchedulerJob.java:96) - [Job 1042387] Preparing job for WorkbookConfigurationImpl{id=979}...
INFO [2019-12-31 08:35:27.493] [JobScheduler worker5-thread-45] (JobSchedulerJob.java:101) - [Job 1042387] Preparing job for WorkbookConfigurationImpl{id=979}... done (0 sec)
INFO [2019-12-31 08:35:27.572] [JobScheduler worker5-thread-45] (JobSchedulerJob.java:116) - Starting job ...
INFO [2019-12-31 08:35:27.588] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'PUSH_BODY' (keep=false) with datameer.dap.common.sheet.PartitionedLinkedDataSheetBuilder@3b26376
INFO [2019-12-31 08:35:27.591] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'SEND' (keep=false) with datameer.dap.common.sheet.PartitionedLinkedDataSheetBuilder@8d994c3
INFO [2019-12-31 08:35:27.591] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'Campaign_ids' (keep=false) with datameer.dap.common.formula.aggregation.AggregationSheetBuilder@521add6d
INFO [2019-12-31 08:35:27.592] [JobScheduler worker5-thread-45] (AggregationSheetBuilder.java:106) - Aggregation steps for sheet 'Campaign_ids':
Step 1 (inner formulas):
Operations: GROUPBY(#input!app_key), GROUPBY(#input!body_campaign_id), #input!is_cleaning_campaign, #input!event_offset, #input!is_silent_campaign, #input!body_resource
Step 2 (partitioning):
Operations: { #input!A, #input!B }, { #input!C, #input!D, #input!E, #input!F }
Step 3a1: (aggregate to intermediates)
Operations: { #key!primaryA, #key!primaryB }, { GROUPLAST{toIntermediate}(#value!A;#value!B), GROUPLAST{toIntermediate}(#value!C;#value!B), GROUPLAST{toIntermediate}(#value!D;#value!B) }
Step 3ab: (aggregate from intermediates)
Operations: #key!primaryA, #key!primaryB, GROUPLAST{fromIntermediate}(#value!A), GROUPLAST{fromIntermediate}(#value!B), GROUPLAST{fromIntermediate}(#value!C)
Step 4 (outer formulas):
Operations: #input!A, #input!B, #input!C, #input!D, #input!E
Execution order: 0, 1, 2, 3, 4
INFO [2019-12-31 08:35:27.592] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'Format_date' (keep=false) with FormulaSheetBuilder{Format_date, source-sheet=SEND, expressions=ExpressionContext[column='app_key', id='0', index=0, expression=#SEND!app_key], ExpressionContext[column='event_id', id='1', index=1, expression=#SEND!event_id], ExpressionContext[column='event_offset', id='2', index=2, expression=#SEND!event_offset], ...}
INFO [2019-12-31 08:35:27.593] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'SEND_join_PUSH_BODY' (keep=false) with JoinSheetBuilder{SEND_join_PUSH_BODY, category=TWO_MEMBER_JOIN, join-couplings=[JoinStep{[JoinSource{Format_date, [#Format_date!app_key, #Format_date!body_campaign_id], includes=[#Format_date!app_key, #Format_date!event_id, #Format_date!event_offset, #Format_date!event_occurred, #Format_date!event_processed, #Format_date!device_named_user_id, #Format_date!body_push_id, #Format_date!body_group_id, #Format_date!channel_id, #Format_date!platform, #Format_date!body_campaign_id]}, JoinSource{Campaign_ids, [#Campaign_ids!app_key, #Campaign_ids!body_campaign_id], includes=[#Campaign_ids!is_cleaning_campaign, #Campaign_ids!is_silent_campaign, #Campaign_ids!body_resource]}]}]}
INFO [2019-12-31 08:35:27.593] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'app_info' (keep=false) with datameer.dap.common.sheet.StaticDataSheetBuilder@5cfb0cdd
INFO [2019-12-31 08:35:27.593] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'GroupByDay' (keep=false) with datameer.dap.common.formula.aggregation.AggregationSheetBuilder@4cc26de7
INFO [2019-12-31 08:35:27.594] [JobScheduler worker5-thread-45] (AggregationSheetBuilder.java:106) - Aggregation steps for sheet 'GroupByDay':
Step 1 (inner formulas):
Operations: GROUPBY(#input!app_key), GROUPBY(ASDATE(#input!event_occurred)), GROUPBY(#input!is_cleaning_campaign), GROUPBY(#input!is_silent_campaign), GROUPBY(#input!body_resource)
Step 2 (partitioning):
Operations: { #input!A, #input!B, #input!C, #input!D, #input!E }, { }
Step 3a1: (aggregate to intermediates)
Operations: { #key!primaryA, #key!primaryB, #key!primaryC, #key!primaryD, #key!primaryE }, { GROUPCOUNT{toIntermediate}() }
Step 3ab: (aggregate from intermediates)
Operations: #key!primaryA, #key!primaryB, #key!primaryC, #key!primaryD, #key!primaryE, GROUPCOUNT{fromIntermediate}(#value!A)
Step 4 (outer formulas):
Operations: #input!A, #input!B, #input!C, #input!D, #input!E, #input!F
Execution order: 0, 1, 2, 3, 4, 5
INFO [2019-12-31 08:35:27.594] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'full_app_information' (keep=false) with JoinSheetBuilder{full_app_information, category=TWO_MEMBER_JOIN, join-couplings=[JoinStep{[JoinSource{GroupByDay, [#GroupByDay!app_key], includes=[#GroupByDay!app_key, #GroupByDay!event_occurred, #GroupByDay!is_cleaning_campaign, #GroupByDay!is_silent_campaign, #GroupByDay!body_resource, #GroupByDay!num_of_sends]}, JoinSource{app_info, [#app_info!app_key], includes=[#app_info!app_name, #app_info!opco_code, #app_info!env_name]}]}]}
INFO [2019-12-31 08:35:27.595] [JobScheduler worker5-thread-45] (WorkbookJob.java:260) - Registering operations for sheet 'KPI_Push_Sends_daily' (keep=true) with FormulaSheetBuilder{KPI_Push_Sends_daily, source-sheet=full_app_information, expressions=ExpressionContext[column='app_key', id='0', index=0, expression=#full_app_information!app_key], ExpressionContext[column='event_occurred', id='5', index=1, expression=#full_app_information!event_occurred], ExpressionContext[column='is_cleaning_campaign', id='6', index=2, expression=#full_app_information!is_cleaning_campaign], ...}
INFO [2019-12-31 08:35:28.064] [JobExecutionPlanRunner] (JobExecutionTraceService.java:85) - Creating local job execution trace log at /opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/temp/cache/dfscache/local-job-execution-traces/1042387
INFO [2019-12-31 08:35:28.067] [JobExecutionPlanRunner] (DeferredJoinConnector.java:99) - Selected join strategy MEMORY_BACKED_MAP_SIDE (disabled: []) with small input: sheet=app_info, compressed=5.3 KB, uncompressed=6.0 KB
INFO [2019-12-31 08:35:28.067] [JobExecutionPlanRunner] (DisconnectedRecordStream.java:60) - Disconnected stream DisconnectedRecordStream{sheetName=full_app_information, description=Disconnected record stream} got connected to its inputs.
INFO [2019-12-31 08:35:28.067] [JobExecutionPlanRunner] (DeferredJoinConnector.java:109) - In order to find join source that fits in memory, we have to compute the results of 'Format_date' first.
INFO [2019-12-31 08:35:28.067] [JobExecutionPlanRunner] (DisconnectedRecordStream.java:64) - Cannot connect disconnected stream DisconnectedRecordStream{sheetName=SEND_join_PUSH_BODY, description=Disconnected record stream} to its inputs. Have to compute [RecordStream{sheetName=Format_date, description=Expression record processor}] first.
INFO [2019-12-31 08:35:28.068] [JobExecutionPlanRunner] (TezClusterSession.java:43) - Creating a TEZ job for session Workbook job (1042387): KPI_Push_Sends with a job count 1
INFO [2019-12-31 08:35:28.068] [JobExecutionPlanRunner] (ClusterJobFlow.java:149) - Created configuration for StageGraphClusterJobFlow{stages=[Stage{input=ExternalInputConnector{}, streams=[RecordStream{sheetName=SEND, description=Keep columns 0,1,2,3,4,5,6,7,8,9,10}, RecordStream{sheetName=Format_date, description=Expression record processor}]}]}: ClusterJobConfiguration{}
INFO [2019-12-31 08:35:28.068] [JobExecutionPlanRunner] (ClusterSession.java:167) - -------------------------------------------
INFO [2019-12-31 08:35:28.068] [JobExecutionPlanRunner] (ClusterSession.java:168) - Running cluster job (TEZ) for 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor)'
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterSession.java:169) - ClusterMetadata{hdfsBlockSize=134217728, memoryInMb=241664, vCoreCount=56, workerNodeCount=26, queueInfo={root.datameer=0.36158192, root.jboss=0.010593221, root.utilization=1.0, root.mmassett=1.0, root.msgplus=0.105932206, root.sbhaumik=1.0, root.vision=1.0, root.dkarnata=1.0, root.skarleka=1.0, root.sdc=1.0, root.vfstart=1.0, root.default=1.0}}
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterMetadata.java:83) - Datameer is using the default queue settings and assumes no resource limitations.
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterSession.java:175) - Using 18 vcores per node (das.yarn.available-node-vcores=18)
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterMetadata.java:83) - Datameer is using the default queue settings and assumes no resource limitations.
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterSession.java:178) - Using 75.0 MB memory per node (das.yarn.available-node-memory=75)
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterSession.java:184) - Output (intermediate): sheet=Format_date, description=Expression record processor (d39e33c2-b4ca-4f5c-b08d-e2d2151eb001)
INFO [2019-12-31 08:35:28.069] [JobExecutionPlanRunner] (ClusterSession.java:187) - -------------------------------------------
INFO [2019-12-31 08:35:28.073] [JobExecutionPlanRunner] (ClusterMetadata.java:83) - Datameer is using the default queue settings and assumes no resource limitations.
INFO [2019-12-31 08:35:28.073] [JobExecutionPlanRunner] (MrRecordConsumerProvider.java:90) - Registering consumer for: IntermediateArtifactData{targetPath=null, tempOutput=0c761b76-d346-45e7-a208-62128f7b6524, description=Intermediate output data for 'Format_date'}
INFO [2019-12-31 08:35:28.151] [JobExecutionPlanRunner] (TezJob.java:173) - Submitting DAG to Tez cluster with name:Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:28.152] [JobExecutionPlanRunner] (LightweightDasJobContext.java:75) - Synchronize global task local resources with remote hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.208] [JobExecutionPlanRunner] (LightweightDasJobContext.java:91) - Synchronize job-specific task local resources with remote hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.209] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource 'temp/tez-execution/plugin-tez-1577707754000.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.211] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/webapps/conductor/WEB-INF/lib/hadoop-mapreduce-client-core-2.6.0-cdh5.16.1.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.264] [JobExecutionPlanRunner] (TezSessionImpl.java:45) - Creating new TezClient...
INFO [2019-12-31 08:35:28.275] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/commons-collections4-4.1.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.278] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/RoaringBitmap-0.5.11.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.280] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-api-0.8.5-dm2.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.283] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-common-0.8.5-dm2.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.285] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-dag-0.8.5-dm2.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.288] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-runtime-internals-0.8.5-dm2.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.290] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-runtime-library-0.8.5-dm2.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.293] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-yarn-timeline-history-with-acls-0.8.5.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.295] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/tez-yarn-timeline-history-0.8.5.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.298] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/hadoop-shim-0.8.5.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.300] [JobExecutionPlanRunner] (LightweightDasJobContext.java:117) - Synchronize additional task local resource '/opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/tmp/das-plugins6978322647035014776.folder/plugin-tez-7.2.10.zip/plugin-tez-7.2.10/lib/compile/hadoop-shim-2.6-0.8.5.jar' with remote filesystem hdfs://nameservice1/apps/datameer/jobjars
INFO [2019-12-31 08:35:28.315] [JobExecutionPlanRunner] (TezClient.java:212) - Tez Client Version: [ component=tez-api, version=0.8.5, revision=1775b894c79cb08acf40d0465167d2825c6c1b49, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2017-11-06T23:07:57Z ]
INFO [2019-12-31 08:35:28.315] [JobExecutionPlanRunner] (TezClientFacade.java:332) - Starting Tez session ...
INFO [2019-12-31 08:35:28.325] [JobExecutionPlanRunner] (TezClient.java:455) - Session mode. Starting session.
INFO [2019-12-31 08:35:28.325] [JobExecutionPlanRunner] (TezClientUtils.java:176) - Using tez.lib.uris value from configuration: hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/commons-collections4-4.1.jar_45af6a8e5b51d5945de6c7411e290bd1.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/RoaringBitmap-0.5.11.jar_5598b28306a4480ad5c7debcdb516df2.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-api-0.8.5-dm2.jar_aa081c5bc59126ddb7d5da095a88249d.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-common-0.8.5-dm2.jar_8c40d59ff4c99faeaf696874c94e4ab7.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-dag-0.8.5-dm2.jar_025df805196ce705ebc366d723b1fd39.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-runtime-internals-0.8.5-dm2.jar_25745e82eed193d44743fdd983e55ffd.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-runtime-library-0.8.5-dm2.jar_94905dbf2c6b363a07dbdb7a515fc807.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-yarn-timeline-history-with-acls-0.8.5.jar_e61223ed428ea4d5252c91186cf7cc0c.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/tez-yarn-timeline-history-0.8.5.jar_62bd4ebebf59ad014fdc6103e15afb9a.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/hadoop-shim-0.8.5.jar_f22e34ae81da633517f37ddf20c05eca.jar,hdfs://nameservice1/apps/datameer/jobjars/7.2.10/tez-jars/hadoop-shim-2.6-0.8.5.jar_cf6cc3113629b8e11271fc72cbd2228e.jar
INFO [2019-12-31 08:35:28.325] [JobExecutionPlanRunner] (TezClientUtils.java:178) - Using tez.lib.uris.classpath value from configuration: null
INFO [2019-12-31 08:35:28.409] [JobExecutionPlanRunner] (ConfiguredRMFailoverProxyProvider.java:100) - Failing over to rm01
INFO [2019-12-31 08:35:28.417] [JobExecutionPlanRunner] (TezCommonUtils.java:122) - Tez system stage directory hdfs://nameservice1/apps/datameer/temp/job-1042387/.staging-f8528569-4a01-4938-b13c-221de5e19dd0/.tez/application_1569828507081_190296 doesn't exist and is created
INFO [2019-12-31 08:35:28.679] [JobExecutionPlanRunner] (YarnClientImpl.java:260) - Submitted application application_1569828507081_190296
INFO [2019-12-31 08:35:28.683] [JobExecutionPlanRunner] (TezClient.java:489) - The url to track the Tez Session: http://vghd02hr.dc-ratingen.de:8088/proxy/application_1569828507081_190296/
INFO [2019-12-31 08:35:28.683] [JobExecutionPlanRunner] (TezClientFacade.java:334) - Starting Tez session done
INFO [2019-12-31 08:35:28.683] [JobExecutionPlanRunner] (TezClientFacade.java:336) - Wait until Tez session ready (remaining attempts 2) ...
INFO [2019-12-31 08:35:43.819] [JobExecutionPlanRunner] (TezClientFacade.java:338) - Wait until Tez session ready done
INFO [2019-12-31 08:35:43.825] [JobExecutionPlanRunner] (DagRunner.java:63) - Submitting DAG 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)'.
INFO [2019-12-31 08:35:43.825] [JobExecutionPlanRunner] (TezClient.java:534) - Submitting dag to TezSession, sessionName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), applicationId=application_1569828507081_190296, dagName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:44.422] [JobExecutionPlanRunner] (TezClient.java:630) - Submitted dag to TezSession, sessionName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), applicationId=application_1569828507081_190296, dagId=dag_1569828507081_190296_1, dagName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:44.445] [JobExecutionPlanRunner] (DagRunner.java:65) - Submitted DAG 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)'.
INFO [2019-12-31 08:35:44.445] [JobExecutionPlanRunner] (DagRunner.java:113) - Waiting for DAG to finish: DAG name=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), polling interval=500ms
INFO [2019-12-31 08:35:44.456] [JobExecutionPlanRunner] (ConfiguredRMFailoverProxyProvider.java:100) - Failing over to rm01
INFO [2019-12-31 08:35:46.300] [JobExecutionPlanRunner] (DagRunner.java:139) - DAG initialized: CurrentState=Running, DAG name=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:46.302] [JobExecutionPlanRunner] (DagRunner.java:187) - DAG status: state=RUNNING, progress=0%, Unknown task count, name=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:47.338] [JobExecutionPlanRunner] (DagRunner.java:187) - DAG status: state=FAILED, progress=0%, Unknown task count, name=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)
INFO [2019-12-31 08:35:47.338] [JobExecutionPlanRunner] (DagRunner.java:154) - Finished DAG 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)' (application=application_1569828507081_190296) with status=FAILED
INFO [2019-12-31 08:35:47.338] [JobExecutionPlanRunner] (DatameerTezUtils.java:32) - Tasks: succeeded=0, failed=0 for 'Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7)'
INFO [2019-12-31 08:35:47.338] [JobExecutionPlanRunner] (PoolingTezSessionFactory.java:55) - Returning TezSessionImpl{clientName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), applicationId=application_1569828507081_190296} to ReuseSessionFactory{source=AlwaysNewSessionFactory{}}.
INFO [2019-12-31 08:35:47.350] [JobExecutionPlanRunner] (TezJob.java:162) - Completed Tez job 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor)' with output path: hdfs://nameservice1/apps/datameer/temp/job-1042387/...
INFO [2019-12-31 08:35:47.350] [JobExecutionPlanRunner] (ClusterJob.java:116) - Tez Execution Framework completed cluster job 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor)' [19 sec]
ERROR [2019-12-31 08:35:47.351] [JobExecutionPlanRunner] (ClusterSession.java:217) - Failed to run cluster job 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor)' [19 sec]
datameer.com.google.common.base.VerifyException: Finished DAG 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)' (application_1569828507081_190296) with state FAILED and diagnostics: [Vertex failed, vertexName=Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7), vertexId=vertex_1569828507081_190296_1_00, diagnostics=[Vertex vertex_1569828507081_190296_1_00 [Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7)] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: Input:ExternalInputConnector{} initializer failed, vertex=vertex_1569828507081_190296_1_00 [Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7)], datameer.dap.sdk.importjob.NoInputPathFoundException: Could not find any existing file for
at datameer.dap.sdk.importjob.FileSplitter.checkIfAnyFileFound(FileSplitter.java:125)
at datameer.dap.sdk.importjob.FileSplitter.createSplits(FileSplitter.java:147)
at datameer.dap.common.partition.filerange.PartitionFileDataLinkSplitter.createSplits(PartitionFileDataLinkSplitter.java:46)
at datameer.dap.common.job.mr.input.v2.impl.CombineSplitter.createSplits(CombineSplitter.java:41)
at datameer.dap.common.graphv2.hadoop.ExternalDataReader.createSplits(ExternalDataReader.java:56)
at datameer.plugin.tez.input.TezInputFormat$DataHandleInputFormat.createSplits(TezInputFormat.java:62)
at datameer.plugin.tez.input.TezSplitGenerator.createSplitEvents(TezSplitGenerator.java:84)
at datameer.plugin.tez.input.TezSplitGenerator.initialize(TezSplitGenerator.java:73)
at datameer.plugin.tez.input.TezSplitGenerator.initializeEvents(TezSplitGenerator.java:69)
at datameer.plugin.tez.input.AbstractDatameerInputInitializer.initialize(AbstractDatameerInputInitializer.java:35)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
], DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0]
at datameer.com.google.common.base.Verify.verify(Verify.java:125)
at datameer.plugin.tez.TezJob.runTezDag(TezJob.java:178)
at datameer.plugin.tez.TezJob.runImpl(TezJob.java:152)
at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:113)
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:191)
at datameer.dap.common.graphv2.ClusterSession.runClusterJobs(ClusterSession.java:311)
at datameer.dap.common.graphv2.job.DatameerJobSession.runClusterJobs(DatameerJobSession.java:144)
at datameer.dap.common.graphv2.job.DatameerJobSession.runAllJobsWithoutLifeCycle(DatameerJobSession.java:133)
at datameer.dap.common.graphv2.job.DatameerJobSession.lambda$runAllJobs$0(DatameerJobSession.java:117)
at datameer.dap.sdk.util.OperationChain.lambda$addOperation$0(OperationChain.java:14)
at datameer.dap.sdk.util.OperationChain.executeAll(OperationChain.java:30)
at datameer.dap.common.graphv2.job.DatameerJobSession.runAllJobs(DatameerJobSession.java:119)
at datameer.dap.common.graphv2.JobExecutionPlanRunner.run(JobExecutionPlanRunner.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:123)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:215)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:34)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:30)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at datameer.dap.common.impersonation.ClusterAwareUgiImpersonator.doAs(ClusterAwareUgiImpersonator.java:43)
at datameer.dap.common.impersonation.ConfigurableImpersonator.doAs(ConfigurableImpersonator.java:34)
at datameer.dap.common.security.RunAsThread.run(RunAsThread.java:30)
INFO [2019-12-31 08:35:47.351] [JobExecutionPlanRunner] (ClusterSession.java:220) - -------------------------------------------
INFO [2019-12-31 08:35:47.351] [JobExecutionPlanRunner] (ClusterSession.java:81) - Committing failed job and moving job output from 'hdfs://nameservice1/apps/datameer/temp/job-1042387' to 'hdfs://nameservice1/apps/datameer/workbooks/979/1042387'.
INFO [2019-12-31 08:35:47.403] [JobExecutionPlanRunner] (ClusterSession.java:129) - Completed job flow with FAILURE and 0 completed cluster jobs. (hdfs://nameservice1/apps/datameer/workbooks/979/1042387)
INFO [2019-12-31 08:35:47.403] [JobExecutionPlanRunner] (PoolingTezSessionFactory.java:147) - Closing ReuseSessionFactory{source=AlwaysNewSessionFactory{}}.
INFO [2019-12-31 08:35:47.403] [JobExecutionPlanRunner] (TezSessionImpl.java:70) - Closing TezSessionImpl{clientName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), applicationId=application_1569828507081_190296}
INFO [2019-12-31 08:35:47.403] [JobExecutionPlanRunner] (TezClient.java:652) - Shutting down Tez Session, sessionName=Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20), applicationId=application_1569828507081_190296
INFO [2019-12-31 08:35:47.942] [JobExecutionPlanRunner] (HarBuilder.java:78) - Created har file at hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-metadata.har.tmp out of [hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/cluster-jobs.json, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-conf.xml, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-definition-PUSH_BODY.json, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-definition-SEND.json, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-definition-app_info.json, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-definition.json, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-plan-compiled.dot, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-plan-original.dot, hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/resources/job-workbook.json]. Moving it to hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-metadata.har
INFO [2019-12-31 08:35:48.024] [JobExecutionPlanRunner] (DatameerJobSession.java:49) - Deleting temporary job directory hdfs://nameservice1/apps/datameer/temp/job-1042387
INFO [2019-12-31 08:35:48.038] [JobExecutionPlanRunner] (DatameerJobStorage.java:191) - Copying job execution trace log from /opt/SP/apps/datameer/Datameer-7.2.10-cdh-5.16.1/temp/cache/dfscache/local-job-execution-traces/1042387 to hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-execution-trace.log
INFO [2019-12-31 08:35:48.063] [JobScheduler worker5-thread-45] (DapJobCounter.java:174) - Job completed with failure with 0 cluster jobs and following counters:
INFO [2019-12-31 08:35:48.063] [JobScheduler worker5-thread-45] (DapJobCounter.java:177) - WORKBOOK_DROPPED_RECORDS: 0
INFO [2019-12-31 08:35:48.063] [JobScheduler worker5-thread-45] (DapJobCounter.java:177) - WORKBOOK_CONSUMED_RECORD_COUNT: 0
INFO [2019-12-31 08:35:48.063] [JobScheduler worker5-thread-45] (DapJobCounter.java:177) - WORKBOOK_CONSUMED_BYTES: 0
ERROR [2019-12-31 08:35:48.762] [JobScheduler thread-1] (JobScheduler.java:928) - Job 1042387 failed with exception.
java.lang.RuntimeException: Failed to run cluster job for 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor)'
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:218)
at datameer.dap.common.graphv2.ClusterSession.runClusterJobs(ClusterSession.java:311)
at datameer.dap.common.graphv2.job.DatameerJobSession.runClusterJobs(DatameerJobSession.java:144)
at datameer.dap.common.graphv2.job.DatameerJobSession.runAllJobsWithoutLifeCycle(DatameerJobSession.java:133)
at datameer.dap.common.graphv2.job.DatameerJobSession.lambda$runAllJobs$0(DatameerJobSession.java:117)
at datameer.dap.sdk.util.OperationChain.lambda$addOperation$0(OperationChain.java:14)
at datameer.dap.sdk.util.OperationChain.executeAll(OperationChain.java:30)
at datameer.dap.common.graphv2.job.DatameerJobSession.runAllJobs(DatameerJobSession.java:119)
at datameer.dap.common.graphv2.JobExecutionPlanRunner.run(JobExecutionPlanRunner.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:123)
at datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:215)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:34)
at datameer.dap.common.security.RunAsThread$1.run(RunAsThread.java:30)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at datameer.dap.common.impersonation.ClusterAwareUgiImpersonator.doAs(ClusterAwareUgiImpersonator.java:43)
at datameer.dap.common.impersonation.ConfigurableImpersonator.doAs(ConfigurableImpersonator.java:34)
at datameer.dap.common.security.RunAsThread.run(RunAsThread.java:30)
Caused by: datameer.com.google.common.base.VerifyException: Finished DAG 'Workbook job (1042387): KPI_Push_Sends#Format_date(Expression record processor) (06c52e95-49dc-472e-8e6b-235e24179f20)' (application_1569828507081_190296) with state FAILED and diagnostics: [Vertex failed, vertexName=Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7), vertexId=vertex_1569828507081_190296_1_00, diagnostics=[Vertex vertex_1569828507081_190296_1_00 [Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7)] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: Input:ExternalInputConnector{} initializer failed, vertex=vertex_1569828507081_190296_1_00 [Map for sheets:[SEND, Format_date] (be275db2-e772-4f4a-86d5-63336eb85aa7)], datameer.dap.sdk.importjob.NoInputPathFoundException: Could not find any existing file for
at datameer.dap.sdk.importjob.FileSplitter.checkIfAnyFileFound(FileSplitter.java:125)
at datameer.dap.sdk.importjob.FileSplitter.createSplits(FileSplitter.java:147)
at datameer.dap.common.partition.filerange.PartitionFileDataLinkSplitter.createSplits(PartitionFileDataLinkSplitter.java:46)
at datameer.dap.common.job.mr.input.v2.impl.CombineSplitter.createSplits(CombineSplitter.java:41)
at datameer.dap.common.graphv2.hadoop.ExternalDataReader.createSplits(ExternalDataReader.java:56)
at datameer.plugin.tez.input.TezInputFormat$DataHandleInputFormat.createSplits(TezInputFormat.java:62)
at datameer.plugin.tez.input.TezSplitGenerator.createSplitEvents(TezSplitGenerator.java:84)
at datameer.plugin.tez.input.TezSplitGenerator.initialize(TezSplitGenerator.java:73)
at datameer.plugin.tez.input.TezSplitGenerator.initializeEvents(TezSplitGenerator.java:69)
at datameer.plugin.tez.input.AbstractDatameerInputInitializer.initialize(AbstractDatameerInputInitializer.java:35)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
], DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0]
at datameer.com.google.common.base.Verify.verify(Verify.java:125)
at datameer.plugin.tez.TezJob.runTezDag(TezJob.java:178)
at datameer.plugin.tez.TezJob.runImpl(TezJob.java:152)
at datameer.dap.common.graphv2.ClusterJob.run(ClusterJob.java:113)
at datameer.dap.common.graphv2.ClusterSession.execute(ClusterSession.java:191)
... 19 more
INFO [2019-12-31 08:35:48.786] [JobScheduler thread-1] (JobScheduler.java:1013) - Computing after job completion operations for execution 1042387 (type=NORMAL)
INFO [2019-12-31 08:35:48.786] [JobScheduler thread-1] (JobScheduler.java:1017) - Finished computing after job completion operations for execution 1042387 (type=NORMAL) [0 sec]
WARN [2019-12-31 08:35:49.040] [JobScheduler thread-1] (JobScheduler.java:825) - Job DapJobExecution{id=1042387, type=NORMAL, status=ERROR} completed with status ERROR.
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-dag_1569828507081_190296_1.dot-input.log
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-history.txt.appattempt_1569828507081_190296_000001-input.log
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-stderr-input.log
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-stdout-input.log
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-syslog-input.log
/GNE/4_Final/KPI_Push_Sends.wbk: 1042387/application_1569828507081_190296/tasklog-failed-syslog_dag_1569828507081_190296_1-input.log
-
Hello Ibrahim.
I hope you are doing fine. As far as I can see from the shared job log, it fails with datameer.dap.sdk.importjob.NoInputPathFoundException, which most likely means that the input data is not available for this Workbook. Have you tried to rerun the job? Perhaps there is an issue with a certain DataNode.
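For example, assuming the Workbook's data link reads from an HDFS directory (the exact path is cut off in the exception message, so <input-path> below is only a placeholder), you could check from an edge node that the input location still exists and actually contains files:

hdfs dfs -ls <input-path>                # should list at least one data file
hdfs dfs -count <input-path>             # file count (second column) should be non-zero
hdfs fsck <input-path> -files -blocks    # reports missing or corrupt blocks on DataNodes

If the listing is empty, the upstream import job or a retention/cleanup policy may have removed the data before this Workbook execution started.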
To investigate further, I would need a complete job trace for this execution; see How to Collect a Job Trace.
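In the meantime, the log above already shows where the artifacts for this execution were stored, so you could pull them yourself (a sketch based on the paths in the log; yarn logs only works if log aggregation is enabled on the cluster):

hdfs dfs -get hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-execution-trace.log .
hdfs dfs -get hdfs://nameservice1/apps/datameer/jobhistory/979/1042387/job-metadata.har .
yarn logs -applicationId application_1569828507081_190296 > application_1569828507081_190296.log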
Also, Vodafone has a Datameer support subscription, so you could raise a regular support ticket, which would be a more convenient way to work on this issue.
To raise a support ticket, please navigate to https://support.datameer.com/hc/en-us, sign in with your corporate email, go to Contact Support, and submit a new request.
If you do not have an account associated with your corporate email, please sign up first. You can also submit a ticket without logging in by going to https://support.datameer.com/hc/en-us -> Contact Support -> Submit New Request, but please make sure to use your corporate email rather than a gmail.com address. I would still recommend signing up, though.