Goal
Troubleshoot problematic Import Jobs and Export Jobs using an sFTP connection by implementing additional logging to debug and to determine what is preventing a successful job run.
Learn
Occasionally, depending on the version of SSH/sFTP running on the host, a session caching problem occurs. As a first step, try re-running the job with the following Custom Property:
fs.sftp.enable.session-cache=false
If this does not resolve the issue, remove the above parameter.
To begin debug troubleshooting, configure the artifact in question for a specific execution framework. Then implement the enhanced logging.
For Tez, the following should additionally be set within the import-specific Custom Properties:
das.execution-framework=Tez
fs.sftp.enable.debug=true
tez.task.log.level=DEBUG
tez.am.log.level=DEBUG
Set the Default log severity to:
TRACE
And Logging Customization of:
log4j.category.datameer=TRACE
log4j.category.datameer.awstasks=DEBUG
log4j.category.awstasks.com.jcraft=DEBUG
log4j.category.org.apache.hadoop=DEBUG
Further Information
Often, comparing the sshd_config and ssh_config files from the Datameer Host, Data Nodes, and sFTP Host can be a quick path to resolving sFTP issues. Notably, supported authentication and encryption mechanisms should be identical on all machines.
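The comparison described above can be partly automated. The following is a minimal Python sketch, not a Datameer tool: it assumes the configuration files have already been copied from each machine, and the function names parse_ssh_config and diff_configs are hypothetical. It only reads directives written explicitly in the files, so compiled-in defaults, Match blocks, and the +/-/^ list modifiers of newer OpenSSH versions are not interpreted. For the effective settings, the output of sshd -T or ssh -G is a more reliable input.

```python
from typing import Dict, Set, Tuple

# Standard OpenSSH directives that list algorithms; compared case-insensitively.
KEYS = ("ciphers", "macs", "kexalgorithms", "hostkeyalgorithms")


def parse_ssh_config(text: str) -> Dict[str, Set[str]]:
    """Collect the comma-separated algorithm lists for the directives in KEYS."""
    found: Dict[str, Set[str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Both "Keyword value" and "Keyword=value" forms are accepted.
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1]
        if key in KEYS and key not in found:  # keep the first value seen
            found[key] = {v.strip() for v in value.split(",") if v.strip()}
    return found


def diff_configs(
    a: Dict[str, Set[str]], b: Dict[str, Set[str]]
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Per directive set in both configs, return algorithms found on only one side."""
    out: Dict[str, Tuple[Set[str], Set[str]]] = {}
    for key in KEYS:
        if key in a and key in b:
            only_a, only_b = a[key] - b[key], b[key] - a[key]
            if only_a or only_b:
                out[key] = (only_a, only_b)
    return out
```

To use it, read the sshd_config from the sFTP Host and the ssh_config from the Datameer Host or a Data Node into strings, pass each to parse_ssh_config, and hand both results to diff_configs. Any directive that appears in the result lists algorithms that only one side has configured, which is a reasonable place to start looking when the handshake fails.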