When attempting to run a job that uses S3 storage, the following error is written to the job.log:
UnrecoverableException: java.io.InterruptedIOException: getFileStatus on s3://bucketname/Datameer/jobhistory/confID/jobID/job.log: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
The S3 connection pool cannot release connections fast enough for the workloads being run, so new requests time out while waiting for a free connection.
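To confirm that a failed job hit this specific problem, its job.log can be searched for the timeout signature from the error above. The following is a minimal Python sketch, not a Datameer tool: the function name `find_pool_timeouts` is hypothetical, and the only string it relies on is the message text quoted in this article.

```python
import re

# Signature copied from the error message quoted in this article.
POOL_TIMEOUT = re.compile(r"Timeout waiting for connection from pool")


def find_pool_timeouts(log_text):
    """Return (line_number, line) pairs for log lines that contain the
    S3 connection pool timeout. Hypothetical helper for illustration."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if POOL_TIMEOUT.search(line)
    ]
```

Passing the contents of a downloaded job.log to `find_pool_timeouts` returns the matching lines with their line numbers. An empty list suggests the job failed for a different reason.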
Disable the S3 connection cache. This instructs Datameer jobs to release connections fully after use, instead of leaving them open for faster access.
Set the following property within the Datameer Cluster Configuration:
1. Navigate to the
2. Click on
3. Within the Custom Properties text area, add the following key/value pair:
Note: This may cause a minor reduction in performance for S3-related jobs, because connections must be re-established for every data retrieval from S3.