Description
According to the Apache Hadoop documentation, MapReduce jobs write their history files to the .../history/done_intermediate/ directory in HDFS. This location is configured in mapred-site.xml via the property mapreduce.jobhistory.intermediate-done-dir.
After a MapReduce job completes, its logs are written to HDFS under this directory. The history server periodically scans the intermediate directory and moves any newly available logs to the directory specified by the mapreduce.jobhistory.done-dir parameter in mapred-site.xml. From this location, the history server picks up the logs and displays them in the history server UI.
The MapReduce job history retention policy is controlled by the following properties:
mapreduce.jobhistory.cleaner.enable
- True / False. Default value is True.
mapreduce.jobhistory.cleaner.interval-ms
- How often the job history cleaner checks for files to delete, in milliseconds. Defaults to 86400000 (one day). Files are only deleted if they are older than mapreduce.jobhistory.max-age-ms.
mapreduce.jobhistory.max-age-ms
- Job history files older than this many milliseconds will be deleted when the history cleaner runs. Defaults to 604800000 (one week).
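To make the millisecond values easier to reason about, the retention rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the history server's actual code: the helper name is_expired and the use of a file modification timestamp as the file's age are assumptions made for the example.

```python
from datetime import timedelta

# Default values quoted in the property list above.
CLEANER_INTERVAL_MS = 86_400_000  # mapreduce.jobhistory.cleaner.interval-ms
MAX_AGE_MS = 604_800_000          # mapreduce.jobhistory.max-age-ms


def is_expired(file_time_ms: int, now_ms: int, max_age_ms: int = MAX_AGE_MS) -> bool:
    """Hypothetical helper: True if a history file is older than max_age_ms.

    file_time_ms and now_ms are epoch timestamps in milliseconds. Treating
    the file's modification time as its age is an assumption of this sketch.
    """
    return now_ms - file_time_ms > max_age_ms


# Convert the defaults into human-readable durations.
print(timedelta(milliseconds=CLEANER_INTERVAL_MS))  # 1 day, 0:00:00
print(timedelta(milliseconds=MAX_AGE_MS))           # 7 days, 0:00:00
```

Because the cleaner only runs once per interval, a file that crosses the age limit just after a run would, in this simplified model, remain until the next run. With the defaults, that means a history file could stay visible for somewhat longer than seven days.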