Configuring MapReduce Compression
MR1 | YARN | Description |
---|---|---|
To enable MapReduce intermediate compression: | | |
mapred.compress.map.output=true | mapreduce.map.output.compress=true | Should the outputs of the maps be compressed before being sent across the network? Uses SequenceFile compression. |
mapred.map.output.compression.codec= | mapreduce.map.output.compress.codec= | If the map outputs are compressed, how should they be compressed? (e.g. Snappy) |
To compress the final output of a MapReduce job: | | |
mapred.output.compress=true | mapreduce.output.fileoutputformat.compress=true | Should the job outputs be compressed? |
mapred.output.compression.type=BLOCK | mapreduce.output.fileoutputformat.compress.type=BLOCK | If the job outputs are to be compressed as SequenceFiles, how should they be compressed? Should be one of NONE, RECORD or BLOCK. |
mapred.output.compression.codec= | mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec | If the job outputs are compressed, how should they be compressed? (e.g. Gzip) |
io.compression.codecs=org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec | | A list of the compression codec classes that can be used for compression/decompression. |
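
The MR1-to-YARN renaming in the table can be sketched as a small lookup. This is only an illustration, not part of Hadoop or Datameer: the helper names `to_yarn` and `as_lines` and the sample settings are assumptions made for the example, while the property names and codec classes are copied from the table.

```python
# Sketch: rename MR1 compression properties to their YARN equivalents
# using the table above, then print them in name=value form.

# MR1 name -> YARN name, copied from the table above.
MR1_TO_YARN = {
    "mapred.compress.map.output": "mapreduce.map.output.compress",
    "mapred.map.output.compression.codec": "mapreduce.map.output.compress.codec",
    "mapred.output.compress": "mapreduce.output.fileoutputformat.compress",
    "mapred.output.compression.type": "mapreduce.output.fileoutputformat.compress.type",
    "mapred.output.compression.codec": "mapreduce.output.fileoutputformat.compress.codec",
}


def to_yarn(properties):
    """Return a copy of `properties` with MR1 keys renamed to YARN keys.

    Keys that are not in the table (for example io.compression.codecs)
    are passed through unchanged.
    """
    return {MR1_TO_YARN.get(key, key): value for key, value in properties.items()}


def as_lines(properties):
    """Render properties as a list of `name=value` strings."""
    return [f"{key}={value}" for key, value in properties.items()]


# Hypothetical job settings: Snappy for intermediate map output,
# block-compressed Gzip for the final job output.
mr1_settings = {
    "mapred.compress.map.output": "true",
    "mapred.map.output.compression.codec": "org.apache.hadoop.io.compress.SnappyCodec",
    "mapred.output.compress": "true",
    "mapred.output.compression.type": "BLOCK",
    "mapred.output.compression.codec": "org.apache.hadoop.io.compress.GzipCodec",
}

yarn_settings = to_yarn(mr1_settings)
for line in as_lines(yarn_settings):
    print(line)
```

Keeping the mapping in one dictionary means a settings list written for MR1 can be converted in a single pass, and any property without a renamed counterpart is left untouched.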
RECOMMENDED:
- Using Compression with Hadoop and Datameer
- Compression Options in Hadoop - A Tale of Tradeoffs
- Compression - Hadoop: The Definitive Guide