How well a file compresses depends on the compression method. Sometimes a file does not compress at all, and the raw file is smaller than any compressed version. So which compression method should be used for a given file?

tl;dr

Compression methods

gzip, bzip2 and xz accept similar command line arguments:

  • -c write the compressed output to stdout and keep the original file
  • -9 use the best compression level

compress also accepts the -c flag to write its output to stdout (compress)

Getting the number of bytes

wc counts the number of lines, words, bytes or characters in a file (wc)

  • -c outputs the number of bytes in a file or stdin

Putting it together

wc -c $FILE includes the filename in its output, which may not be what you want. To output just the byte count, pipe the file through cat into wc:

cat "$FILE" | wc -c
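
As an alternative sketch, redirecting the file into wc gives the same bare count without the extra cat process, because wc reads stdin and has no filename to print. The function name byte_count is made up for this example, and the tr call only strips the padding that some wc implementations put in front of the number.

```shell
# Hypothetical helper: print only the byte count of the file given as $1.
byte_count() {
    # wc reads from stdin here, so no filename appears in the output.
    # tr removes the leading spaces some wc implementations add.
    wc -c < "$1" | tr -d ' '
}
```

Usage would look like byte_count "$FILE".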

To use best compression on gzip and count the number of bytes in the compressed output:

gzip -c9 "$FILE" | wc -c

To find the best compression method for a file, run bzip2, gzip, xz and compress on it (plus the uncompressed file for reference) and compare the byte counts. The smallest number wins:

bzip2 -c9 "$FILE" | wc -c
gzip -c9 "$FILE" | wc -c
xz -c9 "$FILE" | wc -c
compress -c "$FILE" | wc -c
cat "$FILE" | wc -c
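
The five commands above can be wrapped in a small function that labels each count and sorts the results, so the winner is on the first line. This is only a sketch: the names compare_sizes and count are invented for this example, and the command -v checks are an added assumption so that a tool that is not installed (compress often is not) is skipped instead of producing a misleading count.

```shell
# Hypothetical helper: read stdin and print its byte count without padding.
count() {
    wc -c | tr -d ' '
}

# Hypothetical helper: print "<bytes> <method>" for the file given as $1,
# one line per method, smallest first.
compare_sizes() {
    file=$1
    {
        # Uncompressed size, for reference.
        printf '%s raw\n' "$(count < "$file")"

        # gzip, bzip2 and xz all accept -c (stdout) and -9 (best compression).
        for tool in gzip bzip2 xz; do
            # Skip tools that are not installed.
            command -v "$tool" >/dev/null 2>&1 || continue
            printf '%s %s\n' "$("$tool" -c9 "$file" | count)" "$tool"
        done

        # compress only gets -c, as in the commands above.
        if command -v compress >/dev/null 2>&1; then
            printf '%s compress\n' "$(compress -c "$file" | count)"
        fi
    } | sort -n
}
```

Running compare_sizes "$FILE" prints one line per method. If raw ends up on the first line, the file is best left uncompressed.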

Links: