Triple compression and I only save 1% in space

bzip2 compression gzip tar

I've been trying to save space on my Linux server, and I had a folder containing, in subfolders, 22GB of images.

So I decided to compress them.

First I used tar:

tar -zcf folder.tar folder 

Then gzip

gzip folder.tar

And finally, for good measure, just in case, bzip2

bzip2 folder.tar.gz

And after all that, the total size of the resulting folder.tar.gz.bz2 archives still came to 22GB! Measured with finer precision, that is a 1% space saving!

Have I done something wrong here? I would expect many times more than a 1% saving!

How else can I compress the files?

Best Answer

Compression ratio depends heavily on what you're compressing. Text compresses so well because it doesn't come close to using the full range of values that the same binary space can represent, and it repeats itself a lot. Formats that do use the full range (e.g. compressed files) can store the same information in less space: they put to work all the byte values that mean nothing in a textual encoding, so a short code can stand for a whole run of characters. That is where the good compression ratio comes from.

If the files are already compressed, you typically won't gain much by compressing them again. If a second pass did save significant space, it would probably be an indication that the first compression algorithm kind of sucks. Judging from the question, I'm going to assume most of these are media files and as such are already compressed (albeit with algorithms that prioritize speed of decompression), so you're probably not going to get much more out of them. It's a blood-from-a-stone scenario: they're already about as small as a general-purpose compressor can make them without losing information.
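To see the difference for yourself, here is a rough sketch that runs gzip over two generated files. The file names are made up for the demo, and the random bytes are only a stand-in for already-compressed data such as JPEG images (both look like noise to a general-purpose compressor), so treat the numbers as illustrative rather than as a measurement of your images.

```shell
# Work in a scratch directory so nothing in the current folder is touched.
workdir=$(mktemp -d)

# Repetitive text: a stand-in for data that compresses well.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "the quick brown fox jumps over the lazy dog" }' \
    > "$workdir/text.txt"

# Random bytes: an assumed stand-in for already-compressed data (e.g. JPEGs).
head -c 1000000 /dev/urandom > "$workdir/random.bin"

# Compress both at the highest gzip level, keeping the originals for comparison.
gzip -9 -c "$workdir/text.txt" > "$workdir/text.txt.gz"
gzip -9 -c "$workdir/random.bin" > "$workdir/random.bin.gz"

# Record the sizes in bytes (tr strips the padding some wc versions add).
text_size=$(wc -c < "$workdir/text.txt" | tr -d ' ')
text_gz_size=$(wc -c < "$workdir/text.txt.gz" | tr -d ' ')
random_size=$(wc -c < "$workdir/random.bin" | tr -d ' ')
random_gz_size=$(wc -c < "$workdir/random.bin.gz" | tr -d ' ')

echo "text:   $text_size -> $text_gz_size bytes"
echo "random: $random_size -> $random_gz_size bytes"

# Clean up the scratch files.
rm -r "$workdir"
```

The text file should shrink to a tiny fraction of its size, while the random file stays the same size or grows slightly because of the gzip header. That second case is roughly what happened to your images.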

If I'm super worried about space I just do a "bzip2 -9" and call it good. I've heard good things about the ratio on XZ though. I haven't used XZ myself (other than to decompress other people's stuff), but it's supposed to have a better ratio than bzip2 but take a little longer to compress/decompress.
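If you want to compare compressors on your own data, something like the following sketch may help. It pipes the same tar stream through each tool exactly once at its highest level and reports the compressed size. The sample folder and its contents are hypothetical placeholders, and any tool that isn't installed is skipped. Note that tar's -z flag already runs gzip, so there is no need for a separate gzip pass after it.

```shell
# Scratch directory with a small hypothetical "folder" to archive.
workdir=$(mktemp -d)
mkdir "$workdir/folder"
awk 'BEGIN { for (i = 0; i < 5000; i++) print "sample line", i }' \
    > "$workdir/folder/notes.txt"

# Size of the uncompressed tar stream, as a baseline.
tar_size=$(tar -cf - -C "$workdir" folder | wc -c | tr -d ' ')
echo "tar alone: $tar_size bytes"

# Pipe the same tar stream through each compressor once, at its highest level.
results=""
for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        size=$(tar -cf - -C "$workdir" folder | "$tool" -9 | wc -c | tr -d ' ')
        echo "$tool -9: $size bytes"
        results="$results $tool=$size"
    else
        echo "$tool: not installed, skipped"
    fi
done

# Clean up the scratch files.
rm -r "$workdir"
```

Once you've picked a tool, the real thing is a single pipeline along the lines of `tar -cf - folder | xz -9 > folder.tar.xz`. For a folder of already-compressed images, though, expect all three to land close to the size of the plain tar.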
