7-Zip Compression – What Combining Methods Actually Does

7-zip, command line, compression

The 7z command-line tool lets you specify multiple compression methods, e.g.:

# 7-zip archive type, strongest (9) compression, methods PPMd, BCJ2, LZMA2
$ 7z a -t7z -mx=9 -m0=PPMd -m1=BCJ2 -m2=LZMA2 myarchive.7z somefile.xml

All methods get used in some way, or at least specified in the metadata:

$ 7z l -slt myarchive.7z

7-Zip [64] 9.22 beta  Copyright (c) 1999-2011 Igor Pavlov  2011-04-18

Listing archive: myarchive.7z

--
Path = myarchive.7z
Type = 7z
Method = LZMA2 PPMD BCJ2
[..]

----------
Path = somefile.xml
[..]
Method = PPMD:o32:mem192m BCJ2 LZMA2:48m
Block = 0

It does not appear to run the file through all three methods and pick the best. Rather, it apparently always uses the first one, since changing the order of the methods affects the file size significantly.

Even if I add multiple files, such as one XML file (PPMd yields best compression) and one binary file (LZMA2 does), it still lists all methods for both files, and doesn't appear to switch dynamically per file.

In fact, the documentation specifically says "You can use any number of methods.", but it does not say what for.

What I'm trying to achieve is a per-file "try multiple methods, pick whichever is best" archive. I can of course manually achieve this with a little scripting, but presumably, chaining compression methods should do exactly that?

Best Answer

Generally, data that has already been compressed cannot be compressed much further. Once the first compression method has been applied, the remaining methods cannot decrease the file size significantly.

The -mN=X switch is mainly useful for specifying filters. The following list is taken from the Windows help file:

Supported filters:

Delta: Delta filter (“It's possible to set delta offset in bytes. For example, to compress 16-bit stereo WAV files, you can set "0=Delta:4". Default delta offset is 1.”)

BCJ: converter for x86 executables

BCJ2: converter for x86 executables (version 2) (“BCJ2 is a Branch converter for 32-bit x86 executables (version 2). It converts some branch instructions for increasing further compression.”)

ARM: converter for ARM (little endian) executables

ARMT: converter for ARM Thumb (little endian) executables

IA64: converter for IA-64 executables

PPC: converter for PowerPC (big endian) executables

SPARC: converter for SPARC executables

Also from the help file, an advanced example leveraging multiple output streams of the BCJ2 filter:

7z a -t7z archive.7z *.exe *.dll -m0=BCJ2 -m1=LZMA:d23 -m2=LZMA:d19 -m3=LZMA:d19 -mb0:1 -mb0s1:2 -mb0s2:3

adds *.exe and *.dll files to archive archive.7z using BCJ2 converter, LZMA with 8 MB dictionary for main output stream (s0), and LZMA with 512 KB dictionary for s1 and s2 output streams of BCJ2.
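
Since listing several methods does not make 7z choose between them, the per-file "try multiple methods, pick whichever is best" behaviour from the question has to be scripted. The following is only a minimal sketch in POSIX shell, assuming 7z is on the PATH. The function names compress_with and compress_best and the FILE.METHOD.7z naming scheme are hypothetical and not part of 7-Zip.

```shell
#!/bin/sh
# Sketch only: compress_with and compress_best are hypothetical helper names.

# compress_with METHOD ARCHIVE FILE
# Create ARCHIVE from FILE using a single compression method.
compress_with() {
    7z a -t7z -mx=9 -m0="$1" "$2" "$3" >/dev/null
}

# compress_best FILE METHOD...
# Try each METHOD on FILE, keep only the smallest archive, print its path.
# Archives are named FILE.METHOD.7z (an arbitrary naming choice).
compress_best() {
    src=$1
    shift
    best=""
    best_size=""
    for method in "$@"; do
        out="$src.$method.7z"
        rm -f "$out"    # otherwise "7z a" would update an existing archive
        if ! compress_with "$method" "$out" "$src"; then
            rm -f "$out"    # method failed or is unsupported: skip it
            continue
        fi
        size=$(( $(wc -c < "$out") ))    # archive size in bytes
        if [ -z "$best" ] || [ "$size" -lt "$best_size" ]; then
            if [ -n "$best" ]; then rm -f "$best"; fi    # drop previous winner
            best=$out
            best_size=$size
        else
            rm -f "$out"    # not smaller than the current winner
        fi
    done
    # Fails (non-zero status) if no method produced an archive.
    [ -n "$best" ] && printf '%s\n' "$best"
}
```

For example, compress_best somefile.xml PPMd LZMA2 would leave either somefile.xml.PPMd.7z or somefile.xml.LZMA2.7z, whichever is smaller, and print its path. Note that this yields one archive per input file rather than a single archive with a different method per entry.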
