I'm asking about a scenario of copying a big file to a remote server.
The simplest case is:
tar c myfile | ssh myserver tar x
If network connectivity is fast then all is fine.
On a slower network I do
tar c myfile | bzip2 -1 | ssh myserver tar xj
which makes the transfer faster at the cost of CPU time.
Of course I can play with the compression level, typically trying to guess the right one so that my CPU is not too busy and the network is saturated.
Is there a compression utility or a compression flag that would tell bzip2/xz/… to compress as much as possible while the output buffer is busy?
Best Answer
The zstd compression utility has an option, --adapt, that turns on adaptive compression (the option was added in zstd v1.3.6). This adjusts the compression level to "the current perceived I/O conditions". See the zstd manual for more information.
A complete pipeline would use zstd --adapt in place of bzip2 -1 on the sending side and decompress the stream with zstd -d on the remote side before passing it to tar x.
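As a minimal sketch of such a pipeline, assuming zstd 1.3.6 or later is installed on both machines: the helper name send_adaptive is made up for illustration, and myfile and myserver are the placeholders from the question.

```shell
# Hypothetical helper (not part of zstd, tar or ssh): stream FILE through
# adaptive zstd compression to a command that runs the receiving side.
# Usage: send_adaptive FILE REMOTE_CMD...
send_adaptive() {
    file=$1
    shift
    # Sender: archive to stdout and let zstd choose its level from the
    # I/O conditions it perceives on the pipe.
    # Receiver: the quoted string is handed to the remote command, which
    # decompresses and unpacks in its own working directory.
    tar -c -f - "$file" | zstd --adapt | "$@" 'zstd -d | tar -x -f -'
}
```

With the question's names it would be called as send_adaptive myfile ssh myserver. Passing the remote command as arguments is only a convenience here, so the same function can be tried against a local stand-in before using a real server.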
If you add -v to the first zstd in the pipeline, you will get a progress indicator line that includes a marker such as (L7), which indicates the current compression level. For any moderately large amount of data, you would expect it to fluctuate over time, showing that zstd is indeed adapting to the I/O conditions (and presumably also to the data itself).
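The zstd manual also documents optional min and max bounds for --adapt, which may help with the question's concern about keeping the CPU from getting too busy. A sketch combining them with -v follows; the helper name send_adaptive_bounded is made up, and the bounds 1 and 10 are arbitrary illustrative values, not tuned recommendations.

```shell
# Hypothetical variant of the pipeline with verbose progress and bounded
# adaptation. The levels 1 and 10 are illustrative assumptions only.
# Usage: send_adaptive_bounded FILE REMOTE_CMD...
send_adaptive_bounded() {
    file=$1
    shift
    # -v makes zstd report its progress, including the current level.
    # min/max keep the adaptive level inside the given range.
    tar -c -f - "$file" | zstd -v --adapt=min=1,max=10 | "$@" 'zstd -d | tar -x -f -'
}
```

It would be called as send_adaptive_bounded myfile ssh myserver, and the reported level should then stay between the two bounds.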