How to recompress 2 million gzip files without storing them twice

compression, disk-usage, large files, tar

I have about 2 million small gzipped files (60 GiB in total), and I would like to create a single compressed archive containing the uncompressed versions of all of them. Unfortunately, I cannot simply uncompress them all and then create the archive, as I only have about 70 GiB of free disk space. In other words, how can I do the equivalent of tar --file-filter="zcat" zcf file.tar.gz directory, given that GNU tar has no command-line switch like --file-filter?

Best Answer

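One option is to use avfs, which exposes a virtual view of the file system in which the path file.gz# reads as the uncompressed content of file.gz, so nothing is ever unpacked to disk (here assuming a GNU system):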

mkdir ~/AVFS &&
avfsd ~/AVFS &&
cd ~/AVFS/where/your/gz/files/are/ &&
find . -name '*.gz' -type f -printf '%p#\0' |
  tar --null -T - --transform='s/\.gz#$//' -cf - | pigz > /dest/file.tar.gz
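
To see what the --transform expression does to each member name, the same sed expression can be tried on a couple of sample paths. This is only a sketch: strip_gz_marker is a hypothetical helper and the paths are made up for illustration. find emits each name with a trailing # so that tar reads the uncompressed view, and the expression then removes the .gz# suffix so the archive stores the original, uncompressed name.

```shell
# Hypothetical helper: applies the same sed expression that is passed
# to tar's --transform option in the pipeline above.
strip_gz_marker() {
  printf '%s\n' "$1" | sed 's/\.gz#$//'
}

# Sample paths (made up): the first ends in .gz# and is rewritten.
strip_gz_marker './logs/2019/app.log.gz#'   # prints ./logs/2019/app.log

# No trailing .gz#, so the name is left unchanged.
strip_gz_marker './notes.gz.txt'            # prints ./notes.gz.txt
```

Escaping the dot matters: with a bare . the pattern would match any character followed by gz#, not just a literal .gz# suffix.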