Linux – use tar to extract and immediately compress files from tarball

gzip, linux, tar

I have a compressed tarball (e.g. foo.tar.gz) that I wish to extract all the files from, but the files within the tarball are not themselves compressed. That is to say, the contents of foo.tar.gz are uncompressed text files.

There is not enough space on my filesystem to extract the files directly, so I wish to extract these files and compress them immediately as they are written to disk. I can't simply extract everything and then gzip the extracted files because, as I've said, there's not enough space on the filesystem. I would also like to ensure that the original filenames, including their directories, are faithfully preserved on disk. So if one of the files in the tarball is /a/b/c/foo.txt, at the end of the process I would like to have /a/b/c/foo.txt.gz.

How can I accomplish this?

Best Answer

It won't be fast, especially for a large tarball with lots of files, but in bash you can do this:

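# List every member, then extract and compress them one at a time.
tar -tzf tarball.tgz | while IFS= read -r file; do
    tar --no-recursion -xzf tarball.tgz -- "$file"
    # Directory entries are listed too; only regular files can be gzipped.
    if [ -f "$file" ]; then gzip -- "$file"; fi
done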

The first tar command lists the names of the members in the tarball and passes those names to a while read ... loop. Each name is then passed to a second tar command that extracts just that member, which is compressed before the next one is extracted. The --no-recursion flag is used so that extracting a directory entry doesn't also extract all the files under that directory, which is what tar would normally do.
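To make the behaviour of the loop concrete, here is a small self-contained sketch that builds a throwaway tarball and runs the same loop over it. The function name extract_and_gzip, the temporary directory layout and the sample file contents are all made up for illustration, and the regular-file check before gzip is an added assumption to skip directory entries; treat it as a sketch rather than a tested production script.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (name is illustrative): extract each member of the
# gzip-compressed tarball given as $1 into the current directory and gzip
# it right away, so only one uncompressed file exists at a time.
extract_and_gzip() {
    local tarball=$1 file
    tar -tzf "$tarball" | while IFS= read -r file; do
        tar --no-recursion -xzf "$tarball" -- "$file"
        # tar creates directories as it goes; only regular files are gzipped.
        if [ -f "$file" ]; then
            gzip -- "$file"
        fi
    done
}

# Demo on a throwaway tarball containing a/b/c/foo.txt.
work=$(mktemp -d)
mkdir -p "$work/src/a/b/c" "$work/out"
printf 'hello\n' > "$work/src/a/b/c/foo.txt"
tar -czf "$work/demo.tgz" -C "$work/src" a

# Run the loop inside the output directory and show what ended up on disk.
( cd "$work/out" && extract_and_gzip "$work/demo.tgz" )
find "$work/out" -type f
```

After the run, the output directory should contain a/b/c/foo.txt.gz and no uncompressed foo.txt, with the directory structure from the tarball preserved.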

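You'll still need enough free space for the individually compressed files, which together will take up somewhat more than the compressed tarball itself, plus room for the largest single file while it is briefly uncompressed.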
