Mass decompress gzip files without gz extension

compression files gzip

I have a massive number of files with extensions like .0_1234, .0_4213, .0_4132, etc. Some of these are gzip-compressed and some are raw email. I need to determine which are compressed, decompress those, and then rename all files to a common extension once every compressed file has been decompressed. I've found I can use the file command to determine which are compressed, then grep the results and use sed to whittle the output down to a list of files, but I can't work out how to decompress files with these seemingly random extensions. Here's what I have so far:

file * | grep gzip | sed -e 's/: .*$//g'

I'd like to use xargs or something similar to take the list of files in that output and either rename them to .gz so they can be decompressed, or simply decompress them in place.

Best Answer

Don't use gzip directly; use zcat instead, which writes the decompressed data to standard output and doesn't expect a particular extension. You can do the whole thing in one go. Just try to zcat the file and, if that fails because it isn't compressed, cat it instead:

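for f in *; do
    [ -f "$f" ] || continue    # skip directories and other non-regular files
    # Try to decompress; if zcat fails (not gzip), copy the file unchanged.
    ( zcat "$f" || cat "$f" ) > temp &&
    mv temp "$f".ext &&        # final name: original name plus .ext
    rm "$f"
done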

The loop first tries to zcat each file and, if that fails because the file isn't in gzip format, cats it instead. Both commands run inside a subshell so that the output of whichever one succeeds goes through a single redirection into a temporary file (temp). The temporary file is then renamed to the original file name plus an extension (.ext in this example), and the original is deleted. Because the steps are chained with &&, the original is removed only if the earlier steps succeeded.
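If you would rather test each file explicitly instead of relying on zcat failing, something along these lines might work. It is only a sketch: the function name normalize_dir, its two arguments (a directory and the new extension), and the hidden temp-file name are all made up for illustration. It checks each file with gzip -t reading from standard input, so the file name never matters, and uses mktemp rather than a fixed file called temp.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative, not from the answer):
#   normalize_dir DIR EXT
# Decompresses every gzip file in DIR, leaves other files unchanged, and
# gives every regular file the extra extension .EXT.
normalize_dir() {
    dir=$1
    ext=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories / unmatched glob
        if gzip -t < "$f" 2>/dev/null; then  # reads stdin, so no suffix needed
            # Hidden temp file in the same directory, so the final mv stays
            # on one file system and the glob above never matches it.
            tmp=$(mktemp "$dir/.normalize.XXXXXX") || return 1
            if gzip -dc < "$f" > "$tmp" && mv -- "$tmp" "$f.$ext"; then
                rm -- "$f"                   # remove original only on success
            else
                rm -f -- "$tmp"              # clean up after a failed attempt
            fi
        else
            mv -- "$f" "$f.$ext"             # not gzip: just rename
        fi
    done
}
```

You would call it as, say, normalize_dir /path/to/mail eml. Note that mktemp creates its file with owner-only permissions, so the decompressed copies end up with those permissions too; run chmod afterwards if that matters.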
