I have 6 gzipped text files, each of which is ~17G when compressed. I need to see the last few lines (decompressed) of each file to check whether a particular problem is there. The obvious approach is very slow:
for i in *; do zcat "$i" | tail -n3; done
I was thinking I could do something clever like:
for i in *; do tail -n 30 "$i" | gunzip | tail -n 4 ; done
Or
for i in *; do tac "$i" | head -100 | gunzip | tac | tail -n3; done
But both complain about:
gzip: stdin: not in gzip format
I thought that was because I was missing the gzip header, but this also fails:
$ aa=$(head -c 300 file.gz)
$ bb=$(tail -c 300 file.gz)
$ printf '%s%s' "$aa" "$bb" | gunzip
gzip: stdin: unexpected end of file
What I am really looking for is a ztail or ztac, but I don't think those exist. Can anyone come up with a clever trick that lets me decompress and print the last few lines of a compressed file without decompressing the entire thing?
Best Answer
As has already been said, you can't if the files were compressed with standard gzip. If you have control over the compression, you can use dictzip to compress the files instead. It compresses the data in separate blocks, so you can decompress just the last block (typically 64KB). It is also backward compatible with gzip, meaning a dictzipped file is a perfectly legal gzipped file as well.
Another possibility applies if the gzipped file is a concatenation of several already-gzipped files: you could search for the last gzip signature and decompress everything after that.
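For the concatenated case, here is a rough sketch of the "find the last signature" idea. It is not a standard tool: ztail_member is a name I made up, and the 1 MiB search window is an arbitrary default. It assumes GNU grep and tail, and it only helps when the final gzip member starts inside that window. Because the signature bytes can also turn up by chance inside compressed data, each candidate is checked with gzip -t before it is used.

```shell
# ztail_member FILE [LINES] [WINDOW]
# Hypothetical helper: print the last LINES lines of the final gzip member
# of FILE, scanning only the last WINDOW bytes for a member header.
# Assumes GNU grep and tail, and that the final member starts inside WINDOW.
ztail_member() {
    file=$1
    lines=${2:-3}
    window=${3:-1048576}

    size=$(wc -c < "$file") || return 1
    start=$((size - window))
    if [ "$start" -lt 0 ]; then
        start=0
    fi

    # gzip members begin with the bytes 1f 8b 08 (magic number + deflate).
    sig=$(printf '\037\213\010')

    # List candidate header offsets inside the window, last one first.
    offsets=$(tail -c "$window" "$file" |
        LC_ALL=C grep -aboF -e "$sig" | cut -d: -f1 | sort -rn)

    for off in $offsets; do
        from=$((start + off + 1))   # tail -c +N counts bytes from 1
        # Accept a candidate only if it passes gzip's integrity test,
        # which rules out signatures that occur by chance in the data.
        if tail -c +"$from" "$file" | gzip -t 2>/dev/null; then
            tail -c +"$from" "$file" | gzip -dc | tail -n "$lines"
            return 0
        fi
    done

    echo "ztail_member: no complete gzip member in the last $window bytes of $file" >&2
    return 1
}
```

You would call it as ztail_member file.gz 3, or with a larger window as a third argument if the last member is bigger than 1 MiB. Note that it only ever looks at the final member, so if that member holds fewer lines than you asked for, you get fewer lines back.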