Shell – How to decompress and print the last few lines of a compressed text file

command-line, compression, shell

I have 6 gzipped text files, each of which is ~17G when compressed. I need to see the last few lines (decompressed) of each file to check whether a particular problem is there. The obvious approach is very slow:

for i in *; do zcat "$i" | tail -n3; done

I was thinking I could do something clever like:

for i in *; do tail -n 30 "$i" | gunzip | tail -n 4 ; done

Or

for i in *; do tac "$i" | head -100 | gunzip | tac | tail -n3; done

But both complain about:

gzip: stdin: not in gzip format

I thought that was because I was missing the gzip header, but this also fails:

$ aa=$(head -c 300 file.gz)
$ bb=$(tail -c 300 file.gz)
$ printf '%s%s' "$aa" "$bb" | gunzip
gzip: stdin: unexpected end of file

What I am really looking for is a ztail or ztac but I don't think those exist. Can anyone come up with a clever trick that lets me decompress and print the last few lines of a compressed file without decompressing the entire thing?

Best Answer

As has already been said, you can't if the files were compressed with standard gzip, because a gzip stream can only be decoded from its beginning. If you control the compression, you can use dictzip instead: it compresses the file in separate blocks, so you can decompress just the last block (typically 64KB). It is also backward compatible with gzip, meaning a dictzipped file is a perfectly legal gzip file as well.

Another possibility: if the gzipped file is a concatenation of several already-gzipped files, you could search for the last gzip signature and decompress everything from that point on.
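
A rough sketch of the signature-search idea, assuming GNU grep (for -P and -b) and a file that really is several gzip members concatenated. The function name ztail_last_member is made up for illustration. The bytes 1f 8b 08 can also turn up by chance inside compressed data, so the sketch walks the candidates from the end of the file and uses the first one that passes gzip -t:

```bash
# ztail_last_member FILE [LINES]
# Hypothetical helper: print the last LINES lines (default 3) of the last
# gzip member in FILE. Assumes GNU grep and GNU coreutils (tac, tail -c +N).
ztail_last_member() {
    local file=$1 lines=${2:-3} off

    # List the byte offset of every occurrence of the gzip signature
    # (magic bytes 1f 8b followed by 08 for deflate), last offset first.
    for off in $(LC_ALL=C grep -aboP '\x1f\x8b\x08' "$file" | cut -d: -f1 | tac); do
        # tail -c +N is 1-based, hence off + 1. Skip chance matches that
        # do not start a valid gzip stream.
        if tail -c +"$((off + 1))" "$file" | gzip -t 2>/dev/null; then
            tail -c +"$((off + 1))" "$file" | gunzip | tail -n "$lines"
            return 0
        fi
    done

    # No candidate decompressed cleanly.
    return 1
}
```

It would be called as ztail_last_member file.gz 3. grep still has to read the whole file to find the signatures, but only the last member gets decompressed. On a file with a single member the only valid candidate is offset 0, so this degrades to decompressing everything, the same as zcat | tail.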
