Shell Script – Read First Line from .gz File Without Decompressing

gzip logs shell-script

I have a huge log file compressed in .gz format, and I want to read only its first line, without decompressing the whole file, to check the date of the oldest log entry in it.

The logs are of the form:

YYYY-MM-DD Log content asnsenfvwen eaifnesinrng
YYYY-MM-DD Log content asnsenfvwen eaifnesinrng
YYYY-MM-DD Log content asnsenfvwen eaifnesinrng

I just want to read the date in the first line, which I would do like this for an uncompressed file:

read logdate otherstuff < logfile
echo "$logdate"

Running zcat on the whole file takes too long.

Best Answer

Piping zcat’s output to head -n 1 decompresses only a small amount of data: always enough to show the first line, but typically no more than a few buffers' worth (96 KiB in my experiments):

zcat logfile.gz | head -n 1

Once head has read one line, it exits, which closes the read end of the pipe. zcat then receives a SIGPIPE the next time it tries to write into the closed pipe, and stops. You can see this by running:

(zcat logfile.gz; echo $? >&2) | head -n 1

This will show that zcat exits with code 141, which indicates that it was stopped by a SIGPIPE (128 + 13, where 13 is the signal number of SIGPIPE).
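
For a self-contained way to reproduce this, the following sketch generates a throwaway compressed file and records the decompressor's exit status. The sample data, the variable names and the use of gzip -dc in place of zcat (the two behave the same for .gz files) are assumptions made for illustration. The file has to be much larger than the pipe buffer, otherwise the decompressor may finish before head exits and no SIGPIPE occurs.

```shell
# Hypothetical sample data: about 23 MB of identical log lines.
sample=$(mktemp)
yes '2024-01-01 Log content' | head -n 1000000 | gzip > "$sample"

# Same idea as above, but the decompressor's exit status goes to a file
# so it can be inspected after the pipeline has finished.
first_line=$({ gzip -dc "$sample"; echo "$?" > "$sample.status"; } | head -n 1)
decompress_status=$(cat "$sample.status")

echo "first line: $first_line"
echo "decompressor exit status: $decompress_status"
rm -f "$sample" "$sample.status"
```

With bash on a typical Linux system this should report the 141 described above. Some shells encode signal exits differently, and the status will also differ if SIGPIPE is ignored in the calling environment.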

You can add more post-processing, e.g. with AWK, to only extract the date:

zcat logfile.gz | awk '{ print $1; exit }'
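
To get the date into a shell variable, as in the read attempt from the question, one option is to feed the first decompressed line to read through a here-document. This is only a sketch: the sample file and its contents are made up for illustration, and gzip -dc stands in for zcat.

```shell
# Hypothetical sample file standing in for the real logfile.gz.
logfile=$(mktemp)
printf '%s\n' '2023-05-01 first entry' '2023-05-02 second entry' | gzip > "$logfile"

# read splits the line on whitespace: the first word goes to logdate,
# the remainder of the line to otherstuff.
read -r logdate otherstuff <<EOF
$(gzip -dc "$logfile" | head -n 1)
EOF

echo "$logdate"
rm -f "$logfile"
```

The here-document keeps this usable in plain POSIX sh. In bash, a process substitution would be another way to do the same thing.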

(On macOS you might need to use gzcat rather than zcat to handle gzipped files.)
