Consider the following input file:
1
2
3
4
Running
{ grep -q 2; cat; } < infile
doesn't print anything. I'd expect it to print
3
4
I can get the expected output if I change it to
{ sed -n 2q; cat; } < infile
Why doesn't the first command print the expected output?
It's a seekable input file and per the standard under OPTIONS:
-q
Quiet. Nothing shall be written to the standard output, regardless of
matching lines. Exit with zero status if an input line is selected.
and further down, under APPLICATION USAGE (emphasis mine):
The -q option provides a means of easily determining whether or not a
pattern (or string) exists in a group of files. When searching several
files, it provides a performance improvement (because it can quit
as soon as it finds the first match)[…]
Now, per the same standard (in Introduction, under INPUT FILES):
When a standard utility reads a seekable input file and terminates
without an error before it reaches end-of-file, the utility shall
ensure that the file offset in the open file description is properly
positioned just past the last byte processed by the utility[…]
tail -n +2 file
(sed -n 1q; cat) < file
...
The second command is equivalent to the first only when the file is
seekable.
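To illustrate the shared file offset the standard describes, here is a minimal sketch. It assumes a temporary copy of the four-line input file and uses dd with a one-byte block size as a stand-in for a utility that reads exactly the bytes it processes; dd and mktemp are not part of the original commands.

```shell
# Recreate the four-line input file in a temporary location.
infile=$(mktemp)
printf '1\n2\n3\n4\n' > "$infile"

# dd with bs=1 count=4 reads exactly 4 bytes ("1\n2\n") through the
# open file description it shares with cat, so the offset is left at
# byte 4 and cat continues from there.
rest=$( { dd bs=1 count=4 >/dev/null 2>&1; cat; } < "$infile" )
printf '%s\n' "$rest"   # prints 3 and 4, one per line

rm -f "$infile"
```

A utility that reads ahead into a buffer and does not seek back before exiting would leave the offset further along, and cat would print less or nothing.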
Why does grep -q consume the whole file?
This is GNU grep, if it matters (though Kusalananda just confirmed the same happens on OpenBSD).
Best Answer
grep does stop early, but it buffers its input, so your test is too short. With a much longer input of numbered lines (and yes, I realise my test is imperfect since it's not seekable), the output of cat starts at 6776 on my system. That matches the 32KiB buffer used by default in GNU grep: the lines 1 through 6775 add up to exactly 32768 bytes.
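A rough sketch of this kind of test follows. The exact commands are an assumption (seq is used here to generate the numbered lines), and the point where cat resumes depends on the grep implementation, its buffer size, and how the pipe delivers data, so it may differ from 6776 or be empty.

```shell
# Feed 10000 numbered lines through a pipe. grep -q matches on an early
# line, but by then it may already have read a whole buffer, so cat only
# sees whatever grep left unread.
first=$(seq 1 10000 | { grep -q 2; cat; } | sed -n 1p)
echo "cat resumed at: ${first:-(nothing left to read)}"

# Count the bytes taken up by the lines 1..6775 (tr strips the padding
# that some wc implementations add).
bytes=$(seq 1 6775 | wc -c | tr -d ' ')
echo "bytes in lines 1..6775: $bytes"
```

The second command shows why 6776 lines up with a 32KiB buffer: the first 6775 lines fill it exactly.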
Note that POSIX only mentions performance improvements when searching several files.
That doesn't set up any expectation of a performance improvement from partially reading a single file.