Grep two conditions, one negated, without a pipeline

Assume we have the following test.txt:

# commented line, no match
# commented line, would match /app/
# again commented, would match app
non commented line, matchin /app
non commented line, no match

I would like to get all lines that contain the word 'app', but not those that are comments (lines starting with #), and I would like the filename to be included in the output.

The trivial grep -H 'app' test.txt obviously matches every line containing app, including those starting with the number/hash character #:

[screenshot: term1.png]

A pipeline with a second grep using the -v, --invert-match option generally messes up the colors. Also, because -H prefixes every line with the filename, the second grep can no longer use a simple negated match for ^# (i.e. a number/hash character at the start of a line). To preserve the colors, I'd have to use a beast like grep -H app test.txt --color=always | grep -v '\[K:.\[m.\[K#':

[screenshot: term2.png]

… but only after doing something like grep -H app test.txt --color=always | hexdump -C so I can see the right combination of characters, which is, to put it mildly, tedious.

Unfortunately, it seems one cannot give the -v option its own (negated) pattern in combination with the -e PATTERN, --regexp=PATTERN option, which can specify multiple search patterns:

:tmp$ grep -H -e 'app' -v '^#' test.txt
grep: ^#: No such file or directory
test.txt:# commented line, no match
test.txt:non commented line, no match
test.txt:

Here, grep interprets '^#' as a filename, not a search pattern, so -v simply inverts the match for app and I get the wrong results. In this example, the expected output is only one line:

test.txt:non commented line, matchin /app

… with properly colored filename, and matches.
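
As a side note on the failed attempt: -v is a single switch that inverts the whole selection, not a prefix for one pattern. A minimal sketch of that behavior, assuming a five-line copy of test.txt (without the trailing empty line) written to a scratch directory; the variable name inverted is only for illustration:

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample file from the question (assumed: no trailing empty line).
printf '%s\n' \
  '# commented line, no match' \
  '# commented line, would match /app/' \
  '# again commented, would match app' \
  'non commented line, matchin /app' \
  'non commented line, no match' > test.txt

# With both patterns given via -e, -v drops every line matching EITHER one,
# so only lines that are neither comments nor contain "app" survive.
inverted=$(grep -H -v -e 'app' -e '^#' test.txt)
printf '%s\n' "$inverted"
```

This prints only test.txt:non commented line, no match, which is the opposite of what is wanted here.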

So, is there a way to achieve this – but without the messy pipeline given above, and simply using ^# as the pattern to be avoided?

Best Answer

grep -E '^([^#].*)?app' ./infiles* /dev/null

I guess the comments already nearly had it anyway, but if you make the leading non-comment part [^#].* optional with ?, then you either get lines that begin with app, or lines that begin with some character other than # and match app later on. Either way, you don't get lines that begin with #. The extra /dev/null argument gives grep more than one file to search, so it prints the filename for each match even without -H.
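
A minimal sketch of the pattern applied to the question's sample data, assuming test.txt is recreated in a scratch directory (the variable name result is only for illustration):

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample file from the question.
printf '%s\n' \
  '# commented line, no match' \
  '# commented line, would match /app/' \
  '# again commented, would match app' \
  'non commented line, matchin /app' \
  'non commented line, no match' > test.txt

# The optional group must start with a non-# character, so a line starting
# with # can never match; /dev/null as a second file forces the filename prefix.
result=$(grep -E '^([^#].*)?app' test.txt /dev/null)
printf '%s\n' "$result"
```

Note that a comment with leading whitespace (for example an indented # line mentioning app) would still match, because its first character is not #.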

Regarding the colors: well... that depends on the grep and the regexp, but a standard GNU grep should highlight the whole match, from the start of the line up to the last app. If you would like something more specific, you can run info grep to see which environment variables GNU grep considers when highlighting and configure them appropriately or, failing a satisfactory result in that vein, highlight it yourself.
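
For the color configuration, GNU grep reads the GREP_COLORS environment variable. A rough sketch, assuming GNU grep; the color values (bold green for matched text via ms, bold blue for the filename via fn) are arbitrary choices for illustration, and the sed step is only there to show that the text underneath the escape sequences is unchanged:

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

printf '%s\n' \
  '# commented line, would match /app/' \
  'non commented line, matchin /app' > test.txt

# ms= colors the matched text, fn= colors the filename prefix (GNU grep).
colored=$(GREP_COLORS='ms=01;32:fn=01;34' \
  grep --color=always -H -E '^([^#].*)?app' test.txt)
printf '%s\n' "$colored"

# Strip the SGR/erase escape sequences again to inspect the plain text.
esc=$(printf '\033')
plain=$(printf '%s\n' "$colored" | sed "s/${esc}\[[0-9;]*[mK]//g")
printf '%s\n' "$plain"
```

Because the match spans from the start of the line to the last app, the ms color covers that whole stretch rather than just the word app.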
