I could see that certain patterns in GNU grep need to be enclosed within quotes and certain others need not be. For example, matching the beginning of a word works only if the pattern is enclosed within quotes.
user@host:~/Desktop$ grep -E '\<H' test
Hello World
user@host:~/Desktop$ grep -E \<H test
[test contains the string Hello World]
But matching the end or beginning of a line works without quotes:
user@host:~/Desktop$ egrep d$ test
Hello World
Why is it so? And what is the rule?
Best Answer
The quotes are interpreted by the shell; they determine what grep sees.

With grep -E '\<H', the characters between the single quotes are passed literally, so grep sees the regex \<H containing the beginning-of-word anchor \<.

With grep -E \<H, the backslash character removes the special meaning of < in the shell, and grep sees the regex <H. You would see matches for a line like <Hello>.

With grep -E <H, the < character would have its special meaning in the shell as a redirection character, so grep would receive the contents of the file called H on its standard input.

With grep 'd$' or grep d\$, the dollar sign is quoted so it reaches grep: the regex is d$, matching a d at the end of a line.

With grep d$ test, the $ sign is not followed by a valid variable name or by valid punctuation (${, $(). When that happens, the shell passes the $ sign literally, so grep again sees the regex d$.

$ is only expanded when it is followed by a valid variable name (even if the variable is undefined; what matters is that a name follows, as in $PATH or $fioejsfoeij, or single-character variables such as $- or $$), or in the constructs ${…}, $(…), $((…)) (also $[…] in bash and zsh, and more constructs in zsh).

The complete rules for shell expansion are far too complex to describe in a post or a dozen. In practice it's enough to remember the usual cases:
\ (backslash) quotes the next character unless it's a newline, and the backslash is always stripped;
'…' (single quotes) quotes every character except ' itself;
"…" (double quotes) quote every character except "$\`, and \ inside double quotes causes the following character to be interpreted literally and is only stripped if the next character was special.