Bash – Use of Quotes in GNU Grep Regular Expressions

bash grep quoting shell

I noticed that some patterns in GNU grep must be enclosed in quotes while others need not be. For example, matching the beginning of a word works only if the pattern is enclosed in quotes.

user@host:~/Desktop$ grep -E '\<H' test
Hello World
user@host:~/Desktop$ grep -E \<H test

[test contains the string Hello World]

But matching the end of a line works without quotes:

user@host:~/Desktop$ egrep d$ test
Hello World

Why is it so? And what is the rule?

Best Answer

The quotes are interpreted by the shell, not by grep; they determine what grep sees.

With grep -E '\<H', the characters between the single quotes are passed literally, so grep sees the regex \<H containing the beginning-of-word anchor \<.

With grep -E \<H, the backslash character removes the special meaning of < in the shell, and grep sees the regex <H. You would see matches for a line like <Hello>.

With grep -E <H test, the < character would have its special meaning in the shell as a redirection operator, so grep would receive the contents of the file called H on its standard input and would take test as the regex.
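The difference between the quoted and unquoted forms can be checked directly. The following is a minimal sketch, assuming GNU grep and a POSIX-style shell; show_args and the temporary sample file are hypothetical helpers introduced only for this illustration. It prints the arguments the shell hands over, then runs both forms against a file containing the lines Hello World and <Hello>.

```shell
# Print each argument exactly as a command would receive it, one per line.
show_args() { printf '[%s]\n' "$@"; }

# Hypothetical sample file: one plain line and one containing a literal "<H".
sample=$(mktemp)
printf '%s\n' 'Hello World' '<Hello>' > "$sample"

quoted_args=$(show_args -E '\<H')   # single quotes: the backslash survives
escaped_args=$(show_args -E \<H)    # bare backslash: the shell strips it

quoted_hits=$(grep -E '\<H' "$sample")   # regex \<H: H at the start of a word
escaped_hits=$(grep -E \<H "$sample")    # regex <H: a literal < followed by H

printf '%s\n' "$quoted_args" "$escaped_args" "$quoted_hits" "$escaped_hits"
rm -f "$sample"
```

With the word anchor, both lines match, because H starts a word in each of them; with the regex <H, only the <Hello> line matches.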

With grep 'd$' or grep d\$, the dollar sign is quoted so it reaches grep: the regex is d$, matching a d at the end of a line.

With grep d$ test, the $ sign is not followed by a valid variable name or by valid punctuation (${, $(). When that happens, the shell passes the $ sign literally, so grep again sees the regex d$. The shell expands $ only when it is followed by a valid variable name (even if the variable is undefined; what matters is that a name follows, as in $PATH or $fioejsfoeij), by a special parameter character (as in $- or $$), or when it starts one of the constructs ${…}, $(…), $((…)) (also $[…] in bash and zsh, and more constructs in zsh).
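To see when the dollar sign survives, it can help to print the resulting words instead of passing them to grep. This is only a sketch: show_args is a hypothetical helper and suffix is a hypothetical variable set just for the demonstration.

```shell
# Print each argument exactly as a command would receive it, one per line.
show_args() { printf '[%s]\n' "$@"; }

suffix=XYZ   # hypothetical variable, set only for this demonstration

# $ with no name or construct after it stays literal, quoted or not.
literal=$(show_args 'd$' d$ d\$)

# $ followed by a name or an opening construct is expanded.
expanded=$(show_args d$suffix "d${suffix}" "d$(printf end)" "d$((1 + 2))")

printf '%s\n' "$literal" "$expanded"
```

All three words in the first group come out as d$, which is why egrep d$ test behaves like egrep 'd$' test. Relying on this is fragile, though: a pattern such as d$x would silently turn into d followed by the value of x, so quoting the regex is the safer habit.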

The complete rules for shell expansion are far too complex to describe in a post or a dozen. In practice it's enough to remember the usual cases:

  • \ (backslash) quotes the next character unless it's a newline, and the backslash is always stripped;
  • '…' (single quotes) quotes every character except ' itself;
  • "…" (double quotes) quote every character except "$\`; inside double quotes, \ makes the following character literal, and the backslash itself is stripped only if that character is one of the special ones.
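The three rules above can be tried side by side. The snippet below is a rough illustration rather than a complete account of shell quoting; show_args and the variable name are hypothetical and exist only for this example.

```shell
# Print each argument exactly as a command would receive it, one per line.
show_args() { printf '[%s]\n' "$@"; }

name=World   # hypothetical variable used to show where expansion happens

# Backslash: quotes the next character and is itself removed.
bs=$(show_args \\ a\ b \$name)

# Single quotes: everything inside is literal, including $, \ and ".
sq=$(show_args '$name \n "x"')

# Double quotes: $ still expands; \ is removed only before $, ", \ or `.
dq=$(show_args "$name \$name \n \"x\" \\")

printf '%s\n' "$bs" "$sq" "$dq"
```

Note that \n keeps its backslash inside double quotes because n is not one of the special characters, whereas \$, \" and \\ each lose theirs.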