Shell – grep and escaping a dollar sign

Tags: grep, quoting, regular expression, shell

I want to know which files have the string $Id$.

grep \$Id\$  my_dir/mylist_of_files

returns 0 occurrences.

I discovered that I have to use

grep \$Id$ my_dir/mylist_of_files

Then I see that the $Id is colored in the output, i.e. it has been matched.

How can I match the second $, and why doesn't \$Id\$ work?

It doesn't matter if the second $ is the last character or not.

I use grep 2.9.


Before posting my question, I searched Google…

I found an answer

To search for a $ (dollar sign) in the file named test2, enter:

grep \\$ test2

The \\ (double backslash) characters are necessary in order
to force the shell to pass a \$ (single backslash, dollar sign) to the
grep command. The \ (single backslash) character tells the grep
command to treat the following character (in this example the $) as a
literal character rather than an expression character. Use the fgrep
command to avoid the necessity of using escape characters such as the
backslash.

but I don't understand why grep \$Id works and why grep \\$Id\\$ doesn't.

I'm a little bit confused…

Best Answer

There are two separate issues here.

  1. grep uses Basic Regular Expressions (BRE) by default, and in a BRE $ is a special character only at the end of an expression. As a result, the two instances of $ in $Id$ are not equivalent: the first is an ordinary character, while the second is an anchor that matches the end of the line. To make the second $ match a literal $ you have to backslash-escape it, i.e. $Id\$. Escaping the first $ also works: \$Id\$, and I prefer this since it looks more consistent.¹

  2. There are two completely unrelated escaping/quoting mechanisms at work here: shell quoting and regex backslash quoting. The problem is that many characters used by regular expressions are special to the shell as well, and on top of that the regex escape character, the backslash, is also a shell quoting character. This is why you often see messes involving double backslashes. I do not recommend using backslashes to shell-quote regular expressions, because the result is not very readable.

    Instead, the simplest way to do this is to first put your entire regex inside single quotes as in 'regex'. The single quote is the strongest form of quoting the shell has, so as long as your regex does not contain single quotes, you no longer have to worry about shell quoting and can focus on pure BRE syntax.
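To see the two quoting layers separately, it can help to print the argument exactly as a command would receive it. The following is only a sketch: show_arg is a hypothetical helper, not a standard command, and the last case assumes that no shell variable named Id is set.

```shell
# Hypothetical helper: prints its first argument exactly as a command receives it.
show_arg() { printf '%s\n' "$1"; }

unset Id   # assumption: no variable named Id exists in this shell

escaped=$(show_arg \$Id\$)     # backslashes are consumed by the shell
quoted=$(show_arg '\$Id\$')    # single quotes pass the backslashes through
doubled=$(show_arg \\$Id\\$)   # \\ becomes \, and $Id expands to nothing

printf 'unquoted, escaped : %s\n' "$escaped"   # $Id$
printf 'single-quoted     : %s\n' "$quoted"    # \$Id\$
printf 'double backslashes: %s\n' "$doubled"   # \\$
```

Under those assumptions, only the single-quoted form hands grep the regex \$Id\$. In the double-backslash form the shell treats $Id as a variable reference, so the Id never reaches grep at all.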

So, applying this back to your original example, let's throw the correct regex (\$Id\$) inside single quotes. The following should do what you want:

grep '\$Id\$' my_dir/my_file

The unquoted \$Id\$ does not work because, after the shell applies quote removal (the precise term for the shell stripping the quoting characters it has interpreted), the regex that grep sees is $Id$. As explained in (1.), this regex matches a literal $Id only at the end of a line, because the first $ is literal while the second is a special anchor character.
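A small experiment can confirm this behaviour. The sketch below assumes a grep that follows the BRE rules described above (GNU grep does) and that mktemp is available. The sample file and its contents are made up for illustration. The last command uses grep -F, the modern spelling of the fgrep mentioned in the question, which treats the pattern as a fixed string.

```shell
# Hypothetical sample file; its contents are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  'header $Id$ trailer' \
  'another line ending in $Id$' \
  'this one ends with $Id' \
  'no keyword here' > "$sample"

# Single-quoted BRE: both dollar signs are escaped, so both are literal.
literal_hits=$(grep -c '\$Id\$' "$sample")

# Unquoted: grep receives $Id$, i.e. a literal $Id anchored at end of line.
anchored_hits=$(grep -c \$Id\$ "$sample")

# Fixed-string search: no regex syntax, so no backslashes are needed.
fixed_hits=$(grep -cF '$Id$' "$sample")

printf 'literal: %s, anchored: %s, fixed: %s\n' \
  "$literal_hits" "$anchored_hits" "$fixed_hits"
rm -f "$sample"
```

With this sample file, the single-quoted and fixed-string searches should each count the two lines containing $Id$, while the unquoted form should count only the line that ends with $Id.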

¹ Note also that if you ever switch to Extended Regular Expressions (ERE), e.g. if you decide to use egrep (or grep -E), the $ character is always special. In an ERE, $Id$ would never match anything because no characters can follow the end of a line, so \$Id\$ would be the only way to go.
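The ERE case from the footnote can be checked the same way. Again this is a sketch that assumes mktemp is available; the sample file is hypothetical.

```shell
# Hypothetical sample file; its contents are invented for this demonstration.
sample=$(mktemp)
printf '%s\n' 'header $Id$ trailer' 'this one ends with $Id' > "$sample"

# ERE, unescaped: the leading $ is an end-of-line anchor followed by more
# text, so no line can match. grep exits with status 1 when nothing matches,
# hence the "|| true".
ere_unescaped=$(grep -cE '$Id$' "$sample") || true

# ERE, escaped: both dollar signs are literal.
ere_escaped=$(grep -cE '\$Id\$' "$sample") || true

printf 'unescaped: %s, escaped: %s\n' "$ere_unescaped" "$ere_escaped"
rm -f "$sample"
```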