What does \? mean in a regular expression?

grep, regular expression

The following command is used to search for a 7-digit phone number:

grep "[[:digit:]]\{3\}[ -]\?[[:digit:]]\{4\}" file

What does \? stand for?

Best Answer

It's like ? in many other regular expression engines, and means "match zero or one of whatever came before it".

In your example, the \? applies to the [ -], so a single space or hyphen may appear between the two groups of digits, but it is optional.

So any of these will match:

555 1234
555-1234
5551234
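
As a quick check, here is a minimal sketch that feeds those three lines, plus one hypothetical non-matching line (555_1234, not from the question), to the same pattern. It assumes GNU grep, since \? is a GNU extension to basic regular expressions; the variable name matches is just for illustration.

```shell
# Pipe four candidate lines into grep; the last one uses an underscore,
# which is neither a space nor a hyphen, so it should be filtered out.
matches=$(printf '%s\n' '555 1234' '555-1234' '5551234' '555_1234' |
  grep "[[:digit:]]\{3\}[ -]\?[[:digit:]]\{4\}")

# Print the lines that matched.
printf '%s\n' "$matches"
```

Note that the pattern is not anchored, so a line containing a longer run of digits would also match.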

The reason it's written as \? rather than ? is for backwards compatibility.

The original version of grep used a different type of regular expression called a "basic regular expression" where ? just meant a literal question mark.

To add zero-or-one matching to basic regular expressions, GNU grep introduced \? as an extension, so that scripts relying on a literal ? kept working as expected.
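
The difference can be seen with a short sketch, again assuming GNU grep in its default (basic) mode. The sample strings and the variable names literal and portable are made up for illustration. The second command uses the interval \{0,1\}, which is the POSIX basic-regular-expression way to say "zero or one" and does not rely on the GNU extension.

```shell
# In a basic regular expression an unescaped ? is an ordinary character,
# so only the line containing a real question mark matches.
literal=$(printf '%s\n' 'a?b' 'ab' | grep 'a?b')
printf '%s\n' "$literal"

# The interval \{0,1\} is equivalent to \? and is defined by POSIX for BREs.
portable=$(printf '%s\n' '555 1234' '5551234' '555_1234' |
  grep "[[:digit:]]\{3\}[ -]\{0,1\}[[:digit:]]\{4\}")
printf '%s\n' "$portable"
```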

Note that grep has an -E option which makes it use the more common type of regular expression, called "extended regular expressions".
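
With -E, the pattern from the question can be written without the backslashes. A minimal sketch, using the same made-up sample lines as input (the variable name ere_matches is hypothetical):

```shell
# Extended regular expression: ?, { and } are operators without backslashes.
ere_matches=$(printf '%s\n' '555 1234' '555-1234' '5551234' '555_1234' |
  grep -E "[[:digit:]]{3}[ -]?[[:digit:]]{4}")

# Print the lines that matched.
printf '%s\n' "$ere_matches"
```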

man 1 grep:

   -E, --extended-regexp
          Interpret PATTERN as an extended regular expression
          (ERE, see below).  (-E is specified by POSIX.)

   -G, --basic-regexp
          Interpret PATTERN as a basic regular expression (BRE, see below).
          This is the default.

...

Repetition
    A regular expression may be followed by one of several repetition operators:
    ?      The preceding item is optional and matched at most once.

...

    grep understands three different versions of regular expression syntax:
    “basic,” “extended” and “perl.”

...

Basic vs Extended Regular Expressions
    In basic regular expressions the meta-characters ?, +, {, |, (, and )
    lose their special meaning; instead use the backslashed versions
    \?, \+, \{, \|, \(, and \).
