Grep caret appears to have no effect

grep, regular expression

I was under the impression that a caret symbol means "beginning of the line" with Extended POSIX regular expressions.

However, when I use it with grep it behaves unexpectedly.

I am using GNU grep 2.5.4 on Ubuntu 10.04 Lucid Lynx.

I echo out a line ' hello', then pipe it to a grep that searches for "zero-or-more white-space characters followed by the letter h":

echo ' hello' | grep -E '[:space:]*h'
 hello

grep finds it ok.

If I add a caret to indicate that I only want the pattern to match "zero-or-more white-space characters followed by the letter h at the beginning of the string":

echo ' hello' | grep -E '^[:space:]*h'

No matches are found. I would expect the string to have matched because it begins with white-space followed by h.

Why does this caret symbol prevent a match?

Best Answer

To match whitespace, [:space:] has to appear inside another pair of brackets: [[:space:]]. A POSIX class name such as [:space:] is only recognized inside a bracket expression. You probably meant grep -E '^[[:space:]]*h'
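As a quick check, the corrected pattern can be run against the line from the question. This is only a sketch; the variable names line and matched are made up for illustration.

```shell
# The line from the question: one leading space, then "hello".
line=' hello'

# [[:space:]] is the POSIX whitespace class nested inside a bracket expression,
# so "^[[:space:]]*h" means: optional whitespace at the start of the line, then h.
matched=$(printf '%s\n' "$line" | grep -E '^[[:space:]]*h')

# Brackets make the leading space visible; prints "[ hello]".
printf '[%s]\n' "$matched"
```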

To explain why your current one fails:

As it stands, [:space:]*h is an ordinary bracket expression that matches any of the characters :, s, p, a, c, and e, repeated any number of times (including zero), followed by h. This matches your string just fine, but if you run grep -o, you'll find that you've only matched the h, not the space.

If you add a caret to the beginning, the line must start with one of those characters or with h. Your line starts with a space, which is neither, so it does not match.
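The explanation above can be checked with grep -o and the exit status. Two assumptions in this sketch: the set is spelled out as [aceps:], which is the same set of characters a bare [:space:] denotes, because as far as I know newer GNU grep releases reject a bare [:space:] with a syntax error instead of accepting it as 2.5.4 does; and the variable names are hypothetical.

```shell
line=' hello'

# Unanchored: -o prints only the matched text. The space is not in the set,
# so the match starts at the h and consists of the h alone.
unanchored=$(printf '%s\n' "$line" | grep -oE '[aceps:]*h')

# Anchored: the line would have to start with one of : s p a c e, or with h.
# It starts with a space, so grep selects nothing and exits with status 1.
anchored_status=0
printf '%s\n' "$line" | grep -qE '^[aceps:]*h' || anchored_status=$?

# With the real whitespace class, -o shows that the space is part of the match.
fixed=$(printf '%s\n' "$line" | grep -oE '^[[:space:]]*h')

printf 'unanchored=[%s] anchored_status=%s fixed=[%s]\n' \
    "$unanchored" "$anchored_status" "$fixed"
```

Putting the output in brackets makes it easy to see whether the leading space was part of the match.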
