I hope I'm just missing the obvious here, but how in the world do I match zero or more spaces with Microsoft Word 2010's "Regex" engine?
As a silly example, I want to match all of the following in a capture group:
cowseat grass
cows eat grass
cows  eat grass
cows   eat grass
cows    eat grass
I would normally do (cows\s*eat grass) and be done with it. But I can't see how to match zero or more spaces. I want to capture the whole phrase in a capture group, but I have a variable number of spaces.
I've been using this document as a reference.
Best Answer
The document you linked to shows that Microsoft's "regular expressions" aren't really regular expressions at all; they're a bizarre hybrid (bastard child, rather) of shell-style globbing (http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm) and true regular expressions.
Since the glob syntax makes use of the * character as a synonym for the regex .*, and Microsoft decided (as mentioned in a comment) to make @ equivalent to the regex quantifier + instead of * (which is stupid since a+ is equivalent to aa* for any atom a, making + unnecessary), it looks like you're out of luck.

My personal opinion is that (1) this is stupid and (2) calling these patterns "regular expressions" is misleading at best, but unfortunately I don't see any way around this except for abandoning Word in favor of a tool that properly supports regex. (Though I suppose in theory you could try to parse the XML-ish format of the docx file itself, extract the text, and then apply your regex....)