The word emmailloter contains much more than i between the bits matched by [a-f] and [^ta]$. The . pattern only ever matches a single character, so if you want to match multiple characters between emma and r at the end, you will have to allow for multiple characters:
emm*[a-f]..*[^ta]$
With grep -E (enabling extended regular expressions), ..* could be written .+, i.e. "match at least one character". The expression ..* reads as "match a character, and then possibly more characters". In the same way, emm* could be replaced by em+, i.e. "e followed by at least one m", if using grep -E.
This would match the string
blop-emmmmmmmmma-blarg-b
     ^^^^^^^^^^^^^^^^^^^
     1111111111233333334
1: emm*
2: [a-f]
3: ..*
4: [^ta]$
(the matching part indicated by the ^ characters above), for example, and also emmailloter:
emmailloter
^^^^^^^^^^^
11123333334
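The equivalence between the basic and the extended form of the expression can be checked with a small sketch like the one below. The word list is a made-up sample (not the dictionary file used in this answer), chosen so that two words match and two are rejected by the final [^ta]$.

```shell
# Hypothetical sample words; "emmena" ends in a and "emmet" ends in t,
# so both are rejected by [^ta]$.
words='emmailloter
flemmard
emmena
emmet'

# Basic regular expressions: "one or more" is spelled ..* and mm*
bre=$(printf '%s\n' "$words" | grep 'emm*[a-f]..*[^ta]$')

# Extended regular expressions: the same thing using +
ere=$(printf '%s\n' "$words" | grep -E 'em+[a-f].+[^ta]$')

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
```

Both commands should list the same words, since the two expressions describe the same set of strings.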
Testing:
$ grep -E 'emm*[a-f].+[^ta]$' MySQLServ
remembré
emmené
emmailloter
flemmard
Note that for the word remembré, the match will be
remembré
 ^^^^^^^
 1123334
not
remembré
   ^^^^^
   11234
because the leftmost possible match is always the one chosen.
One way to visualise the matches using sed:
$ sed -n -E 's/(emm*)([a-f])(.+)([^ta]$)/(\1)(\2)(\3)(\4)/p' MySQLServ
r(em)(e)(mbr)(é)
(emm)(e)(n)(é)
(emm)(a)(illote)(r)
fl(emm)(a)(r)(d)
This will only print matching lines, with each matched part of the regular expression in parentheses. This also assumes that you are using a sed implementation that can match French characters and that the locale environment variables are properly set up for doing that.
Compare this with what your original expression produces:
$ sed -n -E 's/(emm*)([a-f])(.)([^ta]$)/(\1)(\2)(\3)(\4)/p' MySQLServ
rem(em)(b)(r)(é)
(emm)(e)(n)(é)
fl(emm)(a)(r)(d)
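As a rough alternative to the sed commands, the matched portion of each line can also be shown with grep -o. Note that -o is not part of POSIX grep, although both GNU and BSD grep provide it. The sketch below uses a made-up, accent-free word list so that it does not depend on the locale; it is an illustration and not the dictionary file used above.

```shell
# Hypothetical sample words (ASCII only, to avoid locale issues).
words='emmailloter
flemmard
remembrer'

# With .+ there may be any number of characters before the last one.
multi=$(printf '%s\n' "$words" | grep -E -o 'emm*[a-f].+[^ta]$')

# With a single . there must be exactly one character before the last one.
single=$(printf '%s\n' "$words" | grep -E -o 'emm*[a-f].[^ta]$')

printf 'Using .+ :\n%s\n' "$multi"
printf 'Using .  :\n%s\n' "$single"
```

Only the matched part of each line is printed, so the difference between the two expressions shows up as a shorter list for the single-dot variant.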
Best Answer
By default, grep uses Basic Regular Expressions, so you need to escape the braces to make grep match multiple characters.
Alternatively, you can use the -E option to make grep use Extended Regular Expressions (or, with GNU grep, the -P option for Perl Compatible Regular Expressions). In both of these, braces can be used without escaping them.
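The commands that originally accompanied this answer are not preserved here. As a minimal sketch of the difference, assuming a made-up pattern that matches a run of two or three a characters, the two spellings could look like this:

```shell
# Hypothetical input: runs of "a" of length 1 to 4, one per line.
input='a
aa
aaa
aaaa'

# Basic regular expressions: braces must be escaped to act as a repetition count.
bre=$(printf '%s\n' "$input" | grep -x 'a\{2,3\}')

# Extended regular expressions: braces work as a repetition count unescaped.
ere=$(printf '%s\n' "$input" | grep -E -x 'a{2,3}')

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
```

The -x option makes grep match whole lines only, so that longer runs are not accepted merely because they contain a shorter one.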