I am trying to look for lines in a file that have one or more instances of 1234, but no other numbers (non-digit characters are allowed). Any other number should cause the line not to match.
Valid examples:
1234
1234 xxx
xxx 1234
1234 1234
1234 xxx 1234
Invalid examples:
12341234
12345
1234xxx345
1234 345
1234xxx
xxx1234
1234xxx1234
This is what I have used:
grep -E '^([^0-9]*1234)+[^0-9]*$' file.txt
But this command also outputs 12341234 as valid. How do I prevent that?
Best Answer
grep -E '^[^0-9]*1234([^0-9]+1234)*[^0-9]*$' file.txt
Explanation
^[^0-9]*1234 : find the first match of 1234, which may be preceded by non-digit characters.
([^0-9]+1234)* : there may be further iterations of 1234, but these must be separated from the first 1234 (and from each other) by non-digit characters (hence the +).
[^0-9]*$ : match through to the end of the line (with $), so the whole line has to fit the pattern. There may be non-digit characters after the final 1234 (but not necessarily, hence the *).
EDIT
If 1234 must be delimited by spaces (or be at the beginning or end of the line), then use
grep -E '^([^0-9]* )?1234(( [^0-9]*)? 1234)*( [^0-9]*)?$' file.txt
Explanation
^([^0-9]* )? : there may be non-digit characters to start with, as long as they end with a space.
1234 : find the first (required) match of 1234.
(( [^0-9]*)? 1234)* : I'll work through the parentheses backwards. There may be (zero or more) further copies of 1234, but each must be preceded by a space, i.e. ' 1234'. Before this space, there may be non-digit characters, which is fine as long as these are separated from the preceding copy of 1234 by another space, i.e. '( [^0-9]*)?'.
( [^0-9]*)?$ : there may be non-digit characters to end with, as long as they are preceded by a space.
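If you want to check the patterns against the examples from the question, something like the sketch below should do. It assumes a POSIX shell and a grep that supports -E. The variable names (loose, strict, samples) are made up for illustration, and the first pattern is assembled from the pieces explained above.

```shell
#!/bin/sh
# First pattern: copies of 1234 separated by at least one non-digit.
loose='^[^0-9]*1234([^0-9]+1234)*[^0-9]*$'
# Second pattern (from the EDIT): each 1234 must be delimited by spaces
# or sit at the beginning or end of the line.
strict='^([^0-9]* )?1234(( [^0-9]*)? 1234)*( [^0-9]*)?$'

# Sample lines copied from the question (valid ones first, then invalid ones).
samples='1234
1234 xxx
xxx 1234
1234 1234
1234 xxx 1234
12341234
12345
1234xxx345
1234 345
1234xxx
xxx1234
1234xxx1234'

# Keep only the sample lines that each pattern accepts.
loose_out=$(printf '%s\n' "$samples" | grep -E "$loose")
strict_out=$(printf '%s\n' "$samples" | grep -E "$strict")

printf 'first pattern accepts:\n%s\n\n' "$loose_out"
printf 'space-delimited pattern accepts:\n%s\n' "$strict_out"
```

With these samples, the first pattern rejects 12341234 but still accepts 1234xxx, xxx1234 and 1234xxx1234, while the space-delimited pattern accepts only the five valid examples.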