Search for a precise number with grep


I am trying to look for lines in a file that have one or more instances of 1234, but no other numbers (non-digit characters are allowed). Any other number should cause the line not to match.

Valid examples:

  • 1234
  • 1234 xxx
  • xxx 1234
  • 1234 1234
  • 1234 xxx 1234

Invalid examples:

  • 12341234
  • 12345
  • 1234xxx345
  • 1234 345
  • 1234xxx
  • xxx1234
  • 1234xxx1234

This is what I have used:

grep -E '^([^0-9]*1234)+[^0-9]*$' file.txt

But this command also outputs 12341234 as valid. How do I prevent that?

Best Answer

grep -E '^[^0-9]*1234([^0-9]+1234)*[^0-9]*$' file.txt

Explanation

  • ^[^0-9]*1234: find the first match of 1234, which may be preceded by non-digit characters.
  • ([^0-9]+1234)*: there may be further occurrences of 1234, but each must be separated from the preceding 1234 by at least one non-digit character (hence the +). This is what rejects 12341234.
  • [^0-9]*$: match the entire line (with $). There may be non-digit characters after the final 1234 (but not necessarily, hence *).
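
As a quick check, here is a minimal sketch that runs the pattern over sample lines from the question. The variable names samples and matched are only illustrative, and the lines are piped to grep on standard input rather than read from file.txt.

```shell
# Sample lines taken from the question: the first five should match,
# the last four should not.
samples=$(printf '%s\n' \
  '1234' '1234 xxx' 'xxx 1234' '1234 1234' '1234 xxx 1234' \
  '12341234' '12345' '1234xxx345' '1234 345')

# grep reads standard input here instead of file.txt.
matched=$(printf '%s\n' "$samples" | grep -E '^[^0-9]*1234([^0-9]+1234)*[^0-9]*$')

# Print the lines that survived the filter.
printf '%s\n' "$matched"
```

Only the first five lines should be printed; 12341234 is dropped because nothing separates its two copies of 1234.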

EDIT

If 1234 must be delimited by spaces (or be at the beginning or end of the line), as the invalid examples 1234xxx, xxx1234 and 1234xxx1234 in the question imply, then use

grep -E '^([^0-9]* )?1234(( [^0-9]*)? 1234)*( [^0-9]*)?$' file.txt

Explanation

  • ^([^0-9]* )?: there may be non-digit characters to start with, as long as they end with a space.
  • 1234: find the first (required) match of 1234.
  • (( [^0-9]*)? 1234)*: I'll work through the parentheses backwards. There may be (zero or more) further copies of 1234, but each must be preceded by a space, i.e. " 1234". Before this space there may be non-digit characters, which is fine as long as they are separated from the preceding copy of 1234 by another space, i.e. ( [^0-9]*)?.
  • ( [^0-9]*)?$: there may be non-digit characters to end with, as long as they are preceded by a space.
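
To see the difference between the two patterns, the sketch below feeds both of them the three "glued" lines from the question's invalid list plus two valid lines. The names loose, strict, samples, loose_count and strict_matched are made up for this illustration; -c is the standard grep option that prints a count of matching lines.

```shell
# Three glued lines from the question's invalid list, then two valid lines.
samples=$(printf '%s\n' '1234xxx' 'xxx1234' '1234xxx1234' 'xxx 1234' '1234 xxx 1234')

# First pattern from this answer, and the space-delimited one from the edit.
loose='^[^0-9]*1234([^0-9]+1234)*[^0-9]*$'
strict='^([^0-9]* )?1234(( [^0-9]*)? 1234)*( [^0-9]*)?$'

# Count how many of the five lines the first pattern accepts.
loose_count=$(printf '%s\n' "$samples" | grep -Ec "$loose")

# Keep only the lines the space-delimited pattern accepts.
strict_matched=$(printf '%s\n' "$samples" | grep -E "$strict")

printf 'first pattern matched %s lines\n' "$loose_count"
printf '%s\n' "$strict_matched"
```

The first pattern should accept all five lines, while the space-delimited one should keep only xxx 1234 and 1234 xxx 1234.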