Linux – Using Grep to Find Multiple Repeating Characters in a Word

grep linux

I have a word such as "interlinking", where the letters 'in' are repeated three times. How can I search a dictionary.txt file using grep to find other words in which a two-letter sequence repeats three times, such as 'priestesses', which contains 'es' three times?

Best Answer

This calls for backreferences!

Whenever you want to match the same text that an earlier part of the pattern has already matched, use a backreference.

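grep -E '(..)(.*\1){<n - 1>}' <file>
  • -E turns on extended regular expressions, so ( ) and { } work without backslashes (plain grep would treat them as literal characters)
  • (..) matches any two characters and captures them as group 1
  • .* matches any sequence of characters, including none
  • \1 matches whatever the first group matched, in this case the (..) at the beginning
  • {<n - 1>} repeats the preceding (.*\1) group that many times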

Replace <n - 1> with the number of repetitions you want minus one (for three repetitions, use {2}), and <file> with the name of the file you want to search (or omit it to read from stdin).
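
For the three-repeat case in the question, a minimal sketch could look like the following. The file name words.txt and its contents are made up for illustration and stand in for dictionary.txt. Backreferences with -E are an extension that GNU and BSD grep support rather than something POSIX guarantees, so the basic-regex spelling is shown as well.

```shell
# Hypothetical sample word list standing in for dictionary.txt
printf '%s\n' interlinking priestesses banana hello grep > words.txt

# Extended regex: {2} asks for two more copies of the captured pair,
# which makes three occurrences in total
matches=$(grep -E '(..)(.*\1){2}' words.txt)
echo "$matches"

# Same pattern in basic regex syntax: groups and braces need backslashes
matches_bre=$(grep '\(..\)\(.*\1\)\{2\}' words.txt)
echo "$matches_bre"
```

With this sample list, both commands should print interlinking and priestesses. Note that the occurrences cannot overlap, so "banana" ('an' twice, 'na' twice) is not matched.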

This may not be the most efficient solution, but it works.
