Notepad++ – search for the same thing appearing twice in a line


I would like to know if there is a way to search a file for two things on the same line. For example, I want to find lines that contain "variable >=" twice. The problem is that I don't know what "variable" is: there are a lot of different variables in the file, and I am searching for a duplicate check of the same variable on one line.

Can anyone help me with this?

Best Answer

You have a couple of options...

Either way, before starting:

  • open the "Find" dialog (Ctrl+F), or "Replace" (if you know what you want to do next),
  • select the "Regular expression" radio button in the bottom-left of the dialog.
  • Here, I will assume you're looking for duplicates of patterns like variable >= something or hour >= NUM.
  • I will also group as much as possible, so that you can later replace while keeping, or throwing away, any part as needed.

(1) Explicit Find

If you know which name is duplicated, you can search for it explicitly, e.g.:

((variable)\s*>=\s*\S+)(.*)(\2\s*>=\s*\S+)

Or, for "hour", just replace the word "variable" with the word "hour":

((hour)\s*>=\s*\S+)(.*)(\2\s*>=\s*\S+)

Explanation:

Every set of parentheses, counted by its opening parenthesis from left to right, is a group. Therefore, you'll have the following:

Group 1: ((variable)\s*>=\s*\S+) : Finds a string that begins with "variable", followed by \s* (\s is any whitespace character and * means any number of them, so both "variable>=" and "variable >=" match), then the characters >=, then more \s* (optional whitespace), and finally \S+, one or more non-whitespace characters (the + says there must be at least one).

Group 2: (variable) : Group 2 is within group 1, and it's just a way to extract the name "variable".

Group 3: (.*) : ANYthing between the two duplicates you'll find. This allows you to do something with this extra text, if it exists.
WARNING: if there are triplicates (or more), this will consume the patterns in the middle, so that group 1 and group 4 contain ONLY the first and last duplicates. If you want to find consecutive duplicates, change this part to (.*?); the ? makes it non-greedy, i.e. it matches as little as possible.

Group 4: (\2\s*>=\s*\S+) : Finally, this is the duplicate. It is a duplicate because the pattern is the same as group 1, except that it uses \2, a backreference that matches whatever text group 2 captured. In this case that is the word "variable".

The second pattern for "hour" as you'll see is identical, except it looks for "hour" rather than "variable".
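To see the groups and the greedy versus non-greedy behaviour outside Notepad++, here is a minimal Python sketch. It assumes Python's re module, which accepts the same syntax for this particular pattern; the sample line is made up for illustration.

```python
import re

# Hypothetical sample line with three ">=" checks on the same name.
line = "if variable >= 1 and variable >= 2 and variable >= 3"

# Same pattern as in the answer; only group 3 differs: (.*) vs (.*?).
greedy = re.compile(r"((variable)\s*>=\s*\S+)(.*)(\2\s*>=\s*\S+)")
lazy = re.compile(r"((variable)\s*>=\s*\S+)(.*?)(\2\s*>=\s*\S+)")

greedy_match = greedy.search(line)
lazy_match = lazy.search(line)

# Greedy: group 3 swallows the middle check, group 4 is the LAST one.
print(greedy_match.group(1), "|", greedy_match.group(4))

# Non-greedy: group 4 is the NEXT duplicate after group 1.
print(lazy_match.group(1), "|", lazy_match.group(4))
```

With the greedy version, the middle check ends up inside group 3; with the non-greedy version, group 3 is just the text " and " between the first two checks.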

(2) Find Unknown Duplicate Patterns

With a slight modification, you can search for ANY duplicates of the same pattern:

((\w+)\s*>=\s*\S+)(.*)(\2\s*>=\s*\S+)

Explanation:

This is identical to finding duplicates with explicitly known names. The difference here is the use of \w+ (any alphanumeric word), instead of a word like "variable"/"hour".

\w+ : \w matches any word character (uppercase and lowercase letters, digits and the underscore, but not punctuation or other symbols). The + again means at least one. Therefore, \w+ matches any run of one or more word characters, such as a variable name.
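The same search can be tried on a list of lines with a short Python sketch. This is only an illustration under the assumption that Python's re module treats this pattern the same way; the function name find_duplicate_checks and the sample lines are hypothetical.

```python
import re

# Pattern from the answer: group 2 captures the unknown name,
# and \2 then demands the same name again later in the line.
DUPLICATE_CHECK = re.compile(r"((\w+)\s*>=\s*\S+)(.*)(\2\s*>=\s*\S+)")


def find_duplicate_checks(lines):
    """Return (line_number, name) for each line that repeats a '>=' check."""
    hits = []
    for number, text in enumerate(lines, start=1):
        match = DUPLICATE_CHECK.search(text)
        if match:
            hits.append((number, match.group(2)))
    return hits


# Hypothetical sample input.
sample = [
    "if hour >= 8 and hour >= 12",
    "if hour >= 8 and minute >= 30",
    "if count>=1 or count >= 5",
]
print(find_duplicate_checks(sample))
```

One caveat: because neither (\w+) nor \2 is anchored to a word boundary, a short name can match the tail of a longer one (for example x and max). If that causes false hits, putting \b in front of both (\w+) and \2 is one possible adjustment; it is not part of the pattern above.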
