Regex – How to Remove Extra Spaces Between Tags in Notepad++

I want to collapse every run of two or more spaces between specific tags into a single space.

For example:

<p class="text_obisnuit"> The context of articles, stories, and conversations helps you figure out and understand the meaning of English words in the text that are new to you. </p>

My desired output:

<p class="text_obisnuit">The context of articles, stories, and conversations helps you figure out and understand the meaning of English words in the text that are new to you.</p>

I tried the following, but it did not work:

(?<=<p class="text_obisnuit">)\s*|\s*(?=</p>)

Best Answer

This collapses each run of two or more spaces into one, but only between <p class="text_obisnuit"> and </p>; multiple spaces anywhere else are left untouched.

  • Ctrl+H
  • Find what: (?:<p class="text_obisnuit">|\G)(?:(?!</p>).)*?\s\K\s+
  • Replace with: LEAVE EMPTY
  • check Wrap around
  • check Regular expression
  • Leave . matches newline unchecked if each paragraph sits on a single line; check it if a paragraph can span multiple lines.
  • Replace all

Explanation:

(?:                         # start non capture group
  <p class="text_obisnuit"> # literally
 |                          # OR
  \G                        # restart from position of last match
)                           # end group
(?:                         # start non capture group
  (?!</p>)                  # negative lookahead, make sure we haven't reached </p>
  .                         # any character
)*?                         # group may appear 0 or more times, not greedy
\s                          # a whitespace character
\K                          # forget all we have seen until this position
\s+                         # 1 or more whitespace characters

Given text:

other     text

<p class="text_obisnuit">  The context of articles,   stories, and conversations helps you     figure out and understand the meaning   of English words in the text that are new to you.   </p>

other    text

Result for given example:

other     text

<p class="text_obisnuit"> The context of articles, stories, and conversations helps you figure out and understand the meaning of English words in the text that are new to you. </p>

other    text

Note: this keeps a single space just after <p ...> and just before </p>.
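
Outside Notepad++, the same first step can be approximated in Python. This is only a sketch, not a translation of the regex above: Python's built-in re module supports neither \G nor \K, so it matches each whole paragraph and collapses the whitespace runs in a callback. The names PARAGRAPH and collapse_spaces_in_paragraphs are made up for this illustration, and it assumes each paragraph sits on one line (the equivalent of leaving . matches newline unchecked).

```python
import re

# One whole paragraph: opening tag, body, closing tag.
# Without re.DOTALL the dot does not cross line breaks; pass re.DOTALL
# here if a paragraph may span several lines.
PARAGRAPH = re.compile(r'(<p class="text_obisnuit">)(.*?)(</p>)')


def collapse_spaces_in_paragraphs(text: str) -> str:
    """Collapse whitespace runs, but only inside the targeted <p> tags."""

    def fix(match: re.Match) -> str:
        opening, body, closing = match.groups()
        # Keep the first whitespace character of each run and drop the
        # rest, which is what \s\K\s+ with an empty replacement does.
        body = re.sub(r"(\s)\s+", r"\1", body)
        return opening + body + closing

    return PARAGRAPH.sub(fix, text)


if __name__ == "__main__":
    sample = 'a   b <p class="text_obisnuit">  x   y  </p> c   d'
    print(collapse_spaces_in_paragraphs(sample))
```

As with the Notepad++ version, one space survives right after the opening tag and right before the closing tag, and text outside the paragraph is not touched.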


If you want to remove these spaces, you have to run another regex:

  • Ctrl+H
  • Find what: (?<=<p class="text_obisnuit">)\s+|\s+(?=</p>)
  • Replace with: LEAVE EMPTY
  • UNcheck Match case
  • check Wrap around
  • check Regular expression
  • Replace all

Explanation:

(?<=                        # start positive lookbehind, make sure we are preceded by
  <p class="text_obisnuit"> # literally
)                           # end lookbehind
\s+                         # 1 or more whitespace characters
|                           # OR
\s+                         # 1 or more whitespace characters
(?=                         # start positive lookahead
  </p>                      # literally
)                           # end lookahead

Result for given example:

other     text

<p class="text_obisnuit">The context of articles, stories, and conversations helps you figure out and understand the meaning of English words in the text that are new to you.</p>

other    text
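
This second regex uses only a fixed-width lookbehind and a lookahead, so it should also work unchanged in Python's re module. A minimal sketch follows; the names EDGE_WHITESPACE and trim_paragraph_edges are made up for this illustration, and re.IGNORECASE stands in for leaving Match case unchecked.

```python
import re

# Whitespace right after the opening tag, or right before a closing </p>.
EDGE_WHITESPACE = re.compile(
    r'(?<=<p class="text_obisnuit">)\s+|\s+(?=</p>)',
    re.IGNORECASE,  # mirrors an unchecked "Match case" box
)


def trim_paragraph_edges(text: str) -> str:
    """Remove whitespace touching the inner side of the paragraph tags."""
    return EDGE_WHITESPACE.sub("", text)


if __name__ == "__main__":
    sample = 'a   b <p class="text_obisnuit"> x y </p> c   d'
    print(trim_paragraph_edges(sample))
```

Note that the second alternative, \s+(?=</p>), strips whitespace before every </p> in the text, not only before those closing a text_obisnuit paragraph. The same is true of the Notepad++ version.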