Word – Regular Expression for Find and Replace in Microsoft Word

microsoft-word regex

I want to remove leading and trailing tags from country names.
In my example those tags are <li> and <a>.

<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>
<li><a href="http://albanie.makaan.com/">Albanie</a></li>
<li><a href="http://algérie.makaan.com/">Algérie</a></li>

Result should be:

Afghanistan
Albanie
Algérie

In Microsoft Word, I want to use the Find and Replace feature to do this with a regular expression.

How can I use regular expressions in MS Word?

Best Answer

Instead of copying your input text to Word, copy it to Notepad++ or any other editor with full RegEx support.

A RegEx string that selects everything outside of the tags, i.e. everything between a > and a < sign, would be:

(?<=>).*?(?=<)


  • (?<=>) is a lookbehind. It requires a > sign immediately before the match and acts as an anchor, but the > itself is not part of the match. This is important since you don't want >Afghanistan
  • .*? matches any characters with a lazy quantifier, so it selects as little as possible, stopping as soon as the next part of the expression can match
  • (?=<) is a lookahead. It requires a < sign immediately after the match but excludes the sign itself, just like the lookbehind
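For anyone who would rather script this than use an editor, here is a minimal Python sketch of the same lookbehind/lookahead idea. It is not the exact expression from above: it swaps .*? for [^<>]+, because the original also matches the empty gap between adjacent tags such as <li><a. The names TEXT_BETWEEN_TAGS and extract_text are made up for this illustration.

```python
import re

lines = [
    '<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>',
    '<li><a href="http://albanie.makaan.com/">Albanie</a></li>',
    '<li><a href="http://algérie.makaan.com/">Algérie</a></li>',
]

# Variant of the answer's pattern (an assumption of this sketch):
# [^<>]+ needs at least one non-bracket character, so the empty gap
# between "<li>" and "<a ...>" is skipped.
TEXT_BETWEEN_TAGS = re.compile(r'(?<=>)[^<>]+(?=<)')


def extract_text(line):
    """Return every run of text that sits between a '>' and a '<'."""
    return TEXT_BETWEEN_TAGS.findall(line)


# Process line by line so the newline between two lines is never matched.
names = [name for line in lines for name in extract_text(line)]
print(names)  # ['Afghanistan', 'Albanie', 'Algérie']
```

The > and < signs are checked by the lookarounds but never end up in the result, which is the point of the bullets above.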

But you don't want to select the country names. You want to remove every tag, so you need the opposite of the first regular expression. Something like

<.*?>


  1. Open the Notepad++ Replace dialog (Ctrl+H)
  2. Under Search Mode, select Regular expression
  3. Find what: <.*?>
  4. Replace with: leave the field empty, then click Replace All
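
The same find-and-replace can be sketched in Python with re.sub, in case a script is more convenient than an editor. This assumes simple input like the question's, where no tag spans several lines and no attribute value contains a > sign. The function name strip_tags is made up for this example.

```python
import re

html = """<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>
<li><a href="http://albanie.makaan.com/">Albanie</a></li>
<li><a href="http://algérie.makaan.com/">Algérie</a></li>"""


def strip_tags(text):
    """Replace every <...> tag with nothing, like the steps above."""
    # The lazy .*? stops at the first '>' so each tag is removed on its own.
    return re.sub(r'<.*?>', '', text)


print(strip_tags(html))
```

For real-world HTML a proper parser is the safer choice, but for a flat list like this one the regex is enough.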