I want to remove leading and trailing tags from country names.
In my example those tags are <li> and <a>.
<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>
<li><a href="http://albanie.makaan.com/">Albanie</a></li>
<li><a href="http://algérie.makaan.com/">Algérie</a></li>
Result should be:
Afghanistan
Albanie
Algérie
In Microsoft Word, I want to use the Find and Replace feature to accomplish this with a regular expression.
How can I use regular expressions in MS Word?
Best Answer
Instead of copying your input text to Word, copy it to Notepad++ or any other editor with full RegEx support.
A RegEx string to select everything outside of tags, that is, everything between > and < signs, would be
(?<=>).*?(?=<)
(?<=>) is a lookbehind. It looks for a > sign and acts as an anchor. This way the > itself is excluded from the match, which is important since you don't want >Afghanistan
.*? is a lazy quantifier and selects everything up to the very next expression
(?=<) is a lookahead. It looks for a < sign but, just like the lookbehind, excludes the sign itself from the match
But you don't want to select the country names. You want to remove every tag. You need the opposite of the first regular expression. Something like
<.*?>