Background
Consider the following text:
There are three types of font families: serif, sans serif, and
teletype. To switch between these families, use <cmd>rm</cmd> for
serif, <cmd>ss</cmd> for sans serif, and <cmd>tt</cmd> for teletype.
I would like to change <cmd>x</cmd> to {{cmd|x}}, as follows:
There are three types of font families: serif, sans serif, and
teletype. To switch between these families, use {{cmd|rm}} for
serif, {{cmd|ss}} for sans serif, and {{cmd|tt}} for teletype.
Problem
The regular expression for non-greedy matches is tricky. For example, the following does not work in vim:
:%s/<cmd>\(.*\)<\/cmd>.\{-}/{{cmd|\1}}/
Nor does the following, with sed:
sed -e "/(<cmd>\(.*\)</cmd>).\{-}/{{cmd|\1}}/"
The parentheses try to match literal parentheses rather than group the expression so that a non-greedy operator, either \{-} or ?, can be applied to it. Escaping the parentheses creates a group for backreferences, which is only required for the text content inside the <cmd> tag.
Question
What is the correct syntax to non-greedily replace all occurrences of <cmd>x</cmd> with {{cmd|x}} in a file?
Note: This is not an attempt to parse HTML using regex. 😉
Best Answer
I tried this in Vim:
%s/<cmd>\(.\{-}\)<\/cmd>/{{cmd|\1}}/g
and it converts your demo text to the target text shown in your question.
Your first regular expression in Vim is really close to solving your puzzle, but the .\{-} is not in the correct place: it has to take the place of .* inside the group instead of following the closing tag.
I got the hint from this answer: https://stackoverflow.com/questions/1305853/how-can-i-make-my-match-non-greedy-in-vim
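For the sed half of the question, here is a minimal sketch rather than a definitive recipe. The POSIX regular expressions that sed uses have no non-greedy quantifier, so a common workaround is a negated character class: [^<]* stops at the first "<". This assumes the text inside a <cmd> tag never contains a "<" itself. The function names greedy_convert and convert_cmd_tags are made up for illustration.

```shell
# Hypothetical helper: the greedy attempt, kept only to show the failure.
# .* runs to the LAST </cmd> on the line, so several tags collapse into one.
greedy_convert() {
  sed -e 's/<cmd>\(.*\)<\/cmd>/{{cmd|\1}}/g'
}

# Hypothetical helper: rewrite every <cmd>x</cmd> on stdin as {{cmd|x}}.
# [^<]* cannot cross a "<", which stands in for a non-greedy match.
# Assumption: the tag content itself never contains "<".
convert_cmd_tags() {
  sed -e 's/<cmd>\([^<]*\)<\/cmd>/{{cmd|\1}}/g'
}

line='use <cmd>rm</cmd> for serif, <cmd>ss</cmd> for sans serif'

printf '%s\n' "$line" | greedy_convert
# prints: use {{cmd|rm</cmd> for serif, <cmd>ss}} for sans serif

printf '%s\n' "$line" | convert_cmd_tags
# prints: use {{cmd|rm}} for serif, {{cmd|ss}} for sans serif
```

To run it over a file, redirect the file into the function, for example convert_cmd_tags < input.txt > output.txt, where both file names are placeholders.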