Using sed to find and replace a complex string (preferably with regex)

quoting, regular expression, sed

I have a file with the following contents:

<username><![CDATA[name]]></username>
<password><![CDATA[password]]></password>
<dbname><![CDATA[name]]></dbname>

and I need to write a script that changes the "name" on the first line to "something", the "password" on the second line to "somethingelse", and the "name" on the third line to "somethingdifferent". I can't rely on the order in which these lines occur in the file, so I can't simply replace the first occurrence of "name" with "something" and the second occurrence with "somethingdifferent". I need to match on the surrounding strings to make sure I'm finding and replacing the correct thing.

So far I have tried this command to find and replace the first "name" occurrence:

sed -i "s/<username><![CDATA[name]]><\/username>/something/g" file.xml

However, it's not working, so I suspect some of these characters need escaping.

Ideally, I'd love to be able to use a regex that matches the opening and closing "username" tags and replaces only the "name" between them. Something like this, but with sed:

<username>.+?(name).+?</username>

and replace the contents of the parentheses with "something".

Is this possible?

Best Answer

sed -i -E "s/(<username>.+)name(.+<\/username>)/\1something\2/" file.xml

This is, I think, what you're looking for.

Explanation:

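  • Parentheses in the pattern (the first part of the s command) define capture groups: the text each one matches is saved and can be reused in the replacement (the second part).
  • \1, \2, etc. in the replacement refer to the text captured by the i-th group (numbering starts at 1), so \1something\2 puts back everything around "name" and swaps only "name" itself.
  • -E enables extended regular expressions, which are needed here so that + and unescaped parentheses act as a quantifier and as grouping.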
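As for why the original attempt matched nothing: an unescaped [ starts a bracket expression in sed, so [CDATA[name] is read as "one character from this set" rather than as literal text, and inside double quotes an interactive bash may also try history expansion on the "!". Below is a minimal sketch, not a definitive solution, that anchors on the literal CDATA wrapper and handles all three tags regardless of their order. It assumes each element sits on a single line, as in the question; the sample file is recreated in a temporary file purely for illustration, and the result is written via a temporary copy instead of -i so the sketch does not depend on a particular sed's -i syntax.

```shell
#!/bin/sh
# Illustrative sample file (assumption: same layout as in the question,
# with the lines deliberately in a different order).
file=$(mktemp)
cat > "$file" <<'EOF'
<dbname><![CDATA[name]]></dbname>
<password><![CDATA[password]]></password>
<username><![CDATA[name]]></username>
EOF

# Single quotes stop the shell from interpreting "!" or backslashes.
# \[ matches a literal "[". A "]" outside a bracket expression is already literal.
# [^]]* matches the old value: any run of characters other than "]".
# "#" is the s delimiter, so the "/" in the closing tags needs no escaping.
sed -E \
  -e 's#(<username><!\[CDATA\[)[^]]*(]]></username>)#\1something\2#' \
  -e 's#(<password><!\[CDATA\[)[^]]*(]]></password>)#\1somethingelse\2#' \
  -e 's#(<dbname><!\[CDATA\[)[^]]*(]]></dbname>)#\1somethingdifferent\2#' \
  "$file" > "$file.new" && mv "$file.new" "$file"

cat "$file"
```

Because each expression names its own tag, the two "name" values are told apart by their surrounding elements rather than by position. Matching the old value with [^]]* instead of the literal "name" also means the script keeps working if the current value is something else. With GNU sed, the same three -e expressions can be used with -i -E directly on file.xml, as in the command above.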