Sed/awk replace a specific pattern under another pattern


I have a file like this:

[ABC]
value1=bla
value2=bla
value3=bla
[XYZ]
value1=bla
value2=bla
value3=bla

I would like to replace the value1 line under the [ABC] block with "value1=notbla", while leaving the [XYZ] block untouched. I've already tried:

sed '/ABC/{n;s/.*/change/}'  file

But that only changes the line immediately after the match, so it would not find the specific pattern (for example, if value1 came after value2).
What sed or awk command could I use if I don't know the specific line numbers?
And could you please explain the function of each sed or awk option used? I would really appreciate it.

Best Answer

Sed can handle this quite easily. It's a single "substitute" command, prefixed with an address range. I've added extra spacing for better readability:

sed -e '/^\[ABC\]$/ , /^\[.*\]$/     s/^\(value1=\).*$/\1notbla/'

Without the extra spacing, it's:

sed -e '/^\[ABC\]$/,/^\[.*\]$/s/^\(value1=\).*$/\1notbla/'

You don't strictly need fully anchored regexes, but they are safer with unusual input (for example, a value that happens to contain the text [ABC]). A slightly shorter version with fewer anchors is:

sed -e '/\[ABC\]/,/^\[/s/^\(value1=\).*$/\1notbla/'
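As a quick check, the shortest command can be run against a copy of the sample from the question. This is only a minimal sketch: the temporary file created with mktemp and the variable names sample and result are illustrative choices, not part of the original commands.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' '[ABC]' 'value1=bla' 'value2=bla' 'value3=bla' \
              '[XYZ]' 'value1=bla' 'value2=bla' 'value3=bla' > "$sample"

# Apply the substitution only between [ABC] and the next section header.
result=$(sed -e '/\[ABC\]/,/^\[/s/^\(value1=\).*$/\1notbla/' "$sample")
printf '%s\n' "$result"

# Clean up the temporary file.
rm -f "$sample"
```

Only the value1 line under [ABC] should come out as value1=notbla; the one under [XYZ] stays value1=bla. Sed writes the result to standard output and leaves the file itself unchanged, so redirect the output to a new file if you want to keep it.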

Explanation:

You asked for each flag or option to be explained, and I've got the time, so here you go. I'm explaining the final (shortest) version out of the three Sed commands listed above.

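The first part of the line is an address range: /startregex/,/stopregex/. The substitute command that follows the address range is applied only to lines from the first line matching startregex through the next line matching stopregex (inclusive). The search for stopregex begins on the line after the start line, so the [ABC] header cannot end its own range.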

In this case the start regex is /\[ABC\]/. Square brackets are special characters within a regex (they delimit a character class), so we put a backslash before each to signify literal square bracket characters.

The stop regex is /^\[/, which uses the special regex character ^ to signify the start of a line. This pattern will match any line that starts with a literal left square bracket ([).
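To see which lines the address range selects, it can help to print the range by itself before attaching a substitute command. The sketch below assumes the same sample data as the question; the temporary file and the variable name selected are only for illustration.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' '[ABC]' 'value1=bla' 'value2=bla' 'value3=bla' \
              '[XYZ]' 'value1=bla' 'value2=bla' 'value3=bla' > "$sample"

# -n suppresses sed's automatic printing; the p command then prints
# only the lines that fall inside the address range.
selected=$(sed -n '/\[ABC\]/,/^\[/p' "$sample")
printf '%s\n' "$selected"

# Clean up the temporary file.
rm -f "$sample"
```

The printed range runs from [ABC] through the [XYZ] header, inclusive. Having the [XYZ] line inside the range is harmless here, because the substitute command only changes lines that start with value1=.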

The substitute command is quite simple; the general format is s/findregex/replacetext/. It can also take flags after the final / to modify its behavior (such as g, to replace every match on a line instead of just the first), but I'm not using any flags here.

The "find regex" is ^\(value1=\).*$.

The caret (^) matches the start of the line, as mentioned earlier, and the dollar sign ($) matches the end of the line. So this whole pattern must match an entire line, not merely part of one.

The parentheses (()), unlike square brackets, are non-special by default in the basic regular expressions that Sed uses, so we put backslashes before them to give them their special meaning. They allow parts of the matched text (the text matched by the "find regex") to be used in the replacement text. Specifically, the \1 in the replacement text means, "The text matched within the first set of parentheses in the regex." In this case, that is always just "value1=".
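The capture group can be tried on single lines to see what \1 does. This is a small sketch with made-up variable names; the last command assumes a sed that supports the -E option for extended regular expressions (GNU and BSD sed both do), where parentheses are special without backslashes.

```shell
# \(value1=\) captures the key; \1 re-inserts it in front of the new value.
changed=$(printf '%s\n' 'value1=bla' | sed 's/^\(value1=\).*$/\1notbla/')

# A line that does not start with value1= is passed through untouched.
untouched=$(printf '%s\n' 'value2=bla' | sed 's/^\(value1=\).*$/\1notbla/')

# Same substitution with extended regexes: plain ( ) form the group.
# Assumes the installed sed accepts -E.
extended=$(printf '%s\n' 'value1=bla' | sed -E 's/^(value1=).*$/\1notbla/')

printf '%s\n' "$changed" "$untouched" "$extended"
```

The first and third commands should both print value1=notbla, and the second should print value2=bla unchanged.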

The final element in the "find regex" is .*. The dot (.) means "any single character," and the asterisk (*) means "any number of times (zero or more)." So the dot star (.*) matches the entire rest of the line, after the equals sign.

"notbla" in the replacement text is just static text, nothing special about it.


To really learn Sed properly, I highly recommend the Grymoire Sed tutorial, which is free online.
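Since the question also asks about awk, here is one possible awk equivalent, offered as a sketch and not as the only way to do it. The variable name in_abc and the temporary sample file are my own illustrative choices. The idea is to remember which section the current line belongs to and rewrite value1 only while inside [ABC].

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' '[ABC]' 'value1=bla' 'value2=bla' 'value3=bla' \
              '[XYZ]' 'value1=bla' 'value2=bla' 'value3=bla' > "$sample"

awk_result=$(awk '
  # On every section header, record whether it is exactly [ABC].
  /^\[/ { in_abc = ($0 == "[ABC]") }

  # Inside the [ABC] section, replace the whole value1 line.
  in_abc && /^value1=/ { $0 = "value1=notbla" }

  # Print every line, changed or not.
  { print }
' "$sample")
printf '%s\n' "$awk_result"

# Clean up the temporary file.
rm -f "$sample"
```

Unlike the sed address range, this version tracks the section explicitly, so it keeps working the same way if [ABC] is the last section in the file or appears more than once.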
