Regex & Sed/Perl: Match word that ISN’T preceded by another word

perl, regular expression, sed

I'd like to use sed or perl to replace all occurrences of a word that doesn't have a certain word in front of it.

For example, I have a text file that contains a plot of a movie and I want to replace all occurrences of a character's last name with their first name, but only if their first name doesn't come immediately before their last name.

Sample text might look like this:

John Smith and Jane Johnson talk about Smith's car.

I want it to look like this:

John Smith and Jane Johnson talk about John's car.

If I just do sed 's/Smith/John/g' file, then I would have:

John John and Jane Johnson talk about John's car.

The first name that comes before the last name will always be the same. I don't have to deal with John Smith and Frank Smith. I just need a way to match Smith that doesn't have John preceding it.

Best Answer

This is easy in any language whose regular expressions support lookbehind. Of course, Perl is first on the list:

perl -pe 's/(?<!John\W)Smith/John/g' <<< "John Smith and Jane Johnson talk about Smith's car."

The weak point is having more than one non-word character between “John” and “Smith”: the lookbehind checks only a single \W, so such an occurrence of “Smith” would still be replaced. Unfortunately, a quantifier such as + on \W raises a “Variable length lookbehind not implemented” error.
