I'm looking for a way to remove one specific line from a bunch of files, but only if it occurs more than once in that file. Other lines should be kept, even if they are duplicates.
For example, in a file like this, I would like to remove the duplicates of AAA:
AAA
BBB
AAA
BBB
CCC
should become
AAA
BBB
BBB
CCC
I guess I should use sed, but I have no idea how to write the command.
Best Answer
With GNU sed:
That is, let everything through (b branches off like a continue) up to the first AAA (from the 0th line (that is, even before the first line) to the first one matching /^AAA$/ (which could be the first line)), and then for the remaining lines, delete every occurrence of AAA (an empty // pattern reuses the last pattern).
GNU sed is needed for the 0 address (and the ability to have other commands after the b one in the same expression, though that could be easily worked around in other implementations by using two -e expressions).
With awk:
(or for a regexp pattern: awk '!/^AAA$/ || !n++')
a shorthand for:
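A minimal sketch of the commands discussed above, run against the sample from the question. The sed line and the longhand awk line are reconstructed from the explanation rather than quoted, so their exact spelling is an assumption; the variable name sample is only for illustration.

```shell
# Sample input from the question (the variable name is illustrative).
sample=$(printf 'AAA\nBBB\nAAA\nBBB\nCCC\n')

# GNU sed (assumed spelling): pass everything through the first AAA
# unchanged, then delete every later line that matches the same pattern.
# The 0 address is a GNU extension.
printf '%s\n' "$sample" | sed '0,/^AAA$/b; //d'

# awk, fixed-string shorthand: print the line unless it is AAA and an
# AAA has already been seen (n counts the AAA lines encountered so far).
printf '%s\n' "$sample" | awk '$0 != "AAA" || !n++'

# awk, longhand spelling of the same condition (assumed form).
printf '%s\n' "$sample" | awk '!($0 == "AAA" && n++ > 0) { print }'
```

Each of the three commands should print AAA, BBB, BBB, CCC for this input. They all write to standard output, so applying them to a bunch of files would still need a loop or an in-place mechanism, which is not shown here.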