I know that this has been asked before, but this case is slightly different: I need to remove all comments, except where the # is escaped or otherwise not meant to start a comment (for example, when it sits between single or double quotes).
Starting with the following text:
test
# comment
comment on midline # comment
escaped hash "\# this is an escaped hash"
escaped hash "\\# this is not a comment"
not a comment "# this is not a comment - double apices"
not a comment '# this is not a comment - single apices'
this is a comment \\# this is a comment
this is not a comment \# this is not a comment
I would like to obtain:
test
comment on midline
escaped hash "\# this is an escaped hash"
escaped hash "\\# this is not a comment"
not a comment "# this is not a comment - double apices"
not a comment '# this is not a comment - single apices'
this is a comment \\
this is not a comment \# this is not a comment
I tried
grep -o '^[^#]*' file
but this also deletes escaped hashes.
NOTE: the text I'm working on does have escaped # (\#) but it lacks double-escaped # (\\#), so it does not matter to me whether they are kept or not. I guess it's neater to delete them since, as a matter of fact, the hash is not escaped.
Best Answer
With sed you could delete the lines that start with a # (preceded by zero or more blanks) and remove all strings starting with # that don't follow a single backslash (and only if they are not in between quotes¹):
¹ This solution assumes a single pair of quotes on a line.
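One way to write out the steps just described is sketched below. It is only an illustration of the approach, not necessarily the answer's original command. The function name strip_comments is made up for the example, and the sketch assumes a sed that accepts -E for extended regular expressions (both GNU and BSD sed do). Like the answer, it assumes at most one pair of quotes per line.

```shell
# strip_comments: hypothetical helper; reads text on stdin, writes it to stdout.
#
# The four sed expressions, in order:
#   1. delete lines whose first non-blank character is '#';
#   2. leave a line untouched if a '#' sits inside a pair of double quotes;
#   3. same as 2, for single quotes;
#   4. otherwise scan from the start of the line, treating "backslash + any
#      character" as one unit, and cut at the first '#' that is not part of
#      such a unit. So '\#' is kept, while '\\#' starts a comment.
strip_comments() {
    sed -E \
        -e '/^[[:blank:]]*#/d' \
        -e '/"[^"]*#[^"]*"/b' \
        -e "/'[^']*#[^']*'/b" \
        -e 's/^(([^\\#]|\\.)*)#.*/\1/'
}
```

It could be run as strip_comments < file. Blanks that preceded a removed comment are left in place, so the "comment on midline" line keeps a trailing space. Because of the single-pair-of-quotes assumption, a line such as "a" # comment "b" would be left unchanged.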