I'd like to grep a file for a string, but ignore any matches on lines that do not end with a trailing newline character. In other words, if the file does not end with a newline character, I'd like to ignore the last line of the file.
What is the best way to do this?
I encountered this issue in a Python script that calls grep via the subprocess module to filter a large text log file before processing. The last line of the file might be mid-write, in which case I don't want to process that line.
Best Answer
`grep` is explicitly defined to ignore newlines, so you can't really use that. `sed` knows internally whether the current line (fragment) ends in a newline, but I can't see how it could be coerced to reveal that information. `awk` separates records by newlines (`RS`) but doesn't really care whether there was one: the default action of `print` is to print a newline (`ORS`) at the end in any case. So the usual tools don't seem too helpful here.
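However, `sed` does know when it's working on the last line, so if you don't mind losing the last intact line in cases where a partial one isn't seen, you could just have `sed` delete what it thinks is the last one (for example with its `$d` command) before the output reaches `grep`.
If that's not an option, then there's always Perl. With `-n`, Perl keeps the trailing newline in `$_`, so a one-liner along the lines of `perl -ne 'print if /pattern/ && /\n$/' file` should print only the lines that match `/pattern/` and have a newline at the end.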
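
Since the question mentions a Python script, the same newline check can also be done in Python itself instead of shelling out to grep. This is only a minimal sketch of that idea, not something from the answer above: the function name `grep_complete_lines` is hypothetical, and the pattern is treated as a Python regular expression rather than grep syntax.

```python
import re
from typing import Iterator


def grep_complete_lines(path: str, pattern: str) -> Iterator[str]:
    """Yield lines of `path` that match `pattern` and end with a newline.

    Hypothetical helper: `pattern` is a Python regex, not grep syntax.
    """
    regex = re.compile(pattern.encode())
    # Binary mode avoids any newline translation, so a final line without
    # b"\n" (possibly still being written) can be detected and skipped.
    with open(path, "rb") as handle:
        for raw in handle:
            if not raw.endswith(b"\n"):
                continue  # incomplete last line
            line = raw[:-1]  # strip the newline before matching
            if regex.search(line):
                yield line.decode(errors="replace")
```

Calling `list(grep_complete_lines("app.log", "ERROR"))`, where `app.log` is a placeholder path, would return the matching complete lines. The file is read lazily line by line, so this should stay usable on large logs, and unlike deleting the last line unconditionally it never drops a final line that is intact.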