How to make grep ignore lines without trailing newline character

grep

I'd like to grep a file for a string, but ignore any matches on lines that do not end with a trailing newline character. In other words, if the file does not end with a newline character, I'd like to ignore the last line of the file.

What is the best way to do this?

I encountered this issue in a python script that calls grep via the subprocess module to filter a large text log file before processing. The last line of the file might be mid-write, in which case I don't want to process that line.

Best Answer

grep is explicitly defined to ignore newlines, so you can't really use that. sed knows internally if the current line (fragment) ends in a newline or not, but I can't see how it could be coerced to reveal that information. awk separates records by newlines (RS), but doesn't really care if there was one, the default action of print is to print a newline (ORS) at the end in any case.

So the usual tools don't seem too helpful here.

However, sed does know when it's working on the last line, so if you don't mind losing the last intact line in cases where a partial one isn't seen, you could just have sed delete what it thinks is the last one. E.g.

sed -n -e '$d' -e '/pattern/p'  < somefile                   # or
< somefile sed '$d' | grep ...

If that's not an option, then there's always Perl. This should print only the lines that match /pattern/, and have a newline at the end:

perl -ne 'print if /pattern/ && /\n$/'
Related Question