How to make grep ignore lines without trailing newline character

grep

I'd like to grep a file for a string, but ignore any matches on lines that do not end with a trailing newline character. In other words, if the file does not end with a newline character, I'd like to ignore the last line of the file.

What is the best way to do this?

I encountered this issue in a Python script that calls grep via the subprocess module to filter a large text log file before processing. The last line of the file might be mid-write, in which case I don't want to process that line.

Best Answer

grep is explicitly defined to ignore newlines, so you can't really use that. sed knows internally whether the current line (fragment) ends in a newline, but I can't see how it could be coerced into revealing that information. awk separates records by newlines (RS) but doesn't care whether the last record actually had one: print always appends the output record separator (ORS), which is a newline by default.

So the usual tools don't seem too helpful here.

However, sed does know when it's working on the last line, so if you don't mind losing the last intact line in cases where a partial one isn't seen, you could just have sed delete what it thinks is the last one. E.g.

sed -n -e '$d' -e '/pattern/p'  < somefile                   # or
< somefile sed '$d' | grep ...
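Since the question mentions a Python caller, one way to avoid losing an intact last line is to check the file's final byte first and only use the `$d` variant when that byte is not a newline. This is a minimal sketch, not part of the original answer; the function name `ends_with_newline` is made up for illustration.

```python
import os


def ends_with_newline(path):
    """Return True if the file is non-empty and its last byte is a newline."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file: there is no complete line at all
        f.seek(-1, os.SEEK_END)  # step back one byte from the end
        return f.read(1) == b"\n"
```

Note that a writer may append to the file between this check and the grep call, so the check narrows the window rather than closing it.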

If that's not an option, then there's always Perl. This should print only the lines that match /pattern/, and have a newline at the end:

perl -ne 'print if /pattern/ && /\n$/'
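If the caller is already a Python script, the same filter can probably be done without a subprocess at all. The following is a rough equivalent of the Perl one-liner, assuming the pattern is given as a bytes regular expression; `complete_matching_lines` is a hypothetical name, not something from the original answer.

```python
import re


def complete_matching_lines(path, pattern):
    """Yield lines that match `pattern` and end with a newline.

    `pattern` must be a bytes regex because the file is read in binary mode,
    where lines are split on b"\\n" only and the terminator is kept.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        for line in f:
            # A partially written last line has no trailing newline: skip it.
            if line.endswith(b"\n") and regex.search(line):
                yield line
```

Usage would look like `for line in complete_matching_lines("somefile", rb"pattern"): ...`. The file is read line by line, so memory use stays small even for large logs.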

Related Solutions

Grep – How to Display Lines 2-4 After Each Grep Result

The simplest way to solve it using grep only is to pipe the output through one more inverted grep at the end. For example:

grep -A 4 "The mail system" temp.txt | grep -v "The mail system" | grep -v '^[0-9]*$'

How to search for text in a file ignoring newlines

GNU grep can do it:

grep -z 'is\san\sexample\sfile.' file

To address some points raised in the comments, here is a modified version:

grep -oz '^[^\n]*\bis\s*an\s*example\s*file\.[^\n]*' file

For huge files, I don't know what the memory limits are, but if that turns out to be a problem you can use sed instead:

sed '/\bis\b/{
          :1
          N
          /file\.\|\(\n.*\)\{3\}/!b1
         }
     /\<is\s*an\s*example\s*file\./p
     D' file

This keeps no more than 4 lines in memory at a time (because the pattern has 4 words), which is what \(\n.*\)\{3\} enforces.
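For comparison, the whole-file approach can be sketched in Python by joining the words with `\s+` so that a newline counts as ordinary whitespace. This is only an illustration under the assumption that the text fits in memory; `find_across_lines` is a made-up helper, and the words are expected to be plain words without punctuation.

```python
import re


def find_across_lines(text, words):
    """Return every occurrence of `words` in order, separated by any
    whitespace (spaces, tabs, or newlines)."""
    # Escape each word, then allow one or more whitespace characters between them.
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
    return [m.group(0) for m in re.finditer(pattern, text)]
```

Called as `find_across_lines(open("file").read(), ["is", "an", "example", "file"])`, it returns the matched text with its original line breaks intact.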
