Grep doesn’t match carriage return characters

cygwin;grep;newlines

I'm trying to find lines with the carriage return character, but I'm not getting the results I'd expect. I've whittled it down to this proof-of-concept:

$ uname -a
CYGWIN_NT-6.1 Aodh 2.0.4(0.287/5/3) 2015-06-09 12:22 x86_64 Cygwin

$ grep --version
grep (GNU grep) 2.21
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

$ od -c cr_poc.txt
0000000   h   e   l   l   o       w   o   r   l   d   ;  \r  \n  \r  \n
0000020

$ od -x cr_poc.txt
0000000 6568 6c6c 206f 6f77 6c72 3b64 0a0d 0a0d
0000020

$ grep $'\r' cr_poc.txt; echo $?
1

I've tried various other ways of grepping for the \r character, but none have worked.

Note that this is on Cygwin, which could well be part of the problem.

Best Answer

After poking around with various inputs, I suspected that grep was doing its own handling of line endings:

$ printf "foo\rbar\n" | grep -oz $'\r' | od -c
0000000  \r  \n
0000002
$ printf "foo\rbar\r\n" | grep -oz $'\r' | od -c
0000000
$ printf "foo\rbar\r" | grep -oz $'\r' | od -c
0000000  \r  \n  \r  \n
0000004

(The -z was my lame attempt to make grep match everything.) And so I searched the manpage for LF, leading me to:

-U, --binary
      Treat the file(s) as binary.  By default, under MS-DOS  and  MS-
      Windows,  grep  guesses the file type by looking at the contents
      of the first 32KB read from the file.  If grep decides the  file
      is  a  text  file, it strips the CR characters from the original
      file contents (to make regular expressions with  ^  and  $  work
      correctly).  Specifying -U overrules this guesswork, causing all
      files to be read and passed to the matching mechanism  verbatim;
      if  the  file is a text file with CR/LF pairs at the end of each
      line, this will cause some regular expressions  to  fail.   This
      option  has  no  effect  on  platforms other than MS-DOS and MS-
      Windows.
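
Going by that manpage text, passing -U should stop grep from stripping the CRs, so the pattern from the question can match. Here is a minimal sketch, assuming a POSIX shell with mktemp available; the temporary directory and the variable names are only for illustration, and I have not confirmed the result on every Cygwin build:

```shell
# Work in a scratch directory so an existing cr_poc.txt is not overwritten.
tmpdir=$(mktemp -d)
poc="$tmpdir/cr_poc.txt"

# Recreate the file from the question: "hello world;" plus two CR/LF pairs.
printf 'hello world;\r\n\r\n' > "$poc"

# Hold a literal CR in a variable; this also works in shells without $'\r'.
cr=$(printf '\r')

# -U (--binary) hands the file to the matcher verbatim, CRs included.
# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -U -c "$cr" "$poc")
echo "lines containing CR: $count"

rm -r "$tmpdir"
```

On platforms other than MS-DOS and MS-Windows, the manpage says -U has no effect, so the same command should behave there exactly as it does without the option.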