Troubleshooting Text File Marked as Binary – Linux, Grep

grep, linux

I have an executable that generates a text file as its output. The problem is that the text file comes out with a binary file flag of some sort. The result is something like this:

$ grep "grep string" output_file.txt
Binary file output_file.txt matches.

$ grep -a "grep string" output_file.txt
[correct results]

Some reading has indicated that grep looks for a null character in the first thousand or so bytes, then determines from that whether or not a file is 'binary', so my question is two-fold:

  1. Is there an easy way to strip null characters from my files (I can do this as part of my post-processing) to ensure that grep works correctly without the -a flag?

  2. Is there something obvious I should look for in my code to prevent null characters from being written to the file? I've looked through the code quite thoroughly and I don't see any obvious culprits.


Best Answer

I can answer at least the first question. On Unix/Linux you can use tr:

tr -d '\000' < filein > fileout

where \000 is the octal escape for the null character. You can also strip all non-printable characters, as shown in the example in "Unix Text Editing: sed, tr, cut, od, awk".
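To expand on stripping non-printable characters, here is a minimal sketch. It assumes the file is plain ASCII (multi-byte UTF-8 characters would be deleted too), and the function name strip_nonprintable is just a placeholder of mine:

```shell
# Keep tab (\11), newline (\12), carriage return (\15) and printable
# ASCII (\40 through \176); delete every other byte, including NUL.
# LC_ALL=C makes tr treat the input as raw bytes.
strip_nonprintable() {
  LC_ALL=C tr -cd '\11\12\15\40-\176'
}
```

You would call it the same way as the tr command above: strip_nonprintable < filein > fileout. After that, grep should no longer need the -a flag for that file.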

Regarding your second question, I don't know which programming language you are using, but I'd look for uninitialized variables that could end up being printed to the output file.
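If it helps narrow down where the nulls come from, a rough sketch like the following counts them and reports the byte offset of the first one, which you can compare against what your program writes at that point. It assumes GNU od (the -w option is a GNU extension), and the function names are made up for illustration:

```shell
# Count the NUL bytes in a file: delete everything except NUL, then count bytes.
count_nuls() {
  LC_ALL=C tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Print the decimal byte offset (0-based) of the first NUL byte, if any.
# od prints one byte per line as "<offset> <hex>"; awk picks the first "00".
first_nul_offset() {
  od -A d -v -t x1 -w1 "$1" | awk '$2 == "00" { print $1 + 0; exit }'
}
```

For example, count_nuls output_file.txt followed by first_nul_offset output_file.txt tells you how many nulls there are and where the first one sits. If first_nul_offset prints nothing, the file contains no null bytes.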
