Colored grep output: not GREP_OPTIONS, not alias

I want colored output of grep.

…. But

  • Strategy 1: GREP_OPTIONS. But this is deprecated. See http://www.gnu.org/software/grep/manual/html_node/Environment-Variables.html
  • Strategy 2: GREP_COLORS looks like a solution at first sight, but it does something different (it controls which colors are used, not whether output is colored).
  • Strategy 3: alias. This does not work for find ... | xargs grep, since xargs does not evaluate aliases.
  • Strategy 4: Write a simple wrapper script. No, I think this is too dirty and makes more trouble than it solves.
  • Strategy 5: patch the source code
  • Strategy 6: Contact grep developers, ask for a replacement of GREP_OPTIONS
  • Strategy NICE-and-EASY: … this is missing. I have no clue.

How to solve this?

Best Answer

Some of the reasons OP gives for rejecting these options have no basis in reality. Here, I show what effects using OP's strategy 4 actually has:


On most distributions, grep is installed in /bin (typical) or /usr/bin (OpenSUSE, maybe others), and the default PATH lists /usr/local/bin before /bin and /usr/bin. This means that you can override grep for all users by creating /usr/local/bin/grep with

#!/bin/sh
exec /bin/grep --color=auto "$@"

where /bin/sh is a POSIX-compatible shell provided by your distribution, usually bash or dash. If grep is in /usr/bin, then make that

#!/bin/sh
exec /usr/bin/grep --color=auto "$@"

The overhead of this script is minimal. The exec statement means that the script interpreter is replaced by the grep binary; this means that the shell does not remain in memory while grep is being executed. Thus, the only overhead is one extra execution of the script interpreter, i.e. a small latency in wall clock time. The latency is roughly constant (varies only depending on whether grep and sh are already in page cache or not, and on how much I/O bandwidth is available), and does not depend on how long grep executes or how much data it processes.
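To see that exec really replaces the shell instead of starting a child process, a small sketch can compare process IDs. The script name wrapper and the temporary directory are made up for illustration, and sh -c stands in for grep so that the exec target can report its own PID:

```shell
#!/bin/sh
# Hypothetical demo: a wrapper that reports its PID, then execs another
# program (sh -c stands in for grep) which reports its PID as well.
tmpdir=$(mktemp -d)
cat > "$tmpdir/wrapper" <<'EOF'
#!/bin/sh
echo "wrapper: $$"
exec sh -c 'echo "target: $$"'
EOF

# Run it through sh explicitly, so the demo also works where the
# temporary directory is mounted noexec.
output=$(sh "$tmpdir/wrapper")
echo "$output"
rm -rf "$tmpdir"
```

Both lines report the same PID: the exec target takes over the wrapper's process, so no shell stays resident while it runs.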

So, how long is that latency, i.e. the overhead added by the wrapper script?

To find out, create the above script, and run

time /bin/grep --version
time /usr/local/bin/grep --version

On my machine, the former takes 0.005s real time (across a large number of runs), whereas the latter takes 0.006s real time. Thus, the overhead of using the wrapper on my machine is 0.001s (or less) per invocation.

This is insignificant.
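A single time run is noisy, so averaging over many invocations gives a steadier figure. The following is only a rough sketch under some assumptions: it relies on GNU date for nanosecond timestamps (%N), the run count of 100 is arbitrary, the helper name avg_ns is made up, and an inline sh -c one-liner stands in for the wrapper script file, so it slightly understates the cost of reading a script from disk.

```shell
#!/bin/sh
# Rough sketch: average wall-clock time per invocation, direct vs. wrapped.
# Assumes GNU date (for %N) and a grep that accepts --color=auto.
runs=100

# avg_ns CMD...: run CMD $runs times, print the average nanoseconds per run.
avg_ns() {
    start=$(date +%s%N)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s%N)
    echo $(( (end - start) / runs ))
}

real_grep=$(command -v grep)

# Direct invocation of the grep binary.
direct=$(avg_ns "$real_grep" --version)

# Same invocation through an extra shell that execs grep, as the wrapper does.
# Inside sh -c, $0 is the grep path and "$@" holds the remaining arguments.
wrapped=$(avg_ns sh -c 'exec "$0" --color=auto "$@"' "$real_grep" --version)

echo "direct:  $direct ns per run"
echo "wrapped: $wrapped ns per run"
```

The absolute numbers depend on the machine and on what is already in the page cache; the difference between the two lines is the part that corresponds to the wrapper overhead.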

I also fail to see anything "dirty" about this, because many common applications and utilities use the same approach. To list the shell scripts in /bin and /usr/bin on your machine, just run

file /bin/* /usr/bin/* | sed -ne 's/:.*shell script.*$//p'

On my machine, the above output includes egrep, fgrep, zgrep, which, 7z, chromium-browser, ldd, and xfig, which I use quite often. Unless you consider your entire distribution "dirty" for relying on wrapper scripts, you have no reason to consider such wrapper scripts "dirty".


As to problems such a wrapper script may cause:

If only human users (as opposed to scripts) need a grep that defaults to colored output when writing to a terminal, then the wrapper script can be given its own name: colorgrep, cgrep, or whatever the OP sees fit.

This avoids all possible compatibility issues, because the behaviour of grep does not change at all.
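A minimal sketch of this separate-name variant, assuming a grep that accepts --color=auto (GNU grep does). A throwaway directory stands in for /usr/local/bin so that it can be tried without root, and colorgrep is simply the example name used above:

```shell
#!/bin/sh
# Locate the real grep before touching PATH (/bin/grep or /usr/bin/grep
# on most systems).
real_grep=$(command -v grep)

# Throwaway directory standing in for /usr/local/bin.
bindir=$(mktemp -d)
cat > "$bindir/colorgrep" <<EOF
#!/bin/sh
exec "$real_grep" --color=auto "\$@"
EOF
chmod +x "$bindir/colorgrep"
PATH="$bindir:$PATH"

# Plain grep still resolves to the original binary...
grep_after=$(command -v grep)
echo "grep:      $grep_after"

# ...while colorgrep is the new wrapper. Its output is captured here rather
# than sent to a terminal, so --color=auto emits no escape sequences.
match=$(printf 'foo\nbar\n' | colorgrep foo)
echo "colorgrep: $match"
rm -rf "$bindir"
```

Because nothing named grep is added to PATH, existing scripts keep calling the unmodified binary.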


Enabling grep options with a wrapper script, but in a way that avoids any new problems:

We can easily rewrite the wrapper script to support a custom GREP_OPTS variable, even if GREP_OPTIONS were not supported (it is already deprecated). This way users can simply add export GREP_OPTS="--color=auto" or similar to their profile. /usr/local/bin/grep is then

#!/bin/sh
exec /bin/grep $GREP_OPTS "$@"

Note that there are no quotes around $GREP_OPTS, so that users can specify more than one option.

On my system, executing time /usr/local/bin/grep --version with GREP_OPTS empty, or with GREP_OPTS=--color=auto, is just as fast as the previous version of the wrapper script; i.e., it typically takes one millisecond longer to execute than plain grep.
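The effect of leaving the variable unquoted can be checked with a short sketch. The helper count_args and the option values are made up for illustration; the point is only how many separate arguments the command receives:

```shell
#!/bin/sh
# Print the number of arguments received.
count_args() { echo "$#"; }

GREP_OPTS="--color=auto --line-number"

unquoted=$(count_args $GREP_OPTS "pattern")    # split into separate words
quoted=$(count_args "$GREP_OPTS" "pattern")    # kept as a single word
echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"

# An empty variable vanishes entirely when unquoted, which is why the
# wrapper also works when no options are configured.
GREP_OPTS=""
empty=$(count_args $GREP_OPTS "pattern")
echo "empty:    $empty argument"
```

The unquoted form passes each option as its own argument, while the quoted form would hand grep one unusable string. The trade-off is that an unquoted expansion is also subject to pathname expansion and cannot carry an option value that itself contains spaces.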

This last version is the one I'd personally recommend for use.


In summary, OP's strategy 4:

  • is already recommended by the grep developers

  • is trivial to implement (two lines)

  • has insignificant overhead (one millisecond extra latency per invocation on this particular laptop; easily verified on each machine)

  • can be implemented as a wrapper script that adds GREP_OPTS support (to replace deprecated/unsupported GREP_OPTIONS)

  • can be implemented as a separately named command (colorgrep/cgrep) that does not affect scripts or existing users at all

Because this technique is already widely used in Linux distributions, it is common practice and not "dirty".

If implemented as a separate wrapper (colorgrep/cgrep), it cannot create new problems since it does not affect grep behaviour at all. If implemented as a wrapper script that adds GREP_OPTS support, using GREP_OPTS=--color=auto has exactly the same risks (wrt. problems with existing scripts) that upstream adding default --color=auto would. Thus, the comment that this "creates more problems than it solves" is completely incorrect: no additional problems are created.