grep – Can grep Return True/False or Alternative Methods?

grep

As a part of this script, I need to be able to check if the first argument given matches the first word of file. If it does, exit with an error message; if it doesn't, append the arguments to the file. I understand how to write the if statement, but not how to use grep within a script. I understand that grep will look something like this

grep ^$1 schemas.txt

I feel like this should be much easier than I am making it.

I'm getting an error "too many arguments" on the if statement. I got rid of the space between grep -q and then got an error binary operator expected.

if [ grep -q ^$1 schemas.txt ]
then
        echo "Schema already exists. Please try again"
        exit 1
else
        echo "$@" >> schemas.txt
fi

Best Answer

grep returns a different exit code if it found something (zero) vs. if it hasn't found anything (non-zero). In an if statement, a zero exit code is mapped to "true" and a non-zero exit code is mapped to false. In addition, grep has a -q argument to not output the matched text (but only return the exit status code)

So, you can use grep like this:

if grep -q PATTERN file.txt; then
    echo found
else
    echo not found
fi

As a quick note, when you do something like if [ -z "$var" ]…, it turns out that [ is actually a command you're running, just like grep. On my system, it's /usr/bin/[. (Well, technically, your shell probably has it built-in, but that's an optimization. It behaves as if it were a command). It works the same way, [ returns a zero exit code for true, a non-zero exit code for false. (test is the same thing as [, except for the closing ])

Related Solutions

Grep Performance – Search in Thousands of Files Efficiently

With find:

cd /the/dir
find . -type f -exec grep pattern {} +

(-type f is to only search in regular files (also excluding symlinks even if they point to regular files). If you want to search in any type of file except directories (but beware there are some types of files like fifos or /dev/zero that you generally don't want to read), replace -type f with the GNU-specific ! -xtype d (-xtype d matches for files of type directory after symlink resolution)).

With GNU grep:

grep -r pattern /the/dir

(but beware that unless you have a recent version of GNU grep, that will follow symlinks when descending into directories). Non-regular files won't be searched unless you add a -D read option. Recent versions of GNU grep will still not search inside symlinks though.

Very old versions of GNU find did not support the standard {} + syntax, but there you could use the non-standard:

cd /the/dir &&
  find . -type f -print0 | xargs -r0 grep pattern

Performances are likely to be I/O bound. That is the time to do the search would be the time needed to read all that data from storage.

If the data is on a redundant disk array, reading several files at a time might improve performance (and could degrade them otherwise). If the performances are not I/O bound (because for instance all the data is in cache), and you have multiple CPUs, concurrent greps might help as well. You can do that with GNU xargs's -P option.

For instance, if the data is on a RAID1 array with 3 drives, or if the data is in cache and you have 3 CPUs whose time to spare:

cd /the/dir &&
  find . -type f -print0 | xargs -n1000 -r0P3 grep pattern

(here using -n1000 to spawn a new grep every 1000 files, up to 3 running in parallel at a time).

However note that if the output of grep is redirected, you'll end up with badly interleaved output from the 3 grep processes, in which case you may want to run it as:

find . -type f -print0 | stdbuf -oL xargs -n1000 -r0P3 grep pattern

(on a recent GNU or FreeBSD system) or use the --line-buffered option of GNU grep.

If pattern is a fixed string, adding the -F option could improve matters.

If it's not multi-byte character data, or if for the matching of that pattern, it doesn't matter whether the data is multi-byte character or not, then:

cd /the/dir &&
  LC_ALL=C grep -r pattern .

could improve performance significantly.

If you end up doing such searches often, then you may want to index your data using one of the many search engines out there.

How to find files that contain one criterion but exclude a different criterion

Since the exit status of grep indicates whether or not it found a match, you should be able to test that directly as a find predicate (with the necessary negation, ! or -not) e.g.

find . -type f -name "*.c" \( -exec grep -q "ABC" {} \; ! -exec grep -q "123" {} \; \) -print

-q makes grep exit silently on the first match - we don't need to hear from it because we let find print the filename.

Best Answer

Related Solutions

Grep Performance – Search in Thousands of Files Efficiently

How to find files that contain one criterion but exclude a different criterion

Related Question