Bash – Efficient way to search for a string in files using find and grep

bash, command-line, find, grep, unix

I am searching for all files containing a specific string on a filer (on an old HP-UX workstation).

I do not know where the files are located in the file system (there are many directories, with a huge number of scripts, plain-text and binary files).

Note that the grep -R option does not exist on this system, so I am using find and grep to determine which files contain my string:

find . -type f -exec grep -i "mystring" {} \;

I am not satisfied with this command: it is too slow, and it does not print the name and path of the file in which grep matched my string.
Moreover, any error is echoed to my console output.

So I thought that I could do better:

find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;

But it is very slow.

Do you have a more efficient alternative to this command?

Thank you.

Best Answer

The fastest I can come up with is to use xargs, which passes many file names to each grep invocation instead of starting one grep process per file:

find . -type f -print0  | xargs -0 grep -Fil "mypattern" 

Running some benchmarks on a directory containing 3631 files:

$ time find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;

real    0m15.012s
user    0m4.876s
sys     0m1.876s

$ time find . -type f -exec grep -Fli "mystring" {} 2>/dev/null \;

real    0m13.982s
user    0m4.328s
sys     0m1.592s


$ time find . -type f -print0  | xargs -0 grep -Fil "mystring" >/dev/null 

real    0m3.565s
user    0m3.508s
sys     0m0.052s
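
Note that -print0 and -0 are GNU extensions and may not be available on an older HP-UX system. A minimal sketch of the same batching idea using the POSIX `-exec ... {} +` form, assuming the local find supports it (the temporary directory and file names below are hypothetical sample data, and `mktemp -d` is itself an assumption about the environment):

```shell
# Build a small throwaway tree to search (hypothetical sample data).
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'contains MyString here\n' > "$demo/sub dir/hit.txt"
printf 'nothing relevant\n' > "$demo/miss.txt"

# "{} +" makes find pass many file names to each grep invocation,
# instead of starting one grep per file as "{} \;" does.
# File names containing spaces are handled safely because no shell
# word splitting is involved.
matches=$(cd "$demo" && find . -type f -exec grep -Fil "mystring" {} + 2>/dev/null)
printf '%s\n' "$matches"

# Clean up the sample tree.
rm -rf "$demo"
```

Whether this is as fast as the xargs pipeline on a given system is not something the benchmarks above measure, so it is worth timing both.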

Your other options are to streamline the search, either by limiting the file list with find tests such as:

   -executable
          Matches files which are executable and directories
          which are searchable (in a file name resolution sense).
   -writable
          Matches files which are writable.             

   -mtime n
          File's  data was last modified n*24 hours ago.
          See the comments for -atime to understand  how
          rounding  affects  the  interpretation of file
          modification times.
   -group gname
          File  belongs to group gname (numeric group ID
          allowed).
   -perm /mode
          Any  of  the  permission bits mode are set for
          the file.  Symbolic modes are accepted in this
          form.  You must specify `u', `g' or `o' if you
          use a symbolic mode. 
   -size n[cwbkMG]  <-- you can set a minimum or maximum size
          File uses n units  of  space.  
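
As an illustration of combining such tests with the xargs pipeline, here is a minimal sketch that only hands grep files under 1 MiB that were modified within the last 7 days. The thresholds, the temporary directory and the file names are hypothetical choices for the example, not recommendations:

```shell
# Build a throwaway tree with one small and one large file (hypothetical data).
demo=$(mktemp -d)
printf 'mystring in a small file\n' > "$demo/small.txt"
# About 2 MiB of zero bytes preceded by the search string.
{ printf 'mystring\n'; head -c 2097152 /dev/zero; } > "$demo/big.bin"

# -size -1048576c : fewer than 1048576 bytes (the "c" unit counts bytes)
# -mtime -7       : data modified less than 7 days ago
# Files failing either test never reach grep at all.
small_hits=$(cd "$demo" && find . -type f -size -1048576c -mtime -7 -print0 \
    | xargs -0 grep -Fil "mystring" 2>/dev/null)
printf '%s\n' "$small_hits"

# Clean up the sample tree.
rm -rf "$demo"
```

The large file also contains the string but is skipped by the size test, which is the trade-off of this approach: it is only safe when you know the files you care about fall inside the limits.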

Or by tweaking grep:

You are already using grep's -l option, which causes the file name to be printed and, more importantly, stops scanning each file at the first match:

   -l, --files-with-matches
       Suppress normal output; instead print the name of each input file  from
       which  output would normally have been printed.  The scanning will stop
       on the first match.  (-l is specified by POSIX.)

The only other thing I can think of to speed things up would be to make sure your pattern is not interpreted as a regex (as suggested by @suspectus) by using the -F option.
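
Keep in mind that -F changes what matches, not only how fast grep runs. A small sketch of the difference, using two made-up sample lines:

```shell
# Two sample lines: only the first contains the literal text "a.c".
sample=$(printf 'a.c\nabc\n')

# Without -F the dot is a regex wildcard, so both lines match.
regex_count=$(printf '%s\n' "$sample" | grep -c 'a.c')

# With -F the pattern is a fixed string, so only the literal "a.c" matches.
fixed_count=$(printf '%s\n' "$sample" | grep -cF 'a.c')

printf 'regex: %s, fixed: %s\n' "$regex_count" "$fixed_count"
# prints "regex: 2, fixed: 1"
```

If the string you are looking for contains characters such as `.`, `*` or `[`, -F is therefore also the safer choice.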
