Recursive grep for words in a particular file type

command-line grep

I wanted a command to search all shell scripts in the filesystem for a particular word, so I asked around at work and got the following solutions:

grep word `find / -name \*.sh 2>/dev/null`
find / -name "*.sh" 2>/dev/null | xargs grep word

However, I'm not that familiar with the command line, so both of these solutions seem opaque to me. I'd prefer to do something that looks like:

ls -r *.sh | cat | grep -H word

But it seems that you can't pipe filenames into cat (at least I think that's what the problem is).

What is the most legible solution? And secondly, what is the most efficient solution?

Edit: I needed to know which file the word was found in, so I could modify the script.

Best Answer

Edit: If you have GNU utilities, see Gilles' answer for a method using GNU grep's recursion abilities that is much simpler than the find approach. If you only want to display filenames, you'll still want to add the -l option as I describe below.
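Gilles' answer is not reproduced here, so the following is only a sketch of what a GNU grep recursive search typically looks like, assuming it relies on the -r and --include options. The scratch directory, the file names and the word needle are made up for the demonstration.

```shell
# Assumes GNU grep: -r and --include are extensions, not POSIX grep options.
# All paths and the search word below are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'echo needle\n' > "$demo/sub/a.sh"
printf 'needle\n'      > "$demo/notes.txt"

# -r recurses into directories, --include restricts the search to file
# names matching the glob, and -l prints only the names of matching files.
gnu_matches=$(grep -rl --include='*.sh' needle "$demo" 2>/dev/null)
printf '%s\n' "$gnu_matches"

rm -rf "$demo"
```

Only sub/a.sh should be listed; notes.txt contains the word but does not match the *.sh glob.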


Use grep -l word to only print names of files containing a match.

If you want to find all files in the file system ending in .sh, starting at the root /, then find is the most appropriate tool.

The most portable and efficient recommendation is:

find / -type f -name '*.sh' -exec grep -l word {} + 2>/dev/null

This is about as readable as it gets, and is not hard to parse if you understand the semantics behind each of the components.

  • find /: run find starting at the file system root, /
  • -type f: only match regular files
  • -name '*.sh': ... and only match files whose names end in .sh
  • -exec ... {} +: run the command specified in ... on the matched files in groups, where {} is replaced by the file names in the group. The idea is to run the command on as many files at once as the system's limits (ARG_MAX) allow. The {} + form is efficient because it minimizes the number of times the ... command is invoked by passing as many files as possible to each invocation.
  • grep -l word {}: where the {} is the same {} repeated from above and is replaced by file names. As previously explained, grep -l prints the names of files containing a match for word.
  • 2>/dev/null: hide error messages (technically, redirect standard error to the black hole that is /dev/null). This is for aesthetic and practical reasons, since running find on / will likely result in reams of "permission denied" messages you may not care about for files which you do not have permission to read and directories you do not have permission to traverse.
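To try the command without scanning the whole file system, here is a minimal sketch that runs the same find invocation against a scratch directory instead of /. The directory layout, the file names and the word needle are invented for illustration, and mktemp -d is assumed to be available (it is common but not required by POSIX).

```shell
# Build a throwaway tree; every name here is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'echo needle\n' > "$demo/a.sh"      # .sh file containing the word
printf 'echo hay\n'    > "$demo/sub/b.sh"  # .sh file without the word
printf 'needle\n'      > "$demo/notes.txt" # has the word, wrong extension

# Same command as above, with the scratch directory in place of /.
matches=$(find "$demo" -type f -name '*.sh' -exec grep -l needle {} + 2>/dev/null)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Only a.sh should be printed: b.sh lacks the word, and notes.txt is filtered out by -name '*.sh' before grep ever sees it.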

There are some problems with the suggestions you received and posted in your question. Both

grep word `find / -name \*.sh 2>/dev/null`

and

find / -name "*.sh" 2>/dev/null | xargs grep word

fail on files with whitespace in their names. It's best to avoid putting filenames in command substitution altogether. The first one has the additional problem of potentially running into the ARG_MAX limit. The second one is close to what I suggest, but there is no good reason to use xargs here, not to mention that safe and correct usage of xargs requires sacrificing portability for the -print0 and -0 options (find -print0 | xargs -0), which began as GNU extensions and are not available on every system.
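The whitespace problem is easy to reproduce. The sketch below uses a made-up file called "two words.sh" in a scratch directory and compares the plain xargs pipeline with the -exec ... {} + form; the names and the word needle are assumptions for the demo only.

```shell
# Scratch directory with one script whose name contains a space (hypothetical).
demo=$(mktemp -d)
printf 'echo needle\n' > "$demo/two words.sh"

# xargs splits its input on whitespace, so grep is handed "./two" and
# "words.sh", neither of which exists; the match is silently lost.
# "|| true" only keeps the non-zero exit status from aborting a strict shell.
broken=$(cd "$demo" && find . -type f -name '*.sh' 2>/dev/null | xargs grep -l needle 2>/dev/null) || true

# -exec ... {} + passes each file name to grep as a single argument.
fixed=$(cd "$demo" && find . -type f -name '*.sh' -exec grep -l needle {} + 2>/dev/null)

printf 'xargs: [%s]\n' "$broken"
printf 'exec:  [%s]\n' "$fixed"

rm -rf "$demo"
```

The xargs line should come back empty while the -exec line reports ./two words.sh, which is the failure mode described above.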