The fastest approach I can come up with is to use xargs to share the load:
find . -type f -print0 | xargs -0 grep -Fil "mypattern"
Running some benchmarks on a directory containing 3631 files:
$ time find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;
real 0m15.012s
user 0m4.876s
sys 0m1.876s
$ time find . -type f -exec grep -Fli "mystring" {} 2>/dev/null \;
real 0m13.982s
user 0m4.328s
sys 0m1.592s
$ time find . -type f -print0 | xargs -0 grep -Fil "mystring" >/dev/null
real 0m3.565s
user 0m3.508s
sys 0m0.052s
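The gap in these timings is most likely process startup: `-exec ... \;` starts one grep per file, while xargs hands many file names to each grep. As a sketch (it was not part of the benchmark above, so no timing is claimed for it), find can do the same batching on its own with the `-exec ... {} +` form. The function name `grep_batched` and its arguments are made up for illustration:

```shell
# Hypothetical helper: list files under directory $2 that contain the
# fixed string $1 (case-insensitive), batching file names like xargs does.
grep_batched() {
    # '{} +' appends as many file names as fit to a single grep invocation,
    # instead of starting a new grep for every file as '\;' does.
    find "$2" -type f -exec grep -Fil -- "$1" {} +
}

# Example: grep_batched "mystring" .
```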
Your other options would be to streamline the search, either by limiting the file list using find:
-executable
Matches files which are executable and directories
which are searchable (in a file name
resolution sense).
-writable
Matches files which are writable.
-mtime n
File's data was last modified n*24 hours ago.
See the comments for -atime to understand how
rounding affects the interpretation of file
modification times.
-group gname
File belongs to group gname (numeric group ID
allowed).
-perm /mode
Any of the permission bits mode are set for
the file. Symbolic modes are accepted in this
form. You must specify `u', `g' or `o' if you
use a symbolic mode.
-size n[cwbkMG] <-- you can set a minimum or maximum size
File uses n units of space.
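As a sketch of how such tests combine with the xargs pipeline, the helper below searches only small, recently modified files. The thresholds (1024 KiB, 7 days) and the name `grep_recent_small` are arbitrary choices for illustration, and `-r` assumes GNU xargs:

```shell
# Hypothetical helper: search only regular files under $2 that are smaller
# than 1024 KiB and were modified within the last 7 days.
grep_recent_small() {
    # -size -1024k : fewer than 1024 one-KiB units (find rounds sizes up to
    #                whole units, so '-size -1M' would match only empty files)
    # -mtime -7    : data last modified less than 7*24 hours ago
    # xargs -r     : GNU extension, do not run grep if no file passes the tests
    find "$2" -type f -size -1024k -mtime -7 -print0 |
        xargs -0 -r grep -Fil -- "$1"
}

# Example: grep_recent_small "mystring" .
```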
Or by tweaking grep:
You are already using grep's -l option, which causes the file name to be printed and, more importantly, stops at the first match:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from
which output would normally have been printed. The scanning will stop
on the first match. (-l is specified by POSIX.)
The only other thing I can think of to speed things up would be to make sure your pattern is not interpreted as a regex (as suggested by @suspectus) by using the -F option.
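To see what -F changes, here is a small sketch on made-up input: without -F the dot in the pattern is a regex wildcard, with -F it is a literal character.

```shell
# Made-up sample input: only the first line contains the literal text "a.c".
sample=$(printf 'a.c\nabc\n')

# Regex mode: '.' matches any character, so both lines match.
regex_hits=$(printf '%s\n' "$sample" | grep -c 'a.c')

# Fixed-string mode: '.' is literal, so only the first line matches.
fixed_hits=$(printf '%s\n' "$sample" | grep -cF 'a.c')

echo "regex: $regex_hits, fixed: $fixed_hits"   # prints: regex: 2, fixed: 1
```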
Best Answer
You're close. To get a total count of all occurrences of "ha" within all .txt files in a folder:
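A minimal sketch of such a command, assuming it combines grep -o with wc -l as the explanation that follows describes. The function wrapper `count_total` is only there for illustration:

```shell
# Hypothetical wrapper: count every occurrence of pattern $1 in the given files.
count_total() {
    pattern=$1
    shift
    # -o prints each match on its own line; wc -l then counts those lines.
    grep -o -- "$pattern" "$@" | wc -l
}

# Example: count_total "ha" *.txt
```

For a file holding `ha ha ha` on one line and `ha` on the next, this counts 4, whereas `grep -c "ha"` reports 2, the number of matching lines.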
From man grep, the -o (--only-matching) option prints only the matched parts of a matching line, with each such part on a separate output line. This works because each match is therefore printed on its own line, allowing wc -l to count all of them.
By default, however, grep outputs each matching line once, as a whole line, no matter how many times the pattern occurs on it. Likewise, option -c counts matching lines rather than matches: it outputs how many lines had 1 (or more) matches.
EDIT:
Here is a way to print the total number of occurrences within each individual file (with filenames):
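A sketch of such a command, pieced together from the explanation below rather than taken verbatim from the answer. The `name: count` output format and the function name `count_per_file` are assumptions, and -printf requires GNU find:

```shell
# Hypothetical wrapper: run in a directory that contains .txt files.
count_per_file() {
    # For every .txt file, -printf emits one shell command of the form
    #   echo "NAME: $(grep -o "ha" NAME | wc -l)"
    # and the final sh executes those generated commands.
    find *.txt -printf 'echo "%p: $(grep -o "ha" %p | wc -l)"\n' | sh
}

# Example: cd into the folder, then run: count_per_file
```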
Explanation:
find *.txt - finds .txt files
-printf - prints everything between the single quotes (formatted) to standard output, replacing occurrences of %p with find's output (file names)
$(grep -o "ha" %p | wc -l) - works as above
| sh - the output from -printf (which are commands) is piped to a shell and executed
Note that printf is invoked once per filename.
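Because this approach pastes file names into generated shell code, it can misbehave on names containing spaces or quotes. A plain loop avoids that. This is an alternative sketch, not the method from the answer, and `count_per_file_loop` is a made-up name:

```shell
# Hypothetical alternative: per-file occurrence counts without generating
# shell code, so file names with spaces are handled safely.
count_per_file_loop() {
    for f in *.txt; do
        # If no .txt file exists, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        printf '%s: %s\n' "$f" "$(grep -o -- "ha" "$f" | wc -l)"
    done
}

# Example: cd into the folder, then run: count_per_file_loop
```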