You could do the following using some implementations of find and xargs, like this:
$ find . -type f -print0 | xargs -r0 ./myscript
or, standardly, with just find:
$ find . -type f -exec ./myscript {} +
Example
Say I have the following sample directory.
$ tree
.
|-- dir1
| `-- a\ file1.txt
|-- dir2
| `-- a\ file2.txt
|-- dir3
| `-- a\ file3.txt
`-- myscript
3 directories, 4 files
Now let's say I have this for ./myscript:
#!/bin/bash
for i in "$@"; do
    echo "file: $i"
done
Now when I run the following command.
$ find . -type f -print0 | xargs -r0 ./myscript
file: ./dir2/a file2.txt
file: ./dir3/a file3.txt
file: ./dir1/a file1.txt
file: ./myscript
Or when I use the 2nd form like so:
$ find . -type f -exec ./myscript {} +
file: ./dir2/a file2.txt
file: ./dir3/a file3.txt
file: ./dir1/a file1.txt
file: ./myscript
Details
find + xargs
The above 2 methods, though they look different, are essentially the same. In the first, the -print0 switch makes find terminate each file name with a NUL character (\0) instead of a newline, and xargs -0 is specifically designed to take input that's delimited by NULs. That non-standard syntax was introduced by GNU find and xargs but is nowadays also found in a few others, such as most recent BSDs. With the GNU tools, the -r option is required to avoid calling myscript if find finds nothing; with the BSDs it is not.
NOTE: This approach runs ./myscript only once as long as the combined list of file names isn't exceedingly long. If it is, xargs kicks off a 2nd invocation of ./myscript (and further ones if needed) with the remaining results from find.
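The splitting described in the note can be seen on a small scale by forcing a tiny batch size. This is only a sketch: -n 2 stands in for the real size limit, and the inline sh -c script is a hypothetical stand-in for ./myscript.

```shell
# Five NUL-terminated names, at most two per invocation
# (-n 2 simulates running into the command-line size limit).
out=$(printf '%s\0' one two three four five |
  xargs -0 -n 2 sh -c 'echo "invocation with $# argument(s): $*"' sh)
printf '%s\n' "$out"
```

Each line of output corresponds to one separate run of the command, so a script that needs to see all files at once should not rely on being called only once.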
find with +
That's the standard way (though it was only added relatively recently, in 2005, to the GNU implementation of find). The ability to do what we're doing with xargs is literally built into find. So find will find a list of files and pass as many of them as can fit as arguments to the command specified after -exec (note that {} must come last, just before the +, in this case), running the command several times if needed.
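As a quick check of that batching, here is a minimal sketch that builds a throwaway tree and lets an inline script report how many arguments it received. The temporary directory and the sh -c one-liner are assumptions made for illustration; they are not part of the original example.

```shell
# Build a throwaway tree with a space in each file name.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/a file1.txt" "$tmp/dir2/a file2.txt"

# {} + hands both files to a single sh invocation as separate arguments.
out=$(find "$tmp" -type f -exec sh -c 'echo "received $# arguments"' sh {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The trailing sh after the script text fills $0, so the file names start at $1.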
Why no quoting?
In the first example we're taking a shortcut by completely avoiding the quoting issues, using NULs to separate the arguments. When xargs is given this list, it's instructed to split on the NULs, effectively protecting our individual command atoms.
In the second example we're keeping the results internal to find, so it knows what each file atom is and is guaranteed to handle them appropriately, thereby avoiding the whole business of quoting them.
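To see why the NUL separator matters, the following sketch feeds the same file name to xargs twice, once newline-terminated and once NUL-terminated, and counts the arguments that arrive. The sample name is taken from the directory listing above; the counting one-liner is my own illustration.

```shell
name='./dir1/a file1.txt'

# Newline-separated input: xargs also splits on the blank, giving two arguments.
plain=$(printf '%s\n' "$name" | xargs sh -c 'echo "$#"' sh)

# NUL-separated input: the name arrives intact as one argument.
nul=$(printf '%s\0' "$name" | xargs -0 sh -c 'echo "$#"' sh)

echo "without -0: $plain argument(s); with -0: $nul argument(s)"
```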
Maximum size of command line?
This question comes up from time to time, so as a bonus I'm adding it to this answer, mainly so I can find it in the future. You can use GNU xargs to see what the environment's limits are, like so:
$ xargs --show-limits
Your environment variables take up 4791 bytes
POSIX upper limit on argument length (this system): 2090313
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2085522
Size of command buffer we are actually using: 131072
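Since --show-limits is specific to GNU xargs, a more portable way to get at the underlying system limit is getconf, assuming it is installed. This only reports the raw ARG_MAX value, not the smaller buffer that xargs actually chooses to use.

```shell
# ARG_MAX is the limit, in bytes, on the combined size of the arguments
# and environment passed to a new process.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"
```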
find . -type f -name '*f*' | sed -r 's|/[^/]+$||' |sort |uniq
The above finds all files below the current directory (.) that are regular files (-type f) and have f somewhere in their name (-name '*f*'). Next, sed removes the file name, leaving just the directory name. Then, the list of directories is sorted (sort) and duplicates are removed (uniq).
The sed command consists of a single substitute command. It looks for matches to the regular expression /[^/]+$ and replaces anything matching that with nothing. The dollar sign means the end of the line. [^/]+ means one or more characters that are not slashes. Thus, /[^/]+$ means all characters from the final slash to the end of the line. In other words, this matches the file name at the end of the full path. Thus, the sed command removes the file name, leaving unchanged the name of the directory that the file was in.
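The substitution can be tried on its own with a couple of made-up paths, without involving find. The paths below are hypothetical, and -E is used on the assumption that your sed accepts it (recent GNU and BSD versions do).

```shell
# Strip the final /component from each path, leaving the directory part.
out=$(printf '%s\n' './dir1/a file1.txt' './dir2/sub/file2.txt' |
  sed -E 's|/[^/]+$||')
printf '%s\n' "$out"
```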
Simplifications
Many modern sort commands support a -u flag which makes uniq unnecessary. For GNU sed:
find . -type f -name '*f*' | sed -r 's|/[^/]+$||' |sort -u
And, for MacOS sed:
find . -type f -name '*f*' | sed -E 's|/[^/]+$||' |sort -u
Also, if your find command supports it, it is possible to have find print the directory names directly. This avoids the need for sed:
find . -type f -name '*f*' -printf '%h\n' | sort -u
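Where -printf is not available (it is a GNU extension), a similar result can be sketched with the standard -exec ... {} + form and shell parameter expansion. This is an alternative I'm suggesting under that assumption, not one of the commands above, and the throwaway tree is purely for demonstration.

```shell
# Throwaway tree: two matching files in dir1, one non-matching file in dir2.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/foo.txt" "$tmp/dir1/fig.txt" "$tmp/dir2/bar.txt"

# ${f%/*} removes the final /component, leaving the directory part.
out=$(cd "$tmp" && find . -type f -name '*f*' \
  -exec sh -c 'for f do printf "%s\n" "${f%/*}"; done' sh {} + | sort -u)
printf '%s\n' "$out"

rm -rf "$tmp"
```

Like the other newline-delimited versions, this one is still confused by file names that contain newlines.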
More robust version (Requires GNU tools)
The above versions will be confused by file names that include newlines. A more robust solution is to do the sorting on NUL-terminated strings:
find . -type f -name '*f*' -printf '%h\0' | sort -zu | sed -z 's/$/\n/'
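The NUL-terminated sort step can be tried in isolation with a simulated list. In this sketch printf stands in for find's output, and tr is swapped in for the sed -z step to turn the NULs into newlines for display; sort -z is assumed to be available (GNU sort has it).

```shell
# Simulated NUL-terminated directory list containing a duplicate.
out=$(printf '%s\0' './dir2' './dir1' './dir2' | sort -zu | tr '\0' '\n')
printf '%s\n' "$out"
```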
Best Answer
From the comments, I gather that your command is something like this:
Instead of piping to a while loop, you could use -exec sh -c '...' to filter files:
Try:
Consider three files:
Output: