find Command – How to Limit Number of Matches


If I want the find command to stop after finding a certain number of matches, how do I do that?

The background is that I have too many files in one folder and need to distribute them randomly into separate folders, like this:

find -max-matches 1000 -exec mv {} /path/to/collection1 \+; 
find -max-matches 1000 -exec mv {} /path/to/collection2 \+; 

Is this possible to do with find alone? If not, what would be the simplest way to do this?

Best Answer

As you're not using find for very much other than walking the directory tree, I'd suggest instead using the shell directly to do this. See variations for both zsh and bash below.
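If you would rather stay with find: as far as I know, find has no option for stopping after N matches (GNU find's -quit stops at the first one), but its output can be cut short in a pipeline. The following is only a sketch, assuming GNU find, head (for -z), xargs (for -r) and mv (for -t); move_first_n is a hypothetical helper name, not an existing tool. Note that -type f does not follow symbolic links, unlike the shell variations below.

```bash
# Sketch: move at most N regular files found under SRC into DEST.
# Assumes GNU find/head/xargs/mv; DEST should live outside SRC so that
# already-moved files are not found again.
move_first_n() {
    local src=$1 dest=$2 n=$3

    # NUL-delimited names survive spaces and newlines in pathnames.
    # head exits after N records, which also ends find early.
    find "$src" -type f -print0 |
        head -z -n "$n" |
        xargs -0 -r mv -t "$dest" --
}
```

Calling move_first_n . /path/to/collection1 1000 and then move_first_n . /path/to/collection2 1000 would move two successive batches, since the first batch is gone from the tree by the time the second call runs.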


Using the zsh shell

mv ./**/*(-.D[1,1000]) /path/to/collection1    # move first 1000 files
mv ./**/*(-.D[1,1000]) /path/to/collection2    # move next 1000 files

The globbing pattern ./**/*(-.D[1,1000]) matches all regular files (or symbolic links to such files) in or under the current directory, and then returns the first 1000 of these. The -. restricts the match to regular files or symbolic links to these, while D acts like dotglob in bash (it also matches hidden names).

This assumes that the command line produced by expanding the globbing pattern does not grow too long for mv to be executed.
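If the expanded list ever does get too long (the "Argument list too long" error), one way around it is to hand the names to xargs, which splits them over as many mv invocations as needed. This is a minimal sketch, assuming GNU mv (for -t) and an xargs that supports -0; move_batched is a hypothetical helper name. It should behave the same in bash and zsh, because printf is a builtin and calling a shell function does not go through the kernel's argument-size limit.

```bash
# Sketch: move any number of pathnames into DEST, letting xargs split
# the list into several mv calls if it is too long for one.
# Assumes GNU mv (-t) and xargs with -0 support.
move_batched() {
    local dest=$1
    shift

    # Nothing to move: avoid running mv without operands.
    [ "$#" -gt 0 ] || return 0

    printf '%s\0' "$@" | xargs -0 mv -t "$dest" --
}
```

In zsh this could be called as move_batched /path/to/collection1 ./**/*(-.D[1,1000]).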

The above is quite inefficient as it would expand the glob for each collection. You may therefore want to store the pathnames in an array and then move slices of that:

pathnames=( ./**/*(-.D) )

mv $pathnames[1,1000]    /path/to/collection1
mv $pathnames[1001,2000] /path/to/collection2

To randomise the pathnames array when you create it (you mentioned wanting to move random files):

pathnames=( ./**/*(-.Doe['REPLY=$RANDOM']) )

You could do a similar thing in bash (except you can't easily shuffle the result of a glob match in bash, apart from possibly feeding the results through shuf, so I'll skip that bit):

shopt -s globstar dotglob nullglob

pathnames=()
for pathname in ./**/*; do
    [[ -f $pathname ]] && pathnames+=( "$pathname" )
done

mv "${pathnames[@]:0:1000}"    /path/to/collection1
mv "${pathnames[@]:1000:1000}" /path/to/collection2
mv "${pathnames[@]:2000:1000}" /path/to/collection3
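
For the shuffling that was skipped in the bash variation, the pathnames can be passed through shuf and read back into the array. This is only a sketch, assuming bash 4.4 or later (for mapfile -d) and GNU shuf (for -z); collect_shuffled is a hypothetical helper name.

```bash
# Sketch: fill the global array `pathnames` with all regular files (or
# symlinks to regular files) in or under DIR, in random order.
# Assumes bash >= 4.4 (mapfile -d '') and GNU shuf (-z).
collect_shuffled() {
    local dir=$1
    pathnames=()

    # The shell options are set inside the process substitution, so they
    # do not leak into the calling shell. Names are NUL-delimited so that
    # spaces and newlines in pathnames are preserved.
    mapfile -d '' -t pathnames < <(
        shopt -s globstar dotglob nullglob
        for pathname in "$dir"/**/*; do
            [[ -f $pathname ]] && printf '%s\0' "$pathname"
        done | shuf -z
    )
}
```

After collect_shuffled . the same "${pathnames[@]:0:1000}" style slices as above can be used with mv.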