Unix : how to tar only N first files of each folder

tar

I have a folder containing 2Gb of images, with sub-folders several levels deep.

I'd like to archive only N files of each (sub) folder in a tar file. I tried to use find then tail then tar but couldn't manage to get it to work. Here is what I tried (assuming N = 10):

find . | tail -n 10 | tar -czvf backup.tar.gz

… which outputs this error:

Cannot stat: File name too long

What's wrong here? thinking of it – even if it works I think it will tar only the first 10 files of all folders, not the 10 files of each folder.

How can I get N files of each folder? (No file order needed )

Best Answer

If your pax supports the -0 option, with zsh:

print -rN dir/**/*(D/e:'reply=($REPLY/*(ND^/[1,10]))':) |
  pax -w0 | xz > file.tar.xz

It includes the first 10 non-directory files of each directory in the list sorted by file name. You can choose a different sorting order by adding the om glob qualifier (order by modification time, Om to reverse the order), oL (order by length), non (sort by name but numerically)...

If you don't have the standard pax command, or it doesn't support -0 but you have the GNU tar command, you can do:

print -rN -- dir/**/*(D/e:'reply=($REPLY/*(ND^/[1,10]))':) |
  tar --null -T - -cjf file.tar.xz

If you can't use zsh, but have access to bash (the shell of the GNU project), you could do:

find dir -type d -exec bash -O nullglob -O dotglob -c '
  for dir do
    set -- "$dir/*"; n=0
    for file do
      if [ ! -d "$file" ] || [ -L "$file" ]; then
        printf "%s\0" "$file"
        (( n++ < 10 )) || break
      fi
    done
  done' bash {} + | pax -0w | xz > file.tar.xz

That would be significantly less efficient though.

Related Question