Bash – Delete all but largest file of specific type

Tags: bash, cygwin, zsh

I am trying to organise the album art in my music collection so that only one image is assigned to each folder.

My directory structure currently looks like:

/path/to/music/Album Name/
/path/to/music/Album Name/1 - Track one.flac
...
/path/to/music/Album Name/cover.jpg (either this)
/path/to/music/Album Name/folder.jpg (or this)
/path/to/music/Album Name/Album Name.jpg (or this is the largest file)
/path/to/music/Album Name/AlbumArtSmall.jpg  

(plus other low resolution images generated by Windows media player)

I would like to scan through each folder and delete all but the largest jpg and rename it to cover.jpg.

As the tags indicate, I have cygwin installed, but can also boot into Ubuntu where I have access to bash and zsh, if this makes the problem easier.

Best Answer

In zsh (which you can use from Cygwin or Linux), you can use glob qualifiers to pick the largest file. That means the largest file by byte size, not by image dimensions, which is probably the right thing here since it favors high-resolution images.

for d in /path/to/music/**/*(/); do
  rm -f $d/*.jpg(oL[1,-2]N)
  mv $d/*.jpg $d/cover.jpg
done

The loop traverses all the subdirectories of /path/to/music recursively. The (/) suffix restricts the matches to directories. The argument to rm -f uses three glob qualifiers: oL to sort by increasing size, so that the largest file comes last; [1,-2] to retain only the matches up to the next-to-last one (PATTERN([-1]) is the last match, PATTERN([-2]) is the next-to-last match, and PATTERN([1,-2]) is the list of matches from the first to the next-to-last inclusive); and N to produce an empty list rather than leave the pattern unexpanded or report an error if the pattern matches no file.

You may get a harmless error if the remaining file is already called cover.jpg or if there is no .jpg file in a directory. To avoid these errors, change the mv call to

[[ -e $d/cover.jpg ]] || mv $d/*.jpg $d/cover.jpg

Here's an alternative method that renames first then deletes. It uses the PATTERN1~PATTERN2 syntax, which requires the extended_glob option, to select files that match PATTERN1 but not PATTERN2. ((#jpgs)) tests if the jpgs array contains at least one element.

setopt extended_glob
for d in /path/to/music/**/*(/); do
  jpgs=($d/*.jpg(NOL))            # OL: largest first; N: empty array if no match
  ((#jpgs)) || continue
  [[ $jpgs[1] == */cover.jpg ]] || mv $jpgs[1] $d/cover.jpg
  rm -f $d/*.jpg~*/cover.jpg(N)   # every .jpg except cover.jpg
done
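
If only bash is available, as the question mentions, the same keep-the-largest logic can be sketched without glob qualifiers. This is a rough sketch rather than a tested tool: the function name keep_largest_jpg is made up for illustration, sizes are compared in bytes with wc -c, and only files ending in lowercase .jpg are considered. Try it on a copy of the collection first.

```shell
# Hypothetical helper (the name is an assumption, not from the answer above):
# keep only the largest .jpg in one directory and call it cover.jpg.
keep_largest_jpg() {
  local dir=$1 f size largest= largest_size=-1

  # Pass 1: find the largest .jpg by byte size.
  for f in "$dir"/*.jpg; do
    [ -f "$f" ] || continue              # skips the unexpanded pattern if nothing matches
    size=$(( $(wc -c < "$f") ))          # arithmetic expansion strips any padding from wc
    if [ "$size" -gt "$largest_size" ]; then
      largest=$f
      largest_size=$size
    fi
  done
  [ -n "$largest" ] || return 0          # no .jpg in this directory: nothing to do

  # Pass 2: delete every other .jpg, then rename the survivor if needed.
  for f in "$dir"/*.jpg; do
    [ -f "$f" ] || continue
    [ "$f" = "$largest" ] || rm -f -- "$f"
  done
  [ "$largest" = "$dir/cover.jpg" ] || mv -- "$largest" "$dir/cover.jpg"
}

# Assumed usage, applying the helper to every directory under the music root:
#   find /path/to/music -type d -print0 |
#     while IFS= read -r -d '' d; do keep_largest_jpg "$d"; done
```

Deleting before renaming means an existing, smaller cover.jpg is removed before the largest image takes its name, so the rename never clobbers the file being kept.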