Ubuntu – Find images in a Linux directory based on their resolution

bash, command line, find, image processing, search

I would like to scan all images in a directory (recursively, including sub-folders) and find those whose resolution is above a given threshold (e.g. at least 800x600 or, if that is easier, a width greater than 1000 pixels). I would then like to log each image's path in a text file along with its resolution (ideally as [width], [height] for better formatting).

So log.txt would look like this:

/home/users/myuser/test/image1.jpg, 1800, 1600
/home/users/myuser/test/image20.jpg, 2800, 2600
/home/users/myuser/test/image30.jpg, 1500, 1200

How can I do that using a bash script? I have to scan millions of images.

Best Answer

Via bash's recursive glob and ImageMagick's identify command:

shopt -s globstar
identify -format "%f, %w, %h\n" **/*.{png,jpg,jpeg}

Saving that output to a file is just a matter of adding > mylog.txt to the identify command, that is

identify -format "%f, %w, %h\n" **/*.{png,jpg,jpeg} > mylog.txt
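One thing to watch with the brace pattern: bash expands it into three separate globs, and by default a glob that matches nothing (say there are no .jpeg files) is passed to identify as a literal string, which then produces an error for that argument. A minimal sketch of one way around this, assuming bash 4 or newer, is to enable nullglob as well and collect the matches in an array first. The function name collect_images and the IMAGES array are made up for illustration.

```shell
#!/usr/bin/env bash
# collect_images DIR
# Fills the global array IMAGES with the PNG/JPEG paths found under DIR.
# (Hypothetical helper for illustration; needs bash 4+ for globstar.)
collect_images() {
    local dir=${1:-.}
    # globstar: ** also matches sub-directories.
    # nullglob: a pattern with no matches expands to nothing instead of itself.
    # Note that both options stay enabled in the calling shell afterwards.
    shopt -s globstar nullglob
    IMAGES=( "$dir"/**/*.{png,jpg,jpeg} )
}

# Possible usage (requires ImageMagick):
#   collect_images ~/Pictures
#   (( ${#IMAGES[@]} )) && identify -format "%f, %w, %h\n" "${IMAGES[@]}"
```

The array length check avoids calling identify with no file arguments at all when nothing matched.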

From there, you could use awk or perl to compare mylog.txt columns

awk -F ',' '$2 > 800 && $3 > 600' mylog.txt

awk here uses , as the column separator. An awk program normally has the form PATTERN {COMMANDS}, and when {COMMANDS} is omitted the default action is to print the matching line. In the example above, a line is printed to the screen whenever the pattern $2 > 800 && $3 > 600 is true, that is, whenever the image is wider than 800 pixels and taller than 600 pixels.
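If the thresholds change often, they can be passed in with awk's -v option instead of being hard-coded. The following is only a sketch: the function name filter_by_resolution and its two arguments are made up for illustration, and it uses >= so that an image of exactly 800x600 counts as "at least 800x600" (switch to > if you want a strict comparison).

```shell
#!/usr/bin/env bash
# filter_by_resolution MIN_WIDTH MIN_HEIGHT
# Reads "name, width, height" lines on stdin and prints the lines whose
# width and height both meet the given minimums.
# (Hypothetical helper for illustration, not part of awk or ImageMagick.)
filter_by_resolution() {
    # Adding 0 forces a numeric comparison even though the fields
    # carry a leading space after the comma.
    awk -F ',' -v min_w="$1" -v min_h="$2" \
        '($2 + 0) >= (min_w + 0) && ($3 + 0) >= (min_h + 0)'
}

# Possible usage:
#   filter_by_resolution 800 600 < mylog.txt
```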

Skipping the intermediate log file and piping everything directly is probably a little better:

shopt -s globstar
identify -format "%f, %w, %h\n" **/*.{png,jpg,jpeg} | awk -F ',' '$2 > 800 && $3 > 600' > filtered_images.txt

If you run into an "Argument list too long" error, find is typically the better approach for walking the directory tree recursively. identify can be called through find's -exec flag, and the filtering can still be handled by awk:

$ find -type f -regex "^.*\.\(png\|jpg\|jpeg\)$" -exec identify -format "%f, %w, %h\n" {} \; | awk -F ',' '$2 > 800 && $3 > 600' 
fanart.jpg, 1920, 1080
fanart.jpg, 1920, 1080
globalsearch-background.jpg, 1920, 1080
fanart.jpg, 1280, 720

As usual, don't forget to add > log2.txt to save everything to a file.
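With millions of files, starting one identify process per image (which is what \; does) can dominate the run time. A possible variation, sketched below, ends -exec with + so that find passes many file names to each identify call, uses -iname so that upper-case extensions such as .JPG are matched too, and adds identify's -ping option, which is meant to read image attributes without decoding the pixel data. The function name list_image_sizes is made up for illustration, and the speed-up is an expectation rather than a measured result.

```shell
#!/usr/bin/env bash
# list_image_sizes DIR
# Prints "name, width, height" for every PNG/JPEG file under DIR
# (defaults to the current directory). Requires ImageMagick's identify.
# (Hypothetical helper for illustration.)
list_image_sizes() {
    find "${1:-.}" -type f \
        \( -iname '*.png' -o -iname '*.jpg' -o -iname '*.jpeg' \) \
        -exec identify -ping -format '%f, %w, %h\n' {} +
}

# Possible usage:
#   list_image_sizes ~/Pictures | awk -F ',' '$2 > 800 && $3 > 600' > log2.txt
```

Because find builds the argument lists itself, this form should not hit the "Argument list too long" limit either.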

The full path to the file can be handled in one of two ways: by specifying %d/%f in the identify command's format string, or by using find's -printf option. That is, either

find -type f -regex "^.*\.\(png\|jpg\|jpeg\)$" -exec identify -format "%d/%f, %w, %h\n" {} \; | awk -F ',' '$2 > 800 && $3 > 600'

Or

find -type f -regex "^.*\.\(png\|jpg\|jpeg\)$" -printf "%p, " -exec identify -format "%w, %h\n" {} \; | awk -F ',' '$2 > 800 && $3 > 600'
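One caveat once full paths are printed: a path that itself contains a comma shifts the width and height out of columns 2 and 3, so the awk test compares the wrong fields. A possible workaround, sketched here under the assumption that width and height are always the last two comma-separated fields of each line, is to address them from the end of the line with NF. The function name filter_by_last_fields is made up for illustration.

```shell
#!/usr/bin/env bash
# filter_by_last_fields MIN_WIDTH MIN_HEIGHT
# Reads "path, width, height" lines on stdin and prints those whose last
# two fields exceed the given thresholds, even if the path has commas.
# (Hypothetical helper for illustration.)
filter_by_last_fields() {
    # $(NF-1) is the second-to-last field (width), $NF the last (height).
    awk -F ',' -v min_w="$1" -v min_h="$2" \
        'NF >= 3 && ($(NF-1) + 0) > (min_w + 0) && ($NF + 0) > (min_h + 0)'
}

# Possible usage:
#   find ... -exec identify -format "%d/%f, %w, %h\n" {} \; | filter_by_last_fields 800 600
```

Paths containing newlines would still break any line-based format like this one.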