I'm trying to tidy up my photos, which are, for various historic reasons, scattered all over my system. To make a start on this task, I've been trying to use the command line to construct a list of all directories that contain one or more jpg files. I'm certain that I don't have to be concerned about looking for other image file formats, but I do have to allow for jpg appearing in upper and lower case.
I'd like each directory name to appear only once in the final list. For example, if I have the following directories, each of which contains one or more jpg or JPG files:
~Mike/Pictures
~Mike/Pictures/London/Olympics
~Mike/Pictures/London
~Mike/Pictures/London/Holiday
~Mike/Photos
~Mike/Family History/Swaine
I'd like the results to appear with each directory listed only once, irrespective of the number of image files it might contain, preferably sorted and then written to a file:
~Mike/Family History/Swaine
~Mike/Photos
~Mike/Pictures
~Mike/Pictures/London
~Mike/Pictures/London/Holiday
~Mike/Pictures/London/Olympics
My command line skills are just not up to this! I can use a lot of the simpler forms of single commands, but once they get complex and/or have to be piped, things tend to go wrong.
Best Answer
Assuming JPEG image files have the suffix `.jpg`, a `find` command piped through `sort -u` will do this. This relies on you not having funky directory names with newlines in their names.
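A minimal sketch of such a pipeline with standard `find` might look like the following. The function name `list_jpeg_dirs` is made up for illustration, and the sketch assumes that only the two spellings `.jpg` and `.JPG` need to be matched, as in the question.

```shell
# list_jpeg_dirs is a hypothetical helper name; the search root defaults to $HOME.
list_jpeg_dirs() {
    # Find regular files with either suffix, print the directory part of
    # each pathname, then sort the result and drop duplicates.
    find "${1:-$HOME}" -type f \( -name '*.jpg' -o -name '*.JPG' \) \
        -exec dirname {} \; |
        sort -u
}
```

It could then be called as `list_jpeg_dirs "$HOME" >jpeg_dirs.txt`. If mixed-case suffixes such as `.Jpg` may occur, the pattern `'*.[jJ][pP][gG]'` should cover them as well.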
With GNU `find`, the same thing can be written more compactly.
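One way this could look, as a sketch that relies on the GNU extensions `-iname` (case-insensitive name match) and `-printf` with the `%h` directive (the directory part of the pathname). The function name `list_jpeg_dirs_gnu` is hypothetical.

```shell
# list_jpeg_dirs_gnu is a hypothetical helper name; requires GNU find.
list_jpeg_dirs_gnu() {
    # -iname matches the suffix in any letter case; %h prints the directory
    # that holds each matching file, one per line.
    find "${1:-$HOME}" -type f -iname '*.jpg' -printf '%h\n' | sort -u
}
```

Called as `list_jpeg_dirs_gnu "$HOME" >jpeg_dirs.txt`, this avoids starting a separate `dirname` process for every file found.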
These `find` commands will find all JPEG images under your home directory and print the names of the directories where they were found. The `sort -u` will take this list of directory names, sort it, and remove duplicates. The result will be written to the file `jpeg_dirs.txt` in the current directory.
Looking back at this in early 2021 (3.3 years later) I cringe a bit because my solution above, albeit not wrong per se, is a bit backwards. It also makes the obvious assumption about "nice filenames" (no newlines).
When you're using `find` to search for directories, don't search for regular files as I did above; actually search for directories. Once we have the directories, we can look in each of them and see if there is a file matching `*.jpg` or `*.JPG` (further filename suffixes are easy to add).
The idea is to peek into each directory from your home directory down and try to expand the globbing pattern `*.@(jpg|JPG)` in each. This pattern, which could also have been written as two separate patterns, `*.jpg` and `*.JPG`, matches all the files that we're looking for. If at least one name matches, we assume that this is a directory that we want to output the name of. This will give false positives for directories in which the only matching names belong to subdirectories.
The shell options that we run our internal `bash` script with allow us to match hidden names (`dotglob`), allow the globbing pattern to disappear completely if it doesn't match anything rather than remain unexpanded (`nullglob`), and allow us to use the `ksh`-inspired extended globbing pattern `@(...|...)`.
Using the `zsh` shell, the same can be done with an array variable, `list`, that has the property that it only stores unique elements. It is initialized to the result of expanding a filename globbing pattern. The pattern matches all JPEG image files in or below the home directory, and the `:h` at the end removes the actual filename from the generated pathnames. The `.` makes the pattern only match regular files, and `D` and `N` act like `dotglob` and `nullglob` in `bash`.
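The directory-first approach with an internal `bash` script, as described above, could be sketched roughly like this. The function name `list_jpeg_dirs_bash` and the variable name `dirpath` are chosen for illustration only.

```shell
# list_jpeg_dirs_bash is a hypothetical helper name; requires bash to be installed.
list_jpeg_dirs_bash() {
    # Hand batches of directories to an inline bash script that runs with
    # dotglob, nullglob and extglob enabled.
    find "${1:-$HOME}" -type d -exec bash -O dotglob -O nullglob -O extglob -c '
        for dirpath do
            # Expand the pattern inside this directory; with nullglob set,
            # no positional parameters remain when nothing matches.
            set -- "$dirpath"/*.@(jpg|JPG)
            if [ "$#" -gt 0 ]; then
                printf "%s\n" "$dirpath"
            fi
        done' bash {} +
}
```

Each directory is visited once, so there are no duplicates to remove; `list_jpeg_dirs_bash "$HOME" | sort >jpeg_dirs.txt` would give the sorted file the question asks for.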
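For the `zsh` variant, a sketch along the lines of the description might be the following. It is wrapped in a hypothetical function, `list_jpeg_dirs_zsh`, that calls `zsh -c` so that it can be pasted into any shell; inside an interactive `zsh` session only the inner lines would be needed, with `~` in place of `"$1"`.

```shell
# list_jpeg_dirs_zsh is a hypothetical wrapper name; requires zsh to be installed.
list_jpeg_dirs_zsh() {
    zsh -c '
        # -U makes the array keep only the first copy of each element.
        typeset -U list
        # . = regular files only, D = include hidden names, N = expand to
        # nothing if there is no match, :h = keep only the directory part.
        list=( "$1"/**/*.(jpg|JPG)(.DN:h) )
        if (( $#list )); then
            printf "%s\n" $list
        fi
    ' zsh "${1:-$HOME}"
}
```

The names come out in the order the glob produced them, so piping through `sort` before writing to `jpeg_dirs.txt` may still be worthwhile.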