I have several thousand files in one directory that I would like to collate in directories like so:
From this:
└── Files
├── AAA.mkv
├── AAA.nfo
├── AAA-picture.jpg
├── BBB.mp4
├── BBB.srt
├── BBB-clip.mp4
├── CCC.avi
├── CCC.srt
├── CCC-clip.mov
└── CCC.nfo
To this:
└── Files
├── AAA
│ ├── AAA.mkv
│ ├── AAA.nfo
│ └── AAA-picture.jpg
├── BBB
│ ├── BBB.mp4
│ ├── BBB.srt
│ └── BBB-clip.mp4
└── CCC
├── CCC.avi
├── CCC.srt
├── CCC-clip.mov
└── CCC.nfo
The file names vary in length and number of words, sometimes separated by spaces and possibly a few with hyphens (in addition to the ones ending '-short'. They are primarily video files with a variety of formats/containers: mov/mpg/mkv/mp4/avi/ogg. Some are subtitled. Some have files with associated metadata (.nfo or -clip)
Edit: The primary files are videos (this is where I would like to draw the directory name). The associated files represent metadata. Some different in naming by only the extension. There are a half-dozen other variations on the base filename like -clip.mp4 -clip.mov or -picture.jpg I figured if something were suggested with those few then I could (hopefully) work at figuring out the rest. In summary, AAA.mkv moves into a directory called AAA. Then all metadata files that begin with AAA join it (i.e., in this example: AAA-picture.jpg and AAA.nfo). So the basename is in fact a substring in the case of the AAA-picture.jpg file. I would say it is probably relatively safe to simply use the hyphen as the delimiting factor… though '-clip' or '-picture' in its entirety would be safer.
How can I do this without getting carpal tunnel syndrome?
I looked at this but it was sufficiently different that my weak scripting abilities fizzled.
Thank you.
Best Answer
While your question is tagged with
bash
, this would be somewhat troublesome ( in my humble opinion ) to usebash
for such task. I'd suggest using python because it has a lot of good functions for complex tasks and this answer provides a solution using that language.Essentially what occurs here is that we use regex to split filenames at multiple delimiters, get only first part and use unique set of those first parts as basenames for new directories.
We then traverse the top directory again , and sort the files in their appropriate places.
The script doesn't do anything spectacular, and actually in algorithm analysis this wouldn't do too well, because of the nested for loops, but for "quick and dirty, yet workable" solution it's alright. If you are interested what each line does, there's plenty of comments added to explain the functionality
Note, the demo only shows printing of the new filenames for testing purpose only. Uncomment the
os.rename()
part to actually move the file.The Demo
Script itself
Additional notes:
re.split()
function): add inside square brackets ( meaning"[.-]"
) add whatever characters you want.os.rename()
function. Alternatively you couldimport shutil
and useshutil.move()
function. See https://stackoverflow.com/a/8858026/3701431